This section describes the mathematical formulations used to specify the freight mode choice models. The discussion starts with the shipment size models, followed by the market-share and the shipment-level models.

**Shipment size models**

The shipment size models were estimated to address the endogeneity problems caused by including shipment size in the estimation of discrete choice models for freight mode choice. Following the instrumental variable approach, the shipment size models were used to estimate the sizes of specific shipments, and these estimates were used as independent variables in the discrete choice models in place of the observed shipment sizes. The models express shipment size as a function of the great circle distance (GCD), in miles, between shipment origin and destination, which is reported in the CFS for each shipment. Different functional forms were tested, and the best model for each type of commodity (or group of commodities) was selected. The functional forms that consistently provided the best results were the power and logarithmic functions, which are described next:

**Market-share mode choice models**

As indicated before, the market-share models express the share of a mode (of a typical shipment) as a function of the characteristics of the competing modes. This is achieved with logistic functions of the kind shown in Equation (16), which uses a utility function to account for the effects of the various independent variables. The utility functions used are shown in Equation (19) for the generalized cost version, and in Equation (18) for the version of the model that considers transit times and freight rates as independent variables.

For models that use transit time and freight rates separately:

For models that use generalized cost:
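One formulation consistent with the parameters discussed in this section is the binary logit sketched below. It is an assumption offered for orientation, not necessarily the exact specification used: the attribute symbols $C$ (freight rate), $T$ (transit time), and $GC$ (generalized cost), and the placement of the constant in the truck utility, are illustrative choices.

```latex
% Assumed binary logit share of truck (t) versus rail (r) for commodity i
\begin{align*}
MS_{ti} &= \frac{e^{U_{ti}}}{e^{U_{ti}} + e^{U_{ri}}}, &
MS_{ri} &= 1 - MS_{ti}
\end{align*}

% Assumed utilities with transit time and freight rate entered separately
\begin{align*}
U_{ti} &= \beta_{0i} + \beta_{Ci}\, C_{ti} + \beta_{Ti}\, T_{ti}, &
U_{ri} &= \beta_{Ci}\, C_{ri} + \beta_{Ti}\, T_{ri}
\end{align*}

% Assumed utilities with generalized cost
\begin{align*}
U_{ti} &= \beta_{0i} + \beta_{GCi}\, GC_{ti}, &
U_{ri} &= \beta_{GCi}\, GC_{ri}
\end{align*}
```

Under this form, the ratio $MS_{ti}/MS_{ri}$ reduces to $e^{U_{ti} - U_{ri}}$, so its natural logarithm is linear in the parameters.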

As shown in Equations (16) and (17), $MS_{ti}$ includes a constant $\beta_{0i}$, intended to capture any bias in favor of (or against) the use of truck. If, all else being equal, shippers prefer trucking, $\beta_{0i}$ is expected to be positive. The other parameters, $\beta_{Ci}$, $\beta_{Ti}$, and $\beta_{GCi}$, must be negative, as an increase in the time, rate, or generalized cost of a particular mode is bound to reduce the use of that mode. The market shares $MS_{ti}$ and $MS_{ri}$ are the market shares by shipments. In essence, the market-share models estimate the probability that a typical shipment is moved by truck or by rail.

Market-share models can be readily estimated with Ordinary Least Squares (OLS) techniques, typically referred to as regression analysis. To do so, however, the original function must be linearized. The first step is to compute the ratio of $MS_{ti}$ to $MS_{ri}$, which, for $MS_{ri} > 0$, leads to:

For models that use transit time and freight rates separately:

For models that use generalized cost:

Taking natural logarithms, the formulation used in the transit time and freight rate model becomes:

In the case of the generalized cost models the linearization leads to:

As Equations (22) and (23) show, since $\beta_{C}$, $\beta_{T}$, and $\beta_{GC}$ are expected to be negative, the higher the transit time, rate, or generalized cost of truck, the lower its market share and, conversely, the larger the market share of rail. Similarly, increases in the transit time, rate, or generalized cost of rail will increase the market share of truck.

The estimation of the market-share models required post-processing the CFS data to aggregate the individual shipments into distance bins, using the GCD reported in the CFS for each shipment. The distance bins were defined starting from 5 miles. Shipments with distances below 5 miles were removed from the database, since they likely represent either shipments captive to trucking or data errors. The first bin therefore captured shipments ranging from 5 to 25 miles. The following bins were defined in increments of 25 miles from 26 to 1000 miles (26-50, 51-75, 76-100, …, 976-1000). From 1000 miles on, bins were defined in increments of 50 miles up to 1400 miles (1001-1050, 1051-1100, …, 1351-1400). Shipments with trip lengths above 1400 miles were included in a single bin.

**Shipment-level mode choice models**

The shipment-level freight mode choice models were estimated as discrete choice models. A distinctive feature of discrete choice models of freight mode choice is that they must take into account the econometric interactions between the continuous choice of shipment size and the discrete choice of freight (or vehicle) mode. To this end, the team used the shipment size models to obtain estimates of the actual shipment sizes, which were then used to compute the transit times and freight rates by truck and rail. In the case of models based on generalized costs, the freight rates and transit times were combined using the intrinsic cargo value. The basic specifications of the shipment-level freight mode choice models are described next.

Pooled models

These models capture the effect of the commodity type through a set of binary variables and the interaction of these binary variables with the modal attributes (i.e., transit time, freight rate, and generalized cost) in the utility functions. In essence, the pooled models capture commodity-specific behavior in a single model, instead of a separate (2-digit SCTG) model for each commodity.
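To make the role of the binary variables concrete, the sketch below shows one way a pooled binary logit could combine base coefficients with commodity-specific shifts, which is equivalent to interacting commodity dummies with the constant and the modal attributes. The coefficient values, the commodity codes, the dictionary layout, and the function name are all hypothetical, and the utility is written in terms of truck-minus-rail attribute differences as a simplifying assumption.

```python
import math


def pooled_truck_probability(commodity, d_rate, d_time, base, shifts):
    """Truck probability in an assumed pooled binary logit.

    commodity : commodity code used to look up dummy-interaction shifts
    d_rate    : truck freight rate minus rail freight rate
    d_time    : truck transit time minus rail transit time
    base      : dict with keys "const", "rate", "time" (reference commodity)
    shifts    : dict mapping commodity code -> dict of shifts for those keys
    """
    shift = shifts.get(commodity, {})  # no entry means the reference commodity
    const = base["const"] + shift.get("const", 0.0)
    b_rate = base["rate"] + shift.get("rate", 0.0)
    b_time = base["time"] + shift.get("time", 0.0)
    utility_difference = const + b_rate * d_rate + b_time * d_time
    return 1.0 / (1.0 + math.exp(-utility_difference))


# Made-up coefficients: commodity "24" is assumed to be more rate-sensitive.
base = {"const": 0.5, "rate": -0.8, "time": -0.3}
shifts = {"24": {"rate": -0.4}}

p_reference = pooled_truck_probability("01", 1.0, 0.0, base, shifts)
p_sensitive = pooled_truck_probability("24", 1.0, 0.0, base, shifts)
```

With these made-up values, the same one-unit rate disadvantage for truck lowers the truck probability more for the rate-sensitive commodity than for the reference commodity, which is the kind of commodity-specific behavior a single pooled model is meant to capture.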

For models that use transit time and freight rates separately (the subscripts for commodity type *i* and shipment *m* have been dropped for simplicity):

2-digit SCTG models

These models capture the effect of commodities by estimating a separate freight mode choice model for each commodity at the 2-digit SCTG level. The expressions considered for the 2-digit SCTG models, for each commodity, are: