This section explains the data preparation process followed in estimating the freight mode choice models using the 2012 Commodity Flow Survey (CFS) confidential micro-data. In general terms, estimating shipment-level freight mode choice models requires data on: the shipment characteristics, e.g., shipment size and commodity type; the shipper attributes, e.g., establishment size and industry; and the modal attributes of all modes to be analyzed, e.g., transit times and freight rates. A major challenge is that the CFS data do not include two of these inputs: (1) the shipper’s characteristics, such as location, industry sector, and employment; and (2) the attributes of the modes to be analyzed (truck and rail), such as distance, transit time, cost, number of transfers, and drayage distance. As a result, the team had to find alternative sources for the missing shipper characteristics and modal attributes to complement the CFS data. The datasets used are: the Commodity Flow Survey (CFS), the Longitudinal Business Database (LBD), Waybill data, HERE data, and rail network data from the Federal Railroad Administration (FRA). The modal attribute datasets are confidential and owned by various agencies. It should be noted that, since the CFS contains only a minuscule number of observations for the water and air freight modes, these modes could not be included in the freight mode choice estimation process.
Table 18 shows the data assembled for the freight mode choice analysis, the datasets from which the respective variables were derived, and the sources that provide the data.
In summary, the shipment data are obtained from the 2012 confidential CFS micro-data; the shippers’ attribute data are obtained from the 2012 Longitudinal Business Database (LBD) data from the Center for Economic Studies (CES); the truck attribute data are derived from HERE data; and the rail attribute data are derived from the 2012 Waybill data and rail network data from FRA. To obtain modal attributes (costs, transit times) for each shipment in the CFS micro-data, statistical inference techniques were applied to the HERE and Waybill datasets. The sources of the data, relevant variables, challenges, and limitations are explained in the following sections.
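The assembly described above can be illustrated with a minimal sketch that attaches shipper attributes and truck and rail attributes to each shipment record. All field names, identifiers, and values below are hypothetical and for illustration only; they are not actual CFS, LBD, HERE, or Waybill fields, and the rule of dropping shipments with incomplete attributes is an assumption rather than the team's documented procedure.

```python
# Toy shipment records (hypothetical stand-ins for CFS shipment data).
shipments = [
    {"shipment_id": 1, "establishment_id": "A", "weight_tons": 12.0, "mode": "truck"},
    {"shipment_id": 2, "establishment_id": "B", "weight_tons": 90.0, "mode": "rail"},
    {"shipment_id": 3, "establishment_id": "C", "weight_tons": 5.0, "mode": "truck"},
]

# Toy shipper attributes keyed by establishment (hypothetical stand-ins for LBD data).
shippers = {
    "A": {"industry": "manufacturing", "employment": 40},
    "B": {"industry": "mining", "employment": 250},
}

# Toy modal attributes keyed by shipment (hypothetical stand-ins for values
# inferred from HERE data for truck and from Waybill/FRA data for rail).
truck_attrs = {
    1: {"truck_time_h": 10.0, "truck_cost": 900.0},
    2: {"truck_time_h": 30.0, "truck_cost": 5200.0},
    3: {"truck_time_h": 4.0, "truck_cost": 350.0},
}
rail_attrs = {
    1: {"rail_time_h": 48.0, "rail_cost": 700.0},
    2: {"rail_time_h": 72.0, "rail_cost": 2600.0},
    3: {"rail_time_h": 40.0, "rail_cost": 500.0},
}


def assemble(shipments, shippers, truck_attrs, rail_attrs):
    """Join shipment, shipper, and modal attributes into estimation rows.

    Assumption: a shipment is kept only if shipper attributes and the
    attributes of both modes are available, since a mode choice model
    needs the attributes of every alternative being compared.
    """
    rows = []
    for shipment in shipments:
        sid = shipment["shipment_id"]
        shipper = shippers.get(shipment["establishment_id"])
        truck = truck_attrs.get(sid)
        rail = rail_attrs.get(sid)
        if shipper is None or truck is None or rail is None:
            continue  # incomplete record: skipped in this sketch
        rows.append({**shipment, **shipper, **truck, **rail})
    return rows


estimation_rows = assemble(shipments, shippers, truck_attrs, rail_attrs)
```

In this toy input, shipment 3 has no matching shipper record, so only shipments 1 and 2 appear in `estimation_rows`, each carrying the attributes of both the chosen and the non-chosen mode.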