This section presents the various modal data used in the preparation of the freight mode choice dataset.
To obtain truck distances and transit times for each shipment in the CFS data, the team used HERE data generously provided by the Caliper Corporation. The HERE data contain layers with distances and transit times for the entire US highway and street network, along with additional geographic information such as ZIP codes, census tracts, counties, cities, and states (HERE 2018). The Caliper Corporation processed the final maps from the HERE data and provided them to the RPI team in GIS format. The team then post-processed the HERE data to obtain distances and transit times between all pairs of ZIP codes in the US.
Truck distances and transit times were calculated assuming that each truck shipment follows the shortest path from the origin ZIP code to the destination ZIP code. The output included the origin ZIP code, destination ZIP code, truck transit time (in minutes), and truck distance (in miles). Truck rates were calculated using an updated version of the model presented in Holguín-Veras and Brom (2008), which estimates the direct truck cost of a shipment, in dollars, as a function of the distance and time travelled. The finalized truck distances, transit times, and costs were merged with the CFS-LBD dataset by matching the ZIP-to-ZIP origin-destination (OD) pairs in the CFS-LBD data against the transit time and distance matrices.
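The truck calculations described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the team's implementation: it assumes the shortest path minimizes distance, uses a three-node toy network with made-up link lengths and times, and applies a linear cost form with placeholder coefficients (`cost_per_mile`, `cost_per_hour`) rather than the parameters of the updated Holguín-Veras and Brom (2008) model. All function and variable names are hypothetical.

```python
import heapq


def shortest_path(graph, origin, destination):
    """Dijkstra's algorithm on link miles.

    Returns (miles, minutes) along the minimum-distance path,
    or None if the destination cannot be reached.
    """
    # Queue entries: (cumulative miles, cumulative minutes, node)
    queue = [(0.0, 0.0, origin)]
    settled = set()
    while queue:
        miles, minutes, node = heapq.heappop(queue)
        if node in settled:
            continue
        settled.add(node)
        if node == destination:
            return miles, minutes
        for neighbor, link_miles, link_minutes in graph.get(node, []):
            if neighbor not in settled:
                heapq.heappush(
                    queue,
                    (miles + link_miles, minutes + link_minutes, neighbor),
                )
    return None


def truck_cost(miles, minutes, cost_per_mile=1.0, cost_per_hour=30.0):
    """Direct truck cost in dollars; the linear form and both
    coefficients are placeholders, not published model values."""
    return cost_per_mile * miles + cost_per_hour * (minutes / 60.0)


# Toy network keyed by ZIP code: (neighbor, miles, minutes). Values are made up.
toy_network = {
    "12180": [("12207", 8.0, 15.0), ("10001", 160.0, 170.0)],
    "12207": [("12180", 8.0, 15.0), ("10001", 150.0, 165.0)],
    "10001": [("12207", 150.0, 165.0), ("12180", 160.0, 170.0)],
}

# Hypothetical shipment records standing in for the CFS-LBD data.
shipments = [
    {"shipment_id": 1, "origin_zip": "12180", "dest_zip": "10001"},
    {"shipment_id": 2, "origin_zip": "12207", "dest_zip": "12180"},
]

# Build the OD matrix once per distinct ZIP pair, then merge it onto shipments.
od_pairs = {(s["origin_zip"], s["dest_zip"]) for s in shipments}
od_matrix = {od: shortest_path(toy_network, od[0], od[1]) for od in od_pairs}

for s in shipments:
    result = od_matrix[(s["origin_zip"], s["dest_zip"])]
    if result is not None:
        s["truck_miles"], s["truck_minutes"] = result
        s["truck_cost"] = truck_cost(*result)
```

Computing the matrix once per distinct OD pair, rather than once per shipment, mirrors the matrix-then-merge workflow in the text and avoids repeating the path search for shipments that share the same ZIP pair.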
Thanks to the assistance of the Federal Railroad Administration (FRA), the team secured access to the confidential 2012 Waybill Sample, a stratified sample of carload waybills collected by the Surface Transportation Board (STB). This database contains information on commodity type, coded using the Standard Transportation Commodity Code (STCC), shipment size, types of car used, origin-destination information ranging from country and state down to ZIP code, Freight Station Accounting Codes (FSAC), number of cars, revenue, shipment rate, variable rates, distance (routed and shortest path), and number of transfers.
In addition, the team secured the FRA rail network data, which contain geo-coded information about rail nodes, rail stations (FSAC), and link distances. The network contains information on privately owned freight rail lines in the actual freight network at the county and city levels; each rail line is labeled with its primary owner. The network data were updated in 2010. The team used these data to produce rail distances between all ZIP codes in the US (nearly 40,000), assuming that rail shipments follow the shortest path, with drayage by road to the closest rail station. Transit times were estimated for various commodities assuming an average waiting time of 24 hours for each transfer. These rates, distances, and transit times by rail between all ZIP codes in the US are incorporated in the freight mode choice modeling process. Because the CFS and the Waybill Sample classify commodities differently (the CFS uses SCTG codes, whereas the Waybill Sample uses STCC codes), a conversion matrix between SCTG and STCC was used to obtain the rail distances, transit times, and rates for each shipment in the CFS data.
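The rail transit time logic can be illustrated with a short Python sketch. Only the 24-hour average wait per transfer comes from the text. Everything else is an assumption made for illustration: the additive form (drayage plus line haul plus transfer waits), the commodity-specific average speeds in `ASSUMED_SPEED_MPH`, and the one-to-one SCTG-to-STCC entries in `SCTG_TO_STCC`, which stand in for the conversion matrix and do not reproduce it (an actual crosswalk may map one SCTG code to several STCC codes).

```python
WAIT_HOURS_PER_TRANSFER = 24.0  # average waiting time per transfer, from the text

# Illustrative crosswalk entries only; not the conversion matrix used in the study.
SCTG_TO_STCC = {
    "02": "01",  # assumed pairing: cereal grains -> farm products
    "15": "11",  # assumed pairing: coal -> coal
}

# Placeholder average line-haul speeds (mph) by STCC group; values are made up.
ASSUMED_SPEED_MPH = {
    "01": 20.0,
    "11": 25.0,
}


def rail_transit_hours(rail_miles, transfers, speed_mph, drayage_hours=0.0):
    """Transit time in hours: drayage + line haul + waiting at transfers."""
    line_haul_hours = rail_miles / speed_mph
    return drayage_hours + line_haul_hours + WAIT_HOURS_PER_TRANSFER * transfers


def rail_time_for_shipment(sctg, rail_miles, transfers, drayage_hours=0.0):
    """Convert the CFS commodity code (SCTG) to STCC, then estimate transit time.

    Returns None when the commodity has no entry in the illustrative crosswalk.
    """
    stcc = SCTG_TO_STCC.get(sctg)
    if stcc is None:
        return None
    return rail_transit_hours(
        rail_miles, transfers, ASSUMED_SPEED_MPH[stcc], drayage_hours
    )


# Example: a 500-mile grain shipment with two transfers and 2 hours of drayage.
example_hours = rail_time_for_shipment("02", 500.0, 2, drayage_hours=2.0)
```

With these placeholder values, the example works out to 2 hours of drayage, 25 hours of line haul, and 48 hours of waiting. The transfer waits dominate the total, which is why the number of transfers on the routed path matters as much as the rail distance in a calculation of this kind.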