Intro to the FAF5 Database
The Freight Analysis Framework (FAF5) is maintained by the U.S. Department of Transportation's (DOT's) Bureau of Transportation Statistics (BTS). It's a comprehensive database of freight movement data for a range of commodity types transported by trucking, rail and inland shipping in the U.S. It also projects freight flows out to 2050.
How I Found the Data
I first came across the FAF5 data while looking through geospatial datasets maintained by the BTS at geodata.bts.gov, where I found shapefiles for the highway network links and geographical regions used by the FAF5 framework. This led me to the FAF5 website, which not only has all the FAF5 freight flows, but also some great resources on how freight data is collected and processed to produce the final datasets and projections.
FAF5 Data Formats
FAF5 publishes commodity flow data in two formats: origin-destination (OD) flows and network flows. The origin-destination flows quantify the amount of each commodity type (or the sum over all commodity types) that flows between each pair of FAF5 regions in the U.S. (there are typically several FAF5 regions per state). The network flows (which are only provided for trucking) quantify the amount of each commodity type that flows by truck along each individual link (typically a few km long) in the highway network. This is done by associating an ID with each network link, and providing a CSV file that quantifies the estimated annual flow (tons or value) of each commodity along each link. The network flows give a more localized picture of the freight movements, but don't capture where the flows originate or where they are destined.
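To make the distinction between the two formats concrete, here's a minimal Python sketch of how each might look once loaded. The field names (`origin`, `dest`, `link_id`, `commodity`, `tons`) and all of the numbers are made up for illustration; they are not the actual FAF5 column names or values.

```python
from collections import defaultdict

# Hypothetical OD records: one row per (origin region, destination region, commodity).
od_flows = [
    {"origin": "A", "dest": "B", "commodity": "grain", "tons": 120.0},
    {"origin": "A", "dest": "B", "commodity": "fuel", "tons": 80.0},
    {"origin": "B", "dest": "A", "commodity": "grain", "tons": 30.0},
]

# Hypothetical network records: one row per (highway link, commodity).
network_flows = [
    {"link_id": 1, "commodity": "grain", "tons": 150.0},
    {"link_id": 1, "commodity": "fuel", "tons": 80.0},
    {"link_id": 2, "commodity": "grain", "tons": 30.0},
]


def total_by(records, key_fields):
    """Sum tons over all commodities, grouped by the given key fields."""
    totals = defaultdict(float)
    for row in records:
        key = tuple(row[field] for field in key_fields)
        totals[key] += row["tons"]
    return dict(totals)


# OD totals say how much moves between region pairs, but not along which roads.
od_totals = total_by(od_flows, ["origin", "dest"])

# Link totals say how much moves along each road segment, but not between which regions.
link_totals = total_by(network_flows, ["link_id"])
```

Grouping by `("origin", "dest")` or by `("link_id",)` is the "sum over all commodity types" view; adding `"commodity"` to the key fields would keep the per-commodity breakdown.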
Working with the FAF5 Data
In the long term, we're hoping to combine the FAF5 freight flow data with estimates of lifecycle emissions associated with the transportation of these flows (using LCA tools like GREET or SESAME) to build up a geospatial picture of emissions associated with freight flows in the U.S., ultimately adding in domestic flights and shipping.
As a first step in this direction, we decided to build up a robust framework to visualize the freight flows themselves, with the idea of later folding lifecycle emissions estimates into this framework to visualize the emissions associated with the freight flows. The DOT has published plots of daily truck volume flows in the U.S. The below example shows their visualization of truck volumes for all commodities.
I decided to try reproducing this visualization in an open-source GIS mapping tool called QGIS, using the shapefile with geospatial network links and the CSV file from the FAF5 website with highway network assignments. The main challenge was figuring out how to join the shapefile with the relevant info in the CSV file to weight the network links with the associated commodity flow from the CSV file. This is accomplished using QGIS's table join functionality. After performing this join, the displayed width of each highway network link can be weighted by the volume of commodity flow from the CSV file. The resulting visualization of commodity flows is shown below:
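The logic of the join and the width weighting can be sketched in plain Python, independent of QGIS. This is only an illustration of the idea, not the code in the repo or the QGIS API: the column names `link_id` and `tons_total`, the sample values, and the helper functions `join_flows` and `line_width` are all assumptions made up for this example.

```python
import csv
import io

# Hypothetical stand-in for the FAF5 highway assignment CSV.
# The column names "link_id" and "tons_total" are assumptions for illustration.
flows_csv = io.StringIO(
    "link_id,tons_total\n"
    "101,500.0\n"
    "102,2000.0\n"
    "103,0.0\n"
)

# Hypothetical stand-in for the attribute table of the network-link shapefile.
links = [
    {"link_id": "101", "name": "Link A"},
    {"link_id": "102", "name": "Link B"},
    {"link_id": "104", "name": "Link C"},  # no matching row in the CSV
]


def join_flows(links, flow_rows, key="link_id", field="tons_total"):
    """Left-join flow values onto the links; unmatched links get None."""
    lookup = {row[key]: float(row[field]) for row in flow_rows}
    return [{**link, field: lookup.get(link[key])} for link in links]


def line_width(tons, max_tons, min_width=0.2, max_width=4.0):
    """Linearly scale a flow value to a display width between min and max."""
    if tons is None or max_tons <= 0:
        return min_width
    return min_width + (max_width - min_width) * tons / max_tons


joined = join_flows(links, csv.DictReader(flows_csv))
max_tons = max((link["tons_total"] or 0.0) for link in joined)
widths = {
    link["link_id"]: line_width(link["tons_total"], max_tons) for link in joined
}
```

Keeping every link and filling unmatched ones with `None` mirrors a left join, so links without flow data still show up (at the minimum width) rather than disappearing from the map. The linear scaling is one simple choice; a square-root or log scale may read better when a few links carry most of the tonnage.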
For the moment, I'm visualizing total commodity flows in tons, but this could be trivially extended to visualize individual commodities, or to weight by value of goods transported rather than tons. I'm also restricting the visualization to Texas for the moment to keep things tractable, but in principle this could be extended to the rest of the U.S. The different-coloured regions in the figure represent the FAF5 regions used for origin-destination flows.
After doing this initial analysis with QGIS's graphical user interface (GUI), I encoded the analysis for reproducibility using QGIS's Python API. The code and usage instructions are documented in this GitHub repo: https://github.com/danikam/FAF5_Analysis.
Reflections and Next Steps
Overall, it's been encouraging to see that the FAF5 data can be pretty readily visualized in QGIS, and the Python API allows for the data analysis to be fully encoded. We have an awesome undergraduate researcher with MIT's UROP program named Micah Borrero who's going to work with the code and extend its functionality. He's already suggested that it could be helpful to shift some of the backend analysis tasks to a Python library called GeoPandas, and I'm super excited to see where that goes.
Ultimately it will probably be most valuable to visualize lifecycle emissions in terms of origin-destination flows rather than local network links, since the origin-destination demand is what's driving the flows. So an immediate next step will be to think about how best to visualize the origin-destination data using the FAF5 region shapefiles. One possibility could be to add up all of the flows to/from all destinations for each FAF5 region, perhaps weighting them by the distance between the origin and destination.
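As a rough sketch of that last idea, the aggregation could look something like the following in Python. The record layout, the field names (`origin`, `dest`, `tons`, `distance_km`), the sample numbers, and the `region_totals` helper are all hypothetical; in particular, how the distance between two regions would be defined is still an open question.

```python
from collections import defaultdict

# Hypothetical OD records between FAF5 regions; all values are made up.
od_flows = [
    {"origin": "A", "dest": "B", "tons": 100.0, "distance_km": 300.0},
    {"origin": "A", "dest": "C", "tons": 50.0, "distance_km": 1000.0},
    {"origin": "B", "dest": "A", "tons": 20.0, "distance_km": 300.0},
]


def region_totals(flows, weight_by_distance=False):
    """Sum outbound and inbound flows per region, optionally as ton-km."""
    outbound = defaultdict(float)
    inbound = defaultdict(float)
    for flow in flows:
        weight = flow["distance_km"] if weight_by_distance else 1.0
        amount = flow["tons"] * weight
        outbound[flow["origin"]] += amount
        inbound[flow["dest"]] += amount
    regions = sorted(set(outbound) | set(inbound))
    return {
        region: {
            "outbound": outbound.get(region, 0.0),
            "inbound": inbound.get(region, 0.0),
        }
        for region in regions
    }


tons = region_totals(od_flows)
ton_km = region_totals(od_flows, weight_by_distance=True)
```

Each region then gets a single outbound and inbound number that could be used to shade the corresponding polygon in the FAF5 region shapefile. Weighting by distance turns tons into ton-km, which is likely a closer proxy for transport emissions than tons alone.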
In the meantime, I'm looking into the GREET and SESAME LCA tools to start getting a feel for how we could fold calculations of lifecycle emissions associated with transporting commodities into the visualization of freight flows.
I've also started to collaborate with my colleagues Sydney and Abigail at the MCSC to look into adding the volume-weighted FAF5 network flows to an interactive resilience mapping tool that they've developed, which visualizes datasets of interest for resilience planning in the context of the energy transition (e.g. site selection for carbon capture projects).