Jake Kara [email protected] Dec. 12, 2016
The data here (https://www.fhwa.dot.gov/policyinformation/travel_monitoring/tvt.cfm) is only available as monthly spreadsheets; there are no annual figures at the state level. I even called the Federal Highway Administration's PR office, and they confirmed that the data is not available annualized at the state level, so this scrape was necessary.
If you just want national data, that's in the "historical" spreadsheet (https://www.fhwa.dot.gov/policyinformation/travel_monitoring/historicvmt.xlsx) going back to 1970!
1. Download all XLS/XLSX files from the above URL using requests and BeautifulSoup (/scrape/vmtscrape.py).
2. Parse the files in reverse chronological order until the format changes too much to be worth going back any further.
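The link-collection part of step 1 can be sketched roughly as follows. This is not the code in /scrape/vmtscrape.py: it swaps BeautifulSoup for the standard library's `html.parser` so it runs without extra packages, and the names `SpreadsheetLinkParser` and `extract_spreadsheet_links` are made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

PAGE_URL = "https://www.fhwa.dot.gov/policyinformation/travel_monitoring/tvt.cfm"


class SpreadsheetLinkParser(HTMLParser):
    """Collect the href of every <a> tag that points at an .xls or .xlsx file."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # endswith() accepts a tuple, so both extensions are checked at once
        if href and href.lower().endswith((".xls", ".xlsx")):
            url = urljoin(self.base_url, href)  # resolve relative links
            if url not in self.links:  # skip duplicates, keep page order
                self.links.append(url)


def extract_spreadsheet_links(html, base_url=PAGE_URL):
    """Return absolute URLs of all spreadsheet links found in `html`."""
    parser = SpreadsheetLinkParser(base_url)
    parser.feed(html)
    return parser.links
```

To use it, fetch the page text first (for example with `requests.get(PAGE_URL).text`), pass it to `extract_spreadsheet_links()`, and then download each returned URL into /scrape/data.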
I was able to parse data back to 2004, with only three blank months in 2007. I could have fixed those manually by looking at the spreadsheets, but my colleague only needed data back through 2009, so this was enough.
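Since the point of the scrape is to get annual figures out of monthly ones, blank months matter: a year with gaps will sum too low. Here is a minimal sketch of summing by year while keeping track of which months are missing. The function name `annual_totals` and the `(year, month)` dictionary layout are assumptions for illustration, not what the notebook actually does.

```python
def annual_totals(monthly):
    """Sum monthly values per year and report the months with no data.

    `monthly` maps (year, month) to a number, or to None for a blank month.
    Returns {year: (total, missing_months)}.
    """
    result = {}
    years = sorted({year for year, _ in monthly})
    for year in years:
        total = 0.0
        missing = []
        for month in range(1, 13):
            value = monthly.get((year, month))  # None if blank or absent
            if value is None:
                missing.append(month)
            else:
                total += value
        result[year] = (total, missing)
    return result
```

A year whose `missing_months` list is not empty should be treated as incomplete rather than reported as an annual total.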
To customize this script for your state, just replace "Connecticut" in the process_2016() function.
The process_2016() function processes all spreadsheets that are in the same format as the 2016 files, or are minimally different. I put in a few conditions to handle the files back to 2004.
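To give a rough idea of what picking out one state involves, here is a small sketch that finds a state's row in a sheet that has already been read into a list of rows. It assumes the state name sits in the first cell of its row, which may not match every FHWA layout, and `find_state_row` is a made-up helper, not part of process_2016().

```python
def find_state_row(rows, state="Connecticut"):
    """Return the first row whose first cell names `state`, or None."""
    target = state.strip().lower()
    for row in rows:
        if not row:
            continue  # skip empty rows
        first = row[0]
        # Only compare text cells; ignore blanks and numbers
        if isinstance(first, str) and first.strip().lower() == target:
            return row
    return None
```

Matching on a stripped, lowercased name is a guess at the kind of small formatting differences that can show up between files.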
For earlier years, I would recommend writing a process_2003() function to process the 2003 data and go as far back as possible until you hit hiccups.
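One way to wire in a second parser is to choose it by year and work backwards until something is not handled. The sketch below uses placeholder functions that only borrow the names process_2016() and process_2003(); their signatures are assumptions, and `pick_parser` and `parse_newest_first` are hypothetical helpers. The 2004 cutoff is the one described above.

```python
def process_2016(year, month):
    """Placeholder for the parser that handles the 2004 to 2016 layouts."""
    return ("2016-format", year, month)


def process_2003(year, month):
    """Placeholder for an older-layout parser that has not been written."""
    raise NotImplementedError("older layout not handled")


def pick_parser(year):
    """Choose a parser by year; 2004 is the oldest year the first one handles."""
    return process_2016 if year >= 2004 else process_2003


def parse_newest_first(periods):
    """Parse (year, month) periods newest first; stop at the first unhandled one."""
    parsed = []
    for year, month in sorted(periods, reverse=True):
        try:
            parsed.append(pick_parser(year)(year, month))
        except NotImplementedError:
            break  # the format changed; go no further back
    return parsed
```

Stopping at the first failure keeps the output contiguous, so there is never a parsed older month sitting behind an unparsed newer one.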
- /scrape - files related to scraping the FHWA site for spreadsheets
- /scrape/data - scraped data from the FHWA site. Contains ALL Excel files linked to on that page (as of Dec. 12, 2016)
- State level vehicle miles traveled per DOT.ipynb - Python notebook to stitch the data together
- /output - files generated by State level vehicle miles traveled per DOT.ipynb