NTD Ingestion – Remaining Tables in Dataset #3403
Comments
…djustments and Estimates Data [#3403]
External tables on branch 3403-ntd-api-external-tables
Hey @erikamov, thanks for the update!
This is great, I say we get these tables merged as soon as we can and keep iterating as necessary
Hm, I'm assuming these column names aren't 'BigQuery safe'. If it will get things working, I say we just exclude these columns when listing the column names for the external table, since I don't think they're useful for analysis. If that doesn't work, then I say we hold off on creating external tables out of these files at the moment since these tables weren't actually requested, I just added them for some unrelated testing and decided to keep them in place. We can always reintroduce them with proper handling, if we learn they are desirable.
Let me know if I can help in any way with this one!
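The column-exclusion idea above could be sketched roughly as follows. This is a minimal illustration, not code from the pipeline: it assumes the classic BigQuery naming rule (letters, digits and underscores only, not starting with a digit, at most 300 characters), and the function names and sample column names are hypothetical.

```python
import re

# Assumption: classic BigQuery column-name rule. Reserved prefixes and
# flexible column names are deliberately not handled in this sketch.
_SAFE_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
_MAX_LENGTH = 300


def is_bigquery_safe(name: str) -> bool:
    """Return True if the name matches the assumed naming rule."""
    return len(name) <= _MAX_LENGTH and _SAFE_NAME.fullmatch(name) is not None


def split_columns(columns):
    """Split column names into (kept, excluded) lists, preserving order."""
    kept = [c for c in columns if is_bigquery_safe(c)]
    excluded = [c for c in columns if not is_bigquery_safe(c)]
    return kept, excluded


# Hypothetical column names, for illustration only.
kept, excluded = split_columns(["agency", "ntd id", "1st_quarter", "_notes"])
```

Only the names in `kept` would then be listed in the external table definition; `excluded` can be logged so the dropped columns stay visible.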
I created a PR with the 2022 external tables and it is ready for review: #3465
I am still working on the XLSX tables.
[#3403] Co-authored-by: Erika Pacheco <[email protected]>
2022 External Tables were successfully created on
New PR created for NTD external tables for Complete Monthly Ridership with Adjustments and Estimates Data. The remaining tables will not be created for now. If they are needed later, I kept the work on a branch. Those tables will need some extra work to be created, due to the position of the column names and invalid characters in the names (see previous comments).
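For reference, the two problems mentioned above (header row not in the first row, and invalid characters in column names) could be approached along these lines. This is only a sketch under stated assumptions: the helper names, the "first row with at least N non-empty cells" heuristic, and the sample rows are all hypothetical, and the real XLSX files may need different handling.

```python
import re


def sanitize_column_name(raw: str) -> str:
    """Turn an arbitrary header cell into a letters/digits/underscore name."""
    name = re.sub(r"[^A-Za-z0-9_]", "_", raw.strip())
    name = re.sub(r"_+", "_", name).strip("_").lower()
    # Names must not be empty or start with a digit, so prefix an underscore.
    if not name or name[0].isdigit():
        name = f"_{name}"
    return name


def find_header_row(rows, min_filled=2):
    """Return the index of the first row with at least `min_filled` non-empty cells.

    Assumption: title or blank rows above the real header have fewer filled cells.
    """
    for index, row in enumerate(rows):
        filled = [cell for cell in row if cell not in (None, "")]
        if len(filled) >= min_filled:
            return index
    raise ValueError("no header row found")


# Hypothetical sheet contents, for illustration only.
rows = [
    ["Sheet title", None, None],
    [None, None, None],
    ["NTD ID", "Agency Name", "2022 Total (%)"],
]
header_index = find_header_row(rows)
columns = [sanitize_column_name(cell) for cell in rows[header_index]]
```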
Last PR merged. Next time the |
Tables created and new fields will be added on the next |
DAG updated all tables successfully.
User story / feature request
Part of #3401
As a data engineer, I would like to ingest the data in the form of GCS blob storage and external tables for the remaining tables in the NTD dataset, extending the work found in scrape_ntd.py and annual_database_service.yml (found below).
Building upon the work completed in #3345
Existing NTD patterns:
General Cal-ITP Pipeline Patterns
Acceptance Criteria
I can successfully access the data for the following tables from the https://www.transit.dot.gov/ntd/data-product website in the warehouse as external tables:
Notes