Source file: fix csv schema discovery #15870
/test connector=connectors/source-file
Build Passed
Test summary info:
/publish connector=connectors/source-file
if you have connectors that successfully published but failed definition generation, follow step 4 here |
@davydov-d since this connector is used in cloud, we also need to publish the cloud version of this (i.e. with the |
/publish connector=connectors/source-file-secure
if you have connectors that successfully published but failed definition generation, follow step 4 here |
Actually I had to also bump the source-file-secure version to match the source-file version, so I opened a separate PR to do that here: #15896 |
@lmossman thank you so much, sorry I missed that! |
No problem! It's really easy to miss - definitely a flaw in our current image setup that we will hopefully be addressing soon |
* #174 source file: fix csv schema discovery
* #174 source file: upd changelog
* auto-bump connector version [ci skip]
Co-authored-by: Octavia Squidington III <[email protected]>
What
https://github.com/airbytehq/alpha-beta-issues/issues/174
When discovering the schema of a CSV file, the connector iterates over dataframes and maps each column to its type. The problem is that the type is overwritten on every iteration, so the final schema reflects only the types found in the last dataframe.
How
Take every dataframe into account instead of only the last one. Never narrow a column's data type during discovery; only widen it.
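
The widening rule can be illustrated with a minimal sketch. This is not the connector's actual code: the names `TYPE_RANK`, `widen`, and `discover_schema`, as well as the ordering integer < number < string, are assumptions made for illustration, and each dataframe is represented here simply as a mapping from column name to inferred type.

```python
from typing import Dict, Iterable, Optional

# Hypothetical widening order (an assumption, not taken from the connector):
# a type with a higher rank can represent every value of a lower-ranked type.
TYPE_RANK = {"integer": 0, "number": 1, "string": 2}


def widen(current: Optional[str], new: str) -> str:
    """Return the wider of two types; never narrow an already known type."""
    if current is None:
        return new
    return current if TYPE_RANK[current] >= TYPE_RANK[new] else new


def discover_schema(chunk_types: Iterable[Dict[str, str]]) -> Dict[str, str]:
    """Merge per-dataframe column types into a single schema."""
    schema: Dict[str, str] = {}
    for types in chunk_types:
        for column, new_type in types.items():
            # The buggy behaviour would be `schema[column] = new_type`,
            # which lets the last dataframe decide the final type.
            schema[column] = widen(schema.get(column), new_type)
    return schema


# Two dataframes whose inferred types disagree for "price" and "note".
chunks = [
    {"id": "integer", "price": "number", "note": "string"},
    {"id": "integer", "price": "integer", "note": "integer"},
]
schema = discover_schema(chunks)
print(schema)
```

With plain overwriting, `price` and `note` would both end up as `integer` because the second dataframe is processed last; with widening they keep the wider types seen in the first dataframe.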