🎉 Source Postgres: support all Postgres 14 types #8726
c64f2b4 to 5266017
5266017 to 2fb2000
/test connector=connectors/source-postgres
} else if (value.equalsIgnoreCase("-infinity")) {
  node.put(columnName, Double.NEGATIVE_INFINITY);
} else if (value.equalsIgnoreCase("nan")) {
  node.put(columnName, Double.NaN);
Does JSON have a concept of infinity and NaN? How are these represented when output from the connector itself?
Good catch. Actually, JSON numbers do not support NaN or infinity. To support these three special values, we need to make sure that the destination can handle them. Otherwise, the destination will fail. Too bad, I need to revert this.
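To make the point concrete, here is a minimal sketch using Python's standard `json` module. This is an assumption for illustration only: the connector serializes records with its own Java JSON library, which may behave differently. The column name `db_float_column` is borrowed from later in this thread.

```python
import json

record = {"db_float_column": float("-inf")}

# By default Python emits the token -Infinity, which is not valid JSON
# and which a strict parser on the destination side may reject.
lenient = json.dumps(record)

# With allow_nan=False the encoder refuses to emit such a value.
try:
    json.dumps(record, allow_nan=False)
    strict_error = None
except ValueError as exc:
    strict_error = str(exc)

# Encoding the special value as a string keeps the record valid JSON.
safe = json.dumps({"db_float_column": "-infinity"})
```

The string form round-trips through any compliant JSON parser, which is why the special values would need to travel as strings rather than numbers.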
Like how we currently handle dates, you can output them as a string type with a format hint.
Then have normalization handle the format hint for these special numbers accordingly for each destination.
@ChristopheDuong, nice.
Where is the code that handles this? I'd like to see what the format hint looks like.
In the JSON, it's a "string" type.
Then, in normalization, if we find a format hint, we know it's actually a date, not a string:
airbyte/airbyte-integrations/bases/base-normalization/normalization/transform_catalog/utils.py
Line 33 in c5fc568
and (definition["format"] == "date" or "date" in definition["format"])
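As a rough sketch of how such a format-hint check could be wrapped in a helper: the function name `has_format_hint` and the handling of `type` are hypothetical and not taken from `utils.py`; only the final condition mirrors the quoted line.

```python
def has_format_hint(definition: dict, hint: str) -> bool:
    """Return True if a string-typed property carries the given format hint.

    Hypothetical helper: only handles a plain "type": "string",
    not a list of types such as ["null", "string"].
    """
    return (
        definition.get("type") == "string"
        and "format" in definition
        and (definition["format"] == hint or hint in definition["format"])
    )


date_column = {"type": "string", "format": "date"}
plain_column = {"type": "string"}

print(has_format_hint(date_column, "date"))   # format hint present
print(has_format_hint(plain_column, "date"))  # ordinary string column
```

Because the condition uses a substring test, a format such as "date-time" also matches the "date" hint.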
Example of catalog.json with date:
Line 17 in c5fc568
"format": "date"
So you can have a special string-float type, instead of JSON's number type, in order to encode special float values:
"db_float_column": {
"type": "string",
"format": "float"
}
And normalization can handle it in SQL. For example, in BigQuery, based on https://stackoverflow.com/a/53692265:
there is no literal representation of NaN or infinity, but the following case-insensitive strings can be explicitly cast to float:
"NaN"
"inf" or "+inf"
"-inf"
case
  when db_float_column = '-infinity' then cast("-inf" as float)
  when db_float_column = '+infinity' then cast("+inf" as float)
  else cast(db_float_column as float) end
or differently for another destination
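One way the per-destination variation could be expressed is a small helper that builds the `case` expression from a mapping of source strings to destination literals. This is only a sketch: `special_float_case`, `BIGQUERY_LITERALS`, and the `sql_type` parameter are hypothetical names, and the only mapping shown is the BigQuery one quoted above.

```python
from typing import Mapping


def special_float_case(column: str, literals: Mapping[str, str], sql_type: str) -> str:
    """Build a CASE expression that casts a string-encoded float column.

    `literals` maps the string emitted by the source (e.g. '-infinity')
    to the literal the destination accepts in a cast (e.g. '-inf').
    """
    branches = [
        f"when {column} = '{source_value}' then cast('{dest_literal}' as {sql_type})"
        for source_value, dest_literal in literals.items()
    ]
    return "case " + " ".join(branches) + f" else cast({column} as {sql_type}) end"


# Mapping taken from the BigQuery example in this thread.
BIGQUERY_LITERALS = {"-infinity": "-inf", "+infinity": "+inf"}

bigquery_sql = special_float_case("db_float_column", BIGQUERY_LITERALS, "float")
print(bigquery_sql)
```

Another destination would supply its own mapping and type name. The sketch interpolates names directly into SQL, so it assumes trusted column names.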
…ns with JDBCType.ARRAY (#8749)
* support array for jdbc sources
* fixed PR comments, added test cases
* added more elements for test case
* Fix test case
* add array test case for JdbcSourceOperations
Co-authored-by: Liren Tu <[email protected]>
Postgres source cannot handle these special values yet.
See https://github.com/airbytehq/airbyte/issues/8902
/test connector=connectors/source-postgres
This reverts commit 3bee7d1.
/test connector=connectors/source-postgres
/publish connector=connectors/source-postgres
* Add skeleton to support all postgres types
* Consolidate type tests
* Fix corner cases
* Bump postgres version
* Add tests for time and timetz
* Format code
* Revert date to timestamp
* Update comment
* Fix unit tests
* 🐛 Jdbc sources: switch from "string" to "array" schema type for columns with JDBCType.ARRAY (airbytehq#8749)
  * support array for jdbc sources
  * fixed PR comments, added test cases
  * added more elements for test case
  * Fix test case
  * add array test case for JdbcSourceOperations
  Co-authored-by: Liren Tu <[email protected]>
* Revert changes to support special number values
  Postgres source cannot handle these special values yet
  See https://github.com/airbytehq/airbyte/issues/8902
* Revert infinity and nan assertion in unit tests
  This reverts commit 3bee7d1.
* Update documentation
* Bump postgres source version in seed
Co-authored-by: Yurii Bidiuk <[email protected]>
What
How
JDBCType for most of the types. For corner cases, we check the column type name returned from JDBC, and treat them differently.
Recommended reading order
PostgresSourceDatatypeTest.java
PostgresSourceOperations.java
🚨 User Impact 🚨
Pre-merge Checklist
Community member or Airbyter
airbyte_secret
./gradlew :airbyte-integrations:connectors:<name>:integrationTest.
README.md
bootstrap.md. See description and examples
docs/integrations/<source or destination>/<name>.md including changelog. See changelog example
Airbyter
If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.
/test connector=connectors/<name> command is passing.
/publish command described here