[feat] improvements to duckDB column type handling #2970

Merged 28 commits on Feb 18, 2025

igorDykhta (Collaborator) commented Feb 2, 2025

  • This PR aims to preserve column types across the different ways data is ingested into Kepler and DuckDB.

Goals:

  • Timestamps stored as strings in Arrow tables are recognized as timestamps.
  • Apply extra metadata from table.schema.metadata (GeoParquet files).
  • DuckDB geometry is automatically cast to WKB and properly marked with geoarrow extensions.
  • Consolidate DuckDB column types with the Arrow table types of query results.
  • Apply the extra logic only to the last SELECT query.
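
The last two goals can be illustrated with a minimal sketch. This is not the code from this PR: `splitStatements` and `castGeometryInLastSelect` are hypothetical helper names, the list of geometry columns is assumed to be known by the caller, and the statement splitting is deliberately naive. It assumes the DuckDB spatial extension is loaded so that `ST_AsWKB` is available.

```typescript
// Hypothetical helpers for illustration only; not the PR's implementation.

// Naive split on ';'. Assumption: no semicolons inside string literals or comments.
function splitStatements(sql: string): string[] {
  return sql
    .split(';')
    .map(statement => statement.trim())
    .filter(statement => statement.length > 0);
}

// Wraps only the final statement, and only when it is a plain SELECT,
// so that the listed geometry columns come back as WKB blobs.
// Earlier statements (CREATE, INSERT, ...) are returned untouched.
function castGeometryInLastSelect(sql: string, geometryColumns: string[]): string[] {
  const statements = splitStatements(sql);
  const lastIndex = statements.length - 1;
  const last = statements[lastIndex];

  // Assumption: queries starting with WITH or other keywords are left as-is.
  if (last === undefined || geometryColumns.length === 0 || !/^select\b/i.test(last)) {
    return statements;
  }

  // SELECT * REPLACE (...) keeps every column and swaps in the cast ones.
  // Assumption: column names contain no double quotes.
  const replacements = geometryColumns
    .map(column => `ST_AsWKB("${column}") AS "${column}"`)
    .join(', ');

  statements[lastIndex] = `SELECT * REPLACE (${replacements}) FROM (${last}) AS last_query`;
  return statements;
}
```

For a script such as `CREATE TABLE t AS SELECT 1; SELECT * FROM t;` only the second statement is rewritten. A production version would need a real SQL tokenizer instead of splitting on `;`.
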
- Move geoarrow constants to the constants module.
- Add getSampleForTypeAnalyzeArrow so that type analysis supports Arrow data and does not fail on it.
- arrowSchemaToFields accepts extra info from DuckDB table schemas: the JSON type gets the GEOMETRY_FROM_STRING type, GEOMETRY with geoarrow metadata gets the GEOMETRY type, timestamp ...
- Fix in validateInputData: check analyzerType only for the current field.
- Fix in validateInputData: support Arrow input data.
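
As a rough sketch of the arrowSchemaToFields rules listed above, the two mappings that are spelled out could look like the following. Everything here is an assumption made for illustration: `ColumnInfo`, `typeOverrideFor`, and the string labels are hypothetical and are not Kepler's actual types or constants. The only externally defined names are the `ARROW:extension:name` field metadata key and the `geoarrow.` extension name prefix from the GeoArrow convention. The timestamp rule is left out because the description above truncates it.

```typescript
// Hypothetical types and names for illustration only.
const EXTENSION_NAME_KEY = 'ARROW:extension:name';
const GEOARROW_PREFIX = 'geoarrow.';

type TypeOverride = 'GEOMETRY' | 'GEOMETRY_FROM_STRING' | null;

interface ColumnInfo {
  // Column type as reported by DuckDB for the source table, e.g. 'JSON' or 'GEOMETRY'.
  duckDbType: string;
  // Arrow field metadata (key/value strings) from the query result schema.
  arrowMetadata: Map<string, string>;
}

// Returns an override for the two cases named in the PR description,
// or null so the caller can fall back to its regular type analysis.
function typeOverrideFor(column: ColumnInfo): TypeOverride {
  const duckDbType = column.duckDbType.trim().toUpperCase();

  if (duckDbType === 'JSON') {
    return 'GEOMETRY_FROM_STRING';
  }

  const extensionName = column.arrowMetadata.get(EXTENSION_NAME_KEY) ?? '';
  if (duckDbType === 'GEOMETRY' && extensionName.startsWith(GEOARROW_PREFIX)) {
    return 'GEOMETRY';
  }

  return null;
}
```

Returning `null` rather than a default type keeps this step additive: columns without DuckDB-specific hints keep whatever type the existing analysis assigns.
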

@igorDykhta igorDykhta requested a review from ilyabo February 2, 2025 22:55

netlify bot commented Feb 2, 2025

Deploy Preview for kepler-duckdb ready!

Name Link
🔨 Latest commit 3860fbd
🔍 Latest deploy log https://app.netlify.com/sites/kepler-duckdb/deploys/67b4aaf40535820008d735f7
😎 Deploy Preview https://deploy-preview-2970--kepler-duckdb.netlify.app

@igorDykhta igorDykhta self-assigned this Feb 6, 2025
@igorDykhta igorDykhta marked this pull request as ready for review February 18, 2025 03:31
Signed-off-by: Ihor Dykhta <[email protected]>
@igorDykhta igorDykhta changed the base branch from igr/duckdb-demo-branch to master February 18, 2025 15:52
Signed-off-by: Ihor Dykhta <[email protected]>
@igorDykhta igorDykhta merged commit 221b243 into master Feb 18, 2025
8 checks passed
@igorDykhta igorDykhta deleted the igr/improvements-to-duckdb-plugin-types branch February 18, 2025 16:30
2 participants