feat: add connector for Parseable #32052

Merged
merged 8 commits into from
Jan 31, 2025

AdheipSingh
Contributor

@AdheipSingh AdheipSingh commented Jan 30, 2025

SUMMARY

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

[Screenshot 2025-01-31 at 3 23 50 AM]
[Screenshot 2025-01-31 at 3 25 01 AM]

TESTING INSTRUCTIONS

  • Manually tested with demo.parseable.com.
  • Pytest

ADDITIONAL INFORMATION

  • Has associated issue:
  • Required feature flags:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API

@dosubot dosubot bot added data:connect Namespace | Anything related to db connections / integrations enhancement:db Suggest new DB connections labels Jan 30, 2025

@korbit-ai korbit-ai bot left a comment

Review by Korbit AI

Korbit automatically attempts to detect when you fix issues in new commits.
Category        Issue
Functionality   Missing NULL Handling in Time Expressions

Files scanned
superset/db_engine_specs/parseable.py

Comment on lines +22 to +32
_time_grain_expressions = {
    None: "{col}",
    TimeGrain.SECOND: "date_trunc('second', {col})",
    TimeGrain.MINUTE: "date_trunc('minute', {col})",
    TimeGrain.HOUR: "date_trunc('hour', {col})",
    TimeGrain.DAY: "date_trunc('day', {col})",
    TimeGrain.WEEK: "date_trunc('week', {col})",
    TimeGrain.MONTH: "date_trunc('month', {col})",
    TimeGrain.QUARTER: "date_trunc('quarter', {col})",
    TimeGrain.YEAR: "date_trunc('year', {col})",
}

Missing NULL Handling in Time Expressions category Functionality

What is the issue?

The time grain expressions don't account for possible NULL values in timestamp columns, which could cause queries to fail.

Why this matters

Queries may fail when processing NULL timestamp values, affecting data analysis and visualization reliability.

Suggested change ∙ Feature Preview

Add NULL handling to the time grain expressions:

_time_grain_expressions = {
    None: "{col}",
    TimeGrain.SECOND: "CASE WHEN {col} IS NULL THEN NULL ELSE date_trunc('second', {col}) END",
    TimeGrain.MINUTE: "CASE WHEN {col} IS NULL THEN NULL ELSE date_trunc('minute', {col}) END",
    # ... Apply similar pattern to other time grains
}

Contributor

@github-actions github-actions bot left a comment

Congrats on making your first PR and thank you for contributing to Superset! 🎉 ❤️

We hope to see you in our Slack community too! Not signed up? Use our Slack App to self-register.

codecov bot commented Jan 31, 2025

Codecov Report

Attention: Patch coverage is 90.90909% with 3 lines in your changes missing coverage. Please review.

Project coverage is 83.46%. Comparing base (b12f515) to head (726c9ed).
Report is 12 commits behind head on master.

Files with missing lines                 Patch %   Lines
superset/db_engine_specs/parseable.py    90.90%    3 Missing ⚠️
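
The patch figures above imply how many changed lines Codecov counted. A small sketch of that arithmetic, assuming patch coverage is simply covered lines divided by total changed lines (the variable names are illustrative):

```python
# Figures quoted in the report above.
missing_lines = 3
patch_coverage = 90.90909 / 100

# missing = total * (1 - coverage), so total = missing / (1 - coverage).
total_lines = round(missing_lines / (1 - patch_coverage))
covered_lines = total_lines - missing_lines

print(total_lines, covered_lines)
print(f"{covered_lines / total_lines * 100:.5f}%")
```

Under that assumption the patch touches 33 measurable lines, 30 of which are covered.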
Additional details and impacted files
@@             Coverage Diff             @@
##           master   #32052       +/-   ##
===========================================
+ Coverage        0   83.46%   +83.46%     
===========================================
  Files           0      545      +545     
  Lines           0    39036    +39036     
===========================================
+ Hits            0    32582    +32582     
- Misses          0     6454     +6454     
Flag       Coverage Δ
hive       48.49% <63.63%> (?)
mysql      75.87% <63.63%> (?)
postgres   75.94% <63.63%> (?)
presto     53.02% <63.63%> (?)
python     83.46% <90.90%> (?)
sqlite     75.45% <63.63%> (?)
unit       61.00% <90.90%> (?)

Member

@villebro villebro left a comment

Nice! A few observations:

  • Let's remove the pin to a specific version
  • There seem to be some linting issues; please check the CI logs for the specific errors
  • Can you add a mention of this db to the docs? Docs are under /docs (you can check how other db types are represented there)

FYI we're likely cutting the 5.0 release next week, so if you can fix these issues, we can probably just squeeze this into this version.

pyproject.toml Outdated
@@ -156,6 +156,7 @@ ocient = [
"geojson",
]
oracle = ["cx-Oracle>8.0.0, <8.1"]
parseable = ["sqlalchemy-parseable==0.1.3"]
Member

Let's not pin this, but rather define upper/lower bounds if needed.

@github-actions github-actions bot added the doc Namespace | Anything related to documentation label Jan 31, 2025
@AdheipSingh
Contributor Author

@villebro I have addressed your review comments. Thanks for the heads-up on the release cycle. Would love to get this in.

Member

@villebro villebro left a comment

LGTM

@villebro
Member

@AdheipSingh you may want to install the pre-commit hooks as instructed in the contributing docs (https://superset.apache.org/docs/contributing/development#git-hooks). They automatically fix formatting issues and catch typical type errors during git commit.

@AdheipSingh
Contributor Author

Apologies, I'm a newbie to Python. I will install the Git hook going forward. I installed pre-commit locally to fix this.

Screenshot 2025-02-01 at 3 37 36 AM

@sadpandajoe
Member

sadpandajoe commented Jan 31, 2025

@betodealmeida will probably know more, but do we need to add anything for dialect parsing here:

SQLGLOT_DIALECTS = {

@villebro
Member

villebro commented Jan 31, 2025

@betodealmeida will probably know more, but do we need to add anything for dialect parsing here:

SQLGLOT_DIALECTS = {

@sadpandajoe I don't see Parseable here, so I assume we'll just fall back to the default dialect for now: https://github.com/tobymao/sqlglot/tree/main/sqlglot/dialects
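
The fallback described here can be sketched as a plain dictionary lookup. The mapping contents, the string dialect names, and `resolve_dialect` are hypothetical stand-ins, not Superset's actual code; the only behavior taken from the thread is that an engine missing from `SQLGLOT_DIALECTS` ends up on the default dialect.

```python
from typing import Dict, Optional

# Hypothetical excerpt: engine name -> dialect name. The real mapping is
# truncated in the thread, so these two entries are illustrative only.
SQLGLOT_DIALECTS: Dict[str, str] = {
    "postgresql": "postgres",
    "mysql": "mysql",
}


def resolve_dialect(engine: str) -> Optional[str]:
    # Engines without an entry yield None, which a caller can treat as
    # "use the parser's default dialect".
    return SQLGLOT_DIALECTS.get(engine)


print(resolve_dialect("postgresql"))
print(resolve_dialect("parseable"))
```

With no `parseable` key, the lookup yields `None`, so no explicit entry is needed unless the default dialect mis-parses Parseable queries.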

@AdheipSingh
Contributor Author

@betodealmeida will probably know more, but do we need to add anything for dialect parsing here:

SQLGLOT_DIALECTS = {

Interesting!
I actually followed this guide.
Let me know if we need to add it, though in my local testing I didn't face any issues.

@villebro villebro merged commit 9e5876d into apache:master Jan 31, 2025
46 checks passed
Labels
data:connect (Namespace | Anything related to db connections / integrations)
doc (Namespace | Anything related to documentation)
enhancement:db (Suggest new DB connections)
size/L
3 participants