Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix upload/delete transactions in Snowflake backend #10738

Merged
merged 34 commits into from
Aug 6, 2024

Conversation

radeusgd
Copy link
Member

@radeusgd radeusgd commented Aug 2, 2024

Pull Request Description

Fixes #10609 by rewriting all our upload-related operations to rely on DDL_Transaction - an abstraction that handles 'transactionality' of CREATE TABLE statements dependent on if a given backend allows DDLs within transactions or not (if not it emulates transactionality by creating the tables outside of transaction and then dropping them on rollback).

Important Notes

Checklist

Please ensure that the following checklist has been satisfied before submitting the PR:

  • The documentation has been updated, if necessary.
  • Screenshots/screencasts have been attached, if there are any visual changes. For interactive or animated visual changes, a screencast is preferred.
  • All code follows the
    Scala,
    Java,
    TypeScript,
    and
    Rust
    style guides. In case you are using a language not listed above, follow the Rust style guide.
  • Unit tests have been written where possible.

@radeusgd radeusgd added the CI: No changelog needed Do not require a changelog entry for this PR. label Aug 2, 2024
@radeusgd radeusgd self-assigned this Aug 2, 2024
Comment on lines +27 to +38
source = Delete_Rows_Dry_Run_Source.prepare connection key_values_to_delete key_columns
source.run_in_transaction connection source_db_table_uploader-> dry_run_message_suffix->
## We check NULL values in source table (key_values_to_delete) and only if that passes we
perform the actual upload into the temporary table.
Both check and upload are done in a transaction, so we can be sure there are no external changes to the table visible.
The check is needed as otherwise the upload would fail due to NULL constraint violations.
check_for_null_keys key_values_to_delete key_columns <|
source_db_table = source_db_table_uploader.materialize
check_duplicate_key_matches_for_delete target_table source_db_table key_columns allow_duplicate_matches <|
affected_row_count = target_table.join source_db_table on=key_columns join_kind=Join_Kind.Inner . row_count
warning = Dry_Run_Operation.Warning "Only a dry run of `delete_rows` was performed - the target table has not been changed. Press the Write button ▶ to update the actual table."+dry_run_message_suffix
Warning.attach warning affected_row_count
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO: maybe we could actually retro-actively catch the NULL constraint violation instead of a precondition check?

That would save us one additional query.

But that would not work in dry-run of DB table as that mode does not do any upload at all. So it would require separate logic. May be worth considering, but I'd avoid this in this PR as that will mix things too much - right now I'm just porting the existing check logic to the new structure.

If we want, I could create a ticket about optimizing it. But I feel like this is likely good enough anyway.

(Btw. in-memory it is actually better to do the check locally, before upload, as that is going to be faster - it is only ever cheaper to do this check on upload when dealing with a query in non-dry-run mode).

@radeusgd radeusgd marked this pull request as ready for review August 2, 2024 16:36
@radeusgd radeusgd added the CI: Ready to merge This PR is eligible for automatic merge label Aug 6, 2024
@mergify mergify bot merged commit 3fd1464 into develop Aug 6, 2024
36 of 37 checks passed
@mergify mergify bot deleted the wip/radeusgd/10609-snowflake-transactions branch August 6, 2024 08:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CI: No changelog needed Do not require a changelog entry for this PR. CI: Ready to merge This PR is eligible for automatic merge
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement transactional Table Upload / Update / Delete Rows for Snowflake backend
3 participants