-
Notifications
You must be signed in to change notification settings - Fork 326
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix upload/delete transactions in Snowflake backend #10738
Conversation
…ion abstraction with table cleanup
source = Delete_Rows_Dry_Run_Source.prepare connection key_values_to_delete key_columns | ||
source.run_in_transaction connection source_db_table_uploader-> dry_run_message_suffix-> | ||
## We check NULL values in source table (key_values_to_delete) and only if that passes we | ||
perform the actual upload into the temporary table. | ||
Both check and upload are done in a transaction, so we can be sure there are no external changes to the table visible. | ||
The check is needed as otherwise the upload would fail due to NULL constraint violations. | ||
check_for_null_keys key_values_to_delete key_columns <| | ||
source_db_table = source_db_table_uploader.materialize | ||
check_duplicate_key_matches_for_delete target_table source_db_table key_columns allow_duplicate_matches <| | ||
affected_row_count = target_table.join source_db_table on=key_columns join_kind=Join_Kind.Inner . row_count | ||
warning = Dry_Run_Operation.Warning "Only a dry run of `delete_rows` was performed - the target table has not been changed. Press the Write button ▶ to update the actual table."+dry_run_message_suffix | ||
Warning.attach warning affected_row_count |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO: maybe we could actually retro-actively catch the NULL constraint violation instead of a precondition check?
That would save us one additional query.
But that would not work in dry-run of DB table as that mode does not do any upload at all. So it would require separate logic. May be worth considering, but I'd avoid this in this PR as that will mix things too much - right now I'm just porting the existing check logic to the new structure.
If we want, I could create a ticket about optimizing it. But I feel like this is likely good enough anyway.
(Btw. in-memory it is actually better to do the check locally, before upload, as that is going to be faster - it is only ever cheaper to do this check on upload when dealing with a query in non-dry-run mode).
Pull Request Description
Fixes #10609 by rewriting all our upload-related operations to rely on
DDL_Transaction
- an abstraction that handles 'transactionality' ofCREATE TABLE
statements dependent on if a given backend allows DDLs within transactions or not (if not it emulates transactionality by creating the tables outside of transaction and then dropping them on rollback).Important Notes
Checklist
Please ensure that the following checklist has been satisfied before submitting the PR:
Scala,
Java,
TypeScript,
and
Rust
style guides. In case you are using a language not listed above, follow the Rust style guide.