-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
backupccl: handle range keys in BACKUP #84875
Merged
Merged
Changes from all commits
Commits
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,84 @@ | ||
# Tests that Backups without Revisions History and Restore properly handle | ||
# range keys | ||
|
||
new-server name=s1 | ||
---- | ||
|
||
exec-sql | ||
CREATE DATABASE orig; | ||
USE orig; | ||
CREATE TABLE foo (i INT PRIMARY KEY, s STRING); | ||
INSERT INTO foo VALUES (1, 'x'),(2,'y'); | ||
CREATE TABLE baz (i INT PRIMARY KEY, s STRING); | ||
INSERT INTO baz VALUES (11, 'xx'),(22,'yy'); | ||
---- | ||
|
||
# Ensure a full backup properly captures range keys | ||
# - with foo, delete then insert, and ensure no original data surfaces in restore | ||
# - with baz: chill for now | ||
|
||
kv request=DeleteRange target=foo | ||
---- | ||
|
||
exec-sql | ||
INSERT INTO foo VALUES (3,'z'); | ||
---- | ||
|
||
exec-sql | ||
BACKUP INTO 'nodelocal://0/test-root/'; | ||
---- | ||
|
||
exec-sql | ||
RESTORE DATABASE orig FROM LATEST IN 'nodelocal://0/test-root/' with new_db_name='orig1'; | ||
---- | ||
|
||
query-sql | ||
SELECT count(*) from orig1.foo; | ||
---- | ||
1 | ||
|
||
query-sql | ||
SELECT count(*) from orig1.baz; | ||
---- | ||
2 | ||
|
||
exec-sql | ||
DROP DATABASE orig1 CASCADE | ||
---- | ||
|
||
# Ensure incremental backup without revision history | ||
# handles range tombstones: | ||
# - with foo, insert and ensure latest data from previous backup surfaces in RESTORE | ||
# - with baz, delete then insert, and ensure no data from previous backup surfaces in RESTORE | ||
|
||
exec-sql | ||
INSERT INTO foo VALUES (4,'a'),(5,'b'); | ||
---- | ||
|
||
kv request=DeleteRange target=baz | ||
---- | ||
|
||
exec-sql | ||
INSERT INTO baz VALUES (33,'zz'); | ||
---- | ||
|
||
exec-sql | ||
BACKUP INTO LATEST IN 'nodelocal://0/test-root/'; | ||
---- | ||
|
||
exec-sql | ||
RESTORE DATABASE orig FROM LATEST IN 'nodelocal://0/test-root/' with new_db_name='orig1'; | ||
---- | ||
|
||
query-sql | ||
SELECT count(*) from orig1.foo | ||
---- | ||
3 | ||
|
||
query-sql | ||
SELECT count(*) from orig1.baz | ||
---- | ||
1 | ||
|
||
|
||
|
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It makes me a little sad we have to open up an iterator on this twice, and read the footer, the index, etc again just to change the iter opts.
I wonder if it'd make sense to include "OnlyPointKeys" in the ExportResponse or something the client could use as an optimization?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have the infrastructure in place to reconfigure an iterator, just need to expose the API for it. Want me to submit a PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wait is the idea to add
KeyTypes: storage.IterNonInterleavedPointAndRangeKeys,
i.e. a single iterator will surface all point keys and then all range keys in the given span? That seems great to me.I'm not sure if ExportRequest needs to change, but I'll leave that decision to Erik.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although I'm unsure how big a perf improvement we would get. I'm going to benchmark the current PR, as is, to understand how big of a regression there is by introducing these two iterators here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nah, I think David's idea is for the
ExportResponse
to contain information about whether the SST contains point and/or range keys, and my suggestion is to exportMVCCIterator.SetOptions()
so you can doiter.SetOptions(IterOptions{KeyTypes: IterKeyTypeRangesOnly}))
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
aha, makes sense! That seems like a worthwhile quick win -- especially for backups without revision history, which will never export range keys.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, exactly, if we somehow know we didn't export a range key, we could just skip re-opening/reconfiguring the iterator entirely.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's possible that we're already tracking that metadata in the SST itself though. Can we get that from the SST reader @jbowens?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whether or not the table has a range key? Yeah, there's a sstable property recording the count. We can test
reader.Properties.NumRangeKeys() > 0