Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

backupccl: prevent restoring a newer backup #61737

Merged
merged 1 commit into from
Mar 18, 2021

Conversation

dt
Copy link
Member

@dt dt commented Mar 9, 2021

This adds a check during RESTORE that the backup(s) being restored were produced by a version
no newer than the current version. The build info for the version that produced a
backup is persisted in the manifest and is used for this check.

Note: this is checking the build version, not the cluster version. A follow-up
change should likely be added to persist cluster versions and check them as well
so that most-migration data cannot be restored to a pre-migration cluster even if
both are on the same binary version.

Release note (ops change): RESTORE now refuses to restore BACKUPs made by newer versions.

@dt dt requested review from pbardea, Elliebababa and a team March 9, 2021 21:09
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@dt dt force-pushed the restore-version-check branch from 1e051d5 to d8e5802 Compare March 9, 2021 21:12
@pbardea
Copy link
Contributor

pbardea commented Mar 9, 2021

I have a couple of questions (and want to make sure that my understanding of what we do in these scenarios is sane):

  1. Since patch releases are backwards compatible, I would expect that 20.2.4 backups could be restored on 20.2.1 nodes, correct?
  2. When considering mixed-version clusters are we okay with disallowing a restore of a backup made by a node with a later build number? (E.g. mixed version 20.2, 21.1 cluster - a backup taken by a 20.2 node should be restorable by the 21.1 node given the cluster version hasn't been upgraded.) I don't think we currently have any issues allowing such a restore from happening (although, granted is not something super well tested), and it is before the "burn the boats" cluster upgrade finalization step, so it may be a surprise to users.

I think this is a good indication that we may actually need to store the cluster version to do this check? And I think it would be good to start persisting that in 21.1 before it sets sail.

@dt
Copy link
Member Author

dt commented Mar 9, 2021

re:1) ah, yeah, good point, I should just be comparing Major. Done.

  1. Yeah, I have another branch with the cluster-version persisted since, as I mention in the commit here, I think we also need that, since even within the same binary running the whole time we could lose compatibility once you run a one-way migrate that should block restoring to pre-migrate version. But I wanted to do that separately given existing backups don't persist it. I'm less sure that I like the strictness that a backups taken after the binary upgrade but before the logical upgrade will only be restorable by the new binary -- that kinda feels like it defeats the point of the soft-upgrade window.

@blathers-crl
Copy link

blathers-crl bot commented Mar 9, 2021

Thank you for updating your pull request.

My owl senses detect your PR is good for review. Please keep an eye out for any test failures in CI.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

@blathers-crl blathers-crl bot added the O-community Originated from the community label Mar 9, 2021
@pbardea
Copy link
Contributor

pbardea commented Mar 9, 2021

I'm less sure that I like the strictness that a backups taken after the binary upgrade but before the logical upgrade will only be restorable by the new binary -- that kinda feels like it defeats the point of the soft-upgrade window.

Agreed, and just to spell it out completely, I think that this means that we can't do this build version check in that case right? Since in the mixed-version situation I described a new node may take a backup with build tag 21.1, but that backup would not be restorable by the older nodes in the same cluster that would have build version 20.2.

@dt dt force-pushed the restore-version-check branch from 0416a80 to 3e5d2d1 Compare March 9, 2021 23:02
@dt
Copy link
Member Author

dt commented Mar 9, 2021

Thought about this a little more and opted to just do the cluster version one now instead. I'd initially held off on it, thinking we'd wait to use it instead of build info in a later version once it'd been being persisted for a release, but on reflection it doesn't matter: if only new nodes are doing the check, it is fine if they look for a field only written by nodes at least that new since they're specifically look for newer nodes anyway. Reworked to be all cluster-version.

Copy link
Contributor

@pbardea pbardea left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with minor testing comment

sqlDB.ExpectErr(t, "backup from version 99.1 is newer than current version", `RESTORE DATABASE r1 FROM 'nodelocal://0/too_new'`)

// Bump the version down and write it back out to make it look older.
backupManifest.ClusterVersion = roachpb.Version{Major: 20, Minor: 2, Internal: 2}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we also add a test case where we nil out the cluster version to simulate earlier backups?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good idea. Done.

pkg/ccl/backupccl/restore_planning.go Show resolved Hide resolved
@dt dt force-pushed the restore-version-check branch from 3e5d2d1 to a55f668 Compare March 17, 2021 21:32
This adds a check during RESTORE that the backup(s) being restored were
produced by a cluster with a cluster version no newer than the current
version. The cluster version was not previously persisted in the backup
so this new field is only set on new backups, but as this check is only
in new versions, that is fine.

Release note (ops change): RESTORE now refuses to restore BACKUPs made by newer versions.
@dt dt force-pushed the restore-version-check branch from a55f668 to 2c4257d Compare March 18, 2021 14:02
@dt
Copy link
Member Author

dt commented Mar 18, 2021

TFTR!

bors r+

@craig
Copy link
Contributor

craig bot commented Mar 18, 2021

Build succeeded:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
O-community Originated from the community
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants