release-2.1: storage: preserve consistency when applying widening preemptive snapshots #29694
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Backport 1/1 commits from #29677.
/cc @cockroachdb/release
Merges can cause preemptive snapshots that widen existing replicas. For
example, consider the following sequence of events:
collected.
In step 3, S will receive a preemptive snapshot for A that requires
widening its existing replica, thanks to the intervening merge.
Problematically, the code to check whether this widening was possible,
in Store.canApplySnapshotLocked, was incorrectly mutating the range
descriptor in the snapshot header! Applying the snapshot would then fail
to clear all of the data from the old incarnation of the replica, since
the bounds on the range deletion tombstone were wrong. This often
resulted in replica inconsistency. Plus, the in-memory copy of the range
descriptor would be incorrect until the next descriptor update--though
this usually happened quickly, as the replica would apply the change
replicas command, which updates the descriptor, soon after applying the
preemptive snapshot.
To fix the problem, teach Store.canApplySnapshotLocked to make a copy of
the range descriptor before it mutates it.
To prevent regressions, add an assertion that a range's start key is
never changed to the descriptor update path. With this assertion in
place, but without the fix itself,
TestStoreRangeMergeReadoptedLHSFollower reliably fails.
Fixes #29252.
Release note: None