kvserver: reduce `SysBytes` MVCC stats race during merges #99017

erikgrinaker · 2023-03-20T12:03:00Z

During a range merge, we subsume the RHS and ship its MVCC stats via the merge trigger to add them to the LHS stats. Since the RHS range ID-local keys aren't present in the merged range, the merge trigger previously computed the stats for these keys and subtracted them from the given stats. However, this could race with a lease request, which ignores latches and writes to the range ID-local keyspace, resulting in incorrect `SysBytes` MVCC stats.

This patch instead computes the range ID-local MVCC stats during subsume and sends them via a new `RangeIDLocalMVCCStats` field. This still doesn't guarantee that they're consistent with the RHS's in-memory stats, since the latch-ignoring lease request can update these independently of the subsume request's engine snapshot. However, it substantially reduces the likelihood of this race.
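As a rough illustration of the bookkeeping described above (not the actual patch), the sketch below models the stats as a single `SysBytes` counter. The `Stats` and `SubsumeResponse` types and the `ApplyMerge` function are hypothetical stand-ins invented for this example; only the `RangeIDLocalMVCCStats` field name comes from the description.

```go
package main

// Stats is a hypothetical, heavily reduced stand-in for MVCC stats.
// Only the SysBytes counter discussed in this PR is modeled.
type Stats struct {
	SysBytes int64
}

// Add folds another set of stats into s.
func (s *Stats) Add(o Stats) { s.SysBytes += o.SysBytes }

// Subtract removes another set of stats from s.
func (s *Stats) Subtract(o Stats) { s.SysBytes -= o.SysBytes }

// SubsumeResponse is a hypothetical model of what the RHS hands over:
// its full stats plus the share that belongs to range ID-local keys,
// both captured while evaluating the subsume request.
type SubsumeResponse struct {
	MVCCStats             Stats
	RangeIDLocalMVCCStats Stats
}

// ApplyMerge sketches the merge trigger's arithmetic: the RHS's range
// ID-local share is dropped because those keys do not exist in the
// merged range, and the remainder is added to the LHS stats.
func ApplyMerge(lhs Stats, resp SubsumeResponse) Stats {
	rhs := resp.MVCCStats
	rhs.Subtract(resp.RangeIDLocalMVCCStats)
	lhs.Add(rhs)
	return lhs
}
```

For example, an LHS with 100 system bytes merged with an RHS reporting 60 system bytes, 10 of them range ID-local, ends up with 150. The result is only correct if both RHS values describe the same moment in time, which is exactly what the latch-ignoring lease request can violate.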

While it would be possible to prevent this race entirely by introducing additional synchronization between lease requests and merge application, this would likely come with significant additional complexity, which doesn't seem worth it just to avoid `SysBytes` being a few bytes wrong. The main fallout is a log message when the consistency checker detects the stats mismatch, and potential test flakes. This PR therefore settles for best-effort prevention.

Resolves #93896.
Resolves #94876.
Resolves #99010.

Epic: none
Release note: None

blathers-crl · 2023-03-20T12:03:04Z

It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR?

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

erikgrinaker · 2023-03-20T12:07:09Z

This will be racy too, because we can still apply a lease request concurrently with the subsume request, which will affect the in-memory stats. Will think about something better.

erikgrinaker · 2023-03-20T20:17:51Z

Decided to live with the race for now, and settle for significantly reducing the odds of it happening.

erikgrinaker · 2023-03-21T08:03:27Z

Did 50,000 runs over 12 hours overnight that only exercised splits and merges, with no failures. Previously, it would flake within 10-20 minutes. So this seems like a substantial improvement.

pav-kv

LGTM. I get that this reduces the likelihood of the race, but not sure why. Could you explain a bit why doing so in subsume is better than in merge trigger? Does this reduce the window of time within which the race can happen?

erikgrinaker · 2023-03-27T13:19:16Z

bors r+

> Could you explain a bit why doing so in subsume is better than in merge trigger? Does this reduce the window of time within which the race can happen?

Yes, exactly. During subsume evaluation, the race window is between when we acquire the engine read snapshot and when we fetch the in-memory MVCC stats. During the merge trigger, we additionally have to wait for the subsume command to be replicated to and applied on all replicas, and then for the final merge commit request to be evaluated.
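To make the two windows concrete, here is a toy, single-threaded model (an illustrative sketch, not CockroachDB code). The `replica` type, the step constants, and `mergedSysBytesError` are all hypothetical names invented for this example. A single lease write is injected at a chosen point in the merge timeline, and the function reports how far off the RHS's contribution to the merged `SysBytes` ends up.

```go
package main

// replica is a hypothetical toy model of the RHS replica.
type replica struct {
	engineLocalBytes int64 // range ID-local bytes as stored in the engine
	memSysBytes      int64 // in-memory SysBytes, which includes the local bytes
}

// leaseWrite models a latch-ignoring lease request: it writes to the
// range ID-local keyspace and bumps the in-memory stats to match.
func (r *replica) leaseWrite(n int64) {
	r.engineLocalBytes += n
	r.memSysBytes += n
}

// Points in the timeline at which the toy lease write can land.
const (
	BeforeSnapshot   = iota // before subsume acquires its engine snapshot
	AfterSnapshot           // after the snapshot, before the stats fetch
	AfterStatsRead          // after subsume fetched the in-memory stats
	AfterReplication        // after subsume applied, before the merge trigger
)

// mergedSysBytesError returns the error in the RHS's contribution to
// the merged SysBytes when a 5-byte lease write lands at writeAt.
// atSubsume selects where the range ID-local stats are computed:
// from the subsume snapshot (true) or in the merge trigger (false).
func mergedSysBytesError(writeAt int, atSubsume bool) int64 {
	r := replica{engineLocalBytes: 10, memSysBytes: 60}
	maybeWrite := func(step int) {
		if step == writeAt {
			r.leaseWrite(5)
		}
	}

	maybeWrite(BeforeSnapshot)
	snapshotLocal := r.engineLocalBytes // computed from subsume's snapshot
	maybeWrite(AfterSnapshot)
	shipped := r.memSysBytes // in-memory stats fetched by subsume
	maybeWrite(AfterStatsRead)
	maybeWrite(AfterReplication)
	triggerLocal := r.engineLocalBytes // computed in the merge trigger

	local := triggerLocal
	if atSubsume {
		local = snapshotLocal
	}
	// The non-local system bytes (60 - 10) never change in this model,
	// so any deviation from them is a stats discrepancy.
	return (shipped - local) - (60 - 10)
}
```

In this model, computing the local stats in the merge trigger goes wrong for a write at `AfterStatsRead` or `AfterReplication`, a window that spans replication and the merge commit. Computing them during subsume goes wrong only for a write at `AfterSnapshot`, which is the much shorter window described above.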

craig · 2023-03-27T13:50:10Z

Build succeeded:

Bazel Essential CI (Cockroach)

knz · 2023-04-02T11:55:07Z

should this be backported to 23.1?

erikgrinaker · 2023-04-02T12:05:07Z

No, the risk/reward isn't justifiable. We did #99244 instead.

erikgrinaker requested a review from a team March 20, 2023 12:03

erikgrinaker self-assigned this Mar 20, 2023

erikgrinaker mentioned this pull request Mar 20, 2023

kvserver: 10-byte discrepancy in SysBytes #93896

Open

erikgrinaker force-pushed the merge-sysbytes-race branch from 5bdc1e1 to 2d855d3 Compare March 20, 2023 20:15

erikgrinaker changed the title ~~kvserver: fix SysBytes MVCC stats race during merges~~ kvserver: reduce SysBytes MVCC stats race during merges Mar 20, 2023

erikgrinaker marked this pull request as ready for review March 20, 2023 20:18

erikgrinaker requested a review from a team as a code owner March 20, 2023 20:18

erikgrinaker requested a review from pav-kv March 22, 2023 09:13

pav-kv approved these changes Mar 27, 2023

craig bot merged commit 3c3d2a5 into cockroachdb:master Mar 27, 2023

erikgrinaker mentioned this pull request Apr 3, 2023

roachtest: clearrange/checks=true/rangeTs=true failed #100431

Closed

erikgrinaker mentioned this pull request Apr 18, 2023

kv/kvnemesis: TestKVNemesisMultiNode failed #101721

Closed

erikgrinaker mentioned this pull request May 23, 2023

kvserver/batcheval: fix mvcc stats computaion on split #103719

Merged
