-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
kvnemesis: add non-blocking transactions #59062
Labels
A-testing
Testing tools and infrastructure
C-enhancement
Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
GA-blocker
Comments
craig bot
pushed a commit
that referenced
this issue
Apr 8, 2021
62580: kv: deflake multi-node kvnemesis r=nvanbenschoten a=nvanbenschoten Fixes #61322. Prep to address #59062. #### kv: don't consider Subsume error as assertion failures These were added in b15e3dd, but that was incorrect. It is valid for a merge that races with another merge or split to hit these case, as the Subsume request is non-transactional. Also, allow `RHS range bounds do not match` errors in kvnemesis. This is a valid error for the reasons mentioned above. #### kv: mark error from Replica.GetSnapshot with errMarkSnapshotError Without it, we see the following errors in kvnemesis: ``` Wraps: (2) error applying x.AdminChangeReplicas(ctx, /Table/50/"244956da", [{ADD_VOTER n2,s2}]) // [n1,s1,r36/1:/Table/{40-50/"ba86b…}]: failed to generate LEARNER_INITIAL snapshot: couldn't find range descriptor ``` #### kv: allow two more errors with TransferLeaseOperation This deflakes TestKVNemesisMultiNode. The test had recently gotten flakier, which I bisected back to 3fe1992. It appears that the changes in that commit made it more likely to hit this `replica not found in RangeDescriptor`, which is returned after a TransferLease request has acquired latches, in addition to the existing `unable to find store \d+ in range` error, which is returned before a TransferLease request has acquired latches. Co-authored-by: Nathan VanBenschoten <[email protected]>
craig bot
pushed a commit
that referenced
this issue
Apr 27, 2021
63747: kvnemesis: toggle global_reads attribute in zone configs r=nvanbenschoten a=nvanbenschoten Closes #59062. This commit adds a new `ChangeZoneOperation` with the option to toggle the `global_reads` attribute to KV nemesis. This testing ensures that global read ranges are stressed with a variety of different conditions, including range splits, range merges, rebalancing, lease transfers, rangefeeds, and more. So far, there has been one bug revealed by this testing where a range merge watcher incorrectly considered a merge to have succeeded due to a "stale read" when the merge actually failed. This will be fixed in a separate PR that I'll merge before this one. In addition, I reverted a few key changes to demonstrate that this testing would have revealed other issues had they not been addressed proactively. ```go // 1. if in TransferLease, reads above new lease start time are ignored priorReadSum = rspb.FromTimestamp(newLease.Start.ToTimestamp()) // then after 507 runs over 1m11s committed txn non-atomic timestamps: [w]/Table/50/"0d0e019b":1617159805.036363000,2?->v-11 [r]/Table/50/"22ebd28f":[<min>, <max>)-><nil> [s]/Table/50/"{3059eb81"-b379f625"}:{0:[1617159804.712970000,1, <max>), 1:[1617159804.712970000,1, <max>), 2:[1617159804.712970000,1, <max>), gap:[<min>, 1617159804.748590000,0)}->[/Table/50/"380e4125":v-10, /Table/50/"3812daf0":v-9, /Table/50/"3c11a3d5":v-7] [r]/Table/50/"68767751":[<min>, <max>)-><nil> [w]/Table/50/"d05c9e5f":1617159805.036363000,2?->v-12 // 2. if in Subsume, reads above freeze time are ignored priorReadSum = rspb.FromTimestamp(reply.FreezeStart.ToTimestamp()) // then after many runs it would have hit something similar // 3. if in (*txnwait.Queue).forcePushAbort we did not forward the PushRequest // timestamp to req.PushTo (i.e. if we reverted 8ba492c) // b.Header.Timestamp.Forward(req.PushTo) // then after 29 runs over 8s error applying x.ScanForUpdate(ctx, /Table/50/"9e80b889", /Table/50/"e2985e21", 0) // (nil, request timestamp 1618333326.405544000,0 less than PushTo timestamp 1618333326.631766000,3?): request timestamp 1618333326.405544000,0 less than PushTo timestamp 1618333326.631766000,3? ``` Co-authored-by: Nathan VanBenschoten <[email protected]>
nvanbenschoten
added a commit
to nvanbenschoten/cockroach
that referenced
this issue
May 20, 2021
Closes cockroachdb#59062. This commit adds a new `ChangeZoneOperation` with the option to toggle the `global_reads` attribute to KV nemesis. This testing ensures that global read ranges are stressed with a variety of different conditions, including range splits, range merges, rebalancing, lease transfers, rangefeeds, and more. So far, there has been one bug revealed by this testing where a range merge watcher incorrectly considered a merge to have succeeded due to a "stale read" when the merge actually failed. This will be fixed in the following commit. In addition, I reverted a few key changes to demonstrate that this testing would have revealed other issues had they not been addressed proactively. ```go // 1. if in TransferLease, reads above new lease start time are ignored priorReadSum = rspb.FromTimestamp(newLease.Start.ToTimestamp()) // then after 507 runs over 1m11s committed txn non-atomic timestamps: [w]/Table/50/"0d0e019b":1617159805.036363000,2?->v-11 [r]/Table/50/"22ebd28f":[<min>, <max>)-><nil> [s]/Table/50/"{3059eb81"-b379f625"}:{0:[1617159804.712970000,1, <max>), 1:[1617159804.712970000,1, <max>), 2:[1617159804.712970000,1, <max>), gap:[<min>, 1617159804.748590000,0)}->[/Table/50/"380e4125":v-10, /Table/50/"3812daf0":v-9, /Table/50/"3c11a3d5":v-7] [r]/Table/50/"68767751":[<min>, <max>)-><nil> [w]/Table/50/"d05c9e5f":1617159805.036363000,2?->v-12 // 2. if in Subsume, reads above freeze time are ignored priorReadSum = rspb.FromTimestamp(reply.FreezeStart.ToTimestamp()) // then after many runs it would have hit something similar // 3. if in (*txnwait.Queue).forcePushAbort we did not forward the PushRequest // timestamp to req.PushTo (i.e. if we reverted 8ba492c) // b.Header.Timestamp.Forward(req.PushTo) // then after 29 runs over 8s error applying x.ScanForUpdate(ctx, /Table/50/"9e80b889", /Table/50/"e2985e21", 0) // (nil, request timestamp 1618333326.405544000,0 less than PushTo timestamp 1618333326.631766000,3?): request timestamp 1618333326.405544000,0 less than PushTo timestamp 1618333326.631766000,3? ```
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
A-testing
Testing tools and infrastructure
C-enhancement
Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
GA-blocker
Once they exist.
The text was updated successfully, but these errors were encountered: