kv: spurious ambiguous errors can be returned #129427
Comments
Hi @andrewbaptist, please add branch-* labels to identify which branch(es) this C-bug affects. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf. |
There are a few different failure modes. I'll put them in different comments.
Failure mode 1:
There are a few ways to handle this, but the fundamental problem is the “contract” between … The current contract between these methods is too weak. Specifically, in the case of both a … Some options:
Failure mode 2:
The rest of the analysis is basically the same.
Previously the test could return an ambiguous error if there was a combination of a `RangeMismatchError` and an `AmbiguousError`. The request would be successful if the range info was updated and the request retried, but we don't do that today. Informs: cockroachdb#129427 Epic: none Release note: None
This test can fail with an ambiguous error if there was a combination of a `RangeMismatchError` and an `AmbiguousError` from different nodes. This error combination would be successful if the range descriptor is updated and the request retried. We incorrectly don't retry this today. Informs: cockroachdb#129427 Epic: none Release note: None
129829: roachtest: tolerate errors on gracefuldraining test r=kvoli a=andrewbaptist This test can fail with an ambiguous error if there was a combination of a `RangeMismatchError` and an `AmbiguousError` from different nodes. This error combination would be successful if the range descriptor is updated and the request retried. We incorrectly don't retry this today. Informs: #129427 Epic: none Release note: None Co-authored-by: Andrew Baptist <[email protected]>
Describe the problem
The test `kv/gracefuldraining` fails without disabling the max gossip frequency. Customers expect to not hit ambiguous errors during clean drains and shutdowns. This is a somewhat expected and rare error and we expect customers to retry, but it is still a regression in behavior.
To Reproduce
Remove the `--tolerate-errors` line from the test. It will fail ~50% of the time.
Expected behavior
Customers expect to not hit ambiguous errors during clean drains and shutdowns. This is a somewhat expected and rare error and we expect customers to retry, but it is still a regression in behavior.
Jira issue: CRDB-41538
Epic CRDB-39956