kv/kvserver: TestRequestsOnLaggingReplica failed #57932
kv/kvserver.TestRequestsOnLaggingReplica failed with artifacts on master @ e179c8efcb1cf78ed940c3a0290d84fd1a5590e9:
Reproduce
To reproduce, try:
make stressrace TESTS=TestRequestsOnLaggingReplica PKG=./pkg/kv/kvserver TESTTIMEOUT=5m STRESSFLAGS='-timeout 5m' 2>&1
Parameters in this failure:
@andreimatei any thoughts on this recent test failure?
I'll try to take a look.
I produced the following in 30min or
In some rare cases, the status of a lease in relation to a request/timestamp can't be determined. For the request's client this results in a NotLeaseholderError. This patch improves the message of that error.

In particular, this test failure[1] seems to show that a node couldn't verify that an existing lease is expired because its liveness gossiped info was stale. This sounds interesting and if the test fails again this improved message should help.

[1] cockroachdb#57932 (comment)

Release note: None
64080: kvserver: improve error message for rare lease errors r=andreimatei a=andreimatei

In some rare cases, the status of a lease in relation to a request/timestamp can't be determined. For the request's client this results in a NotLeaseholderError. This patch improves the message of that error.

In particular, this test failure[1] seems to show that a node couldn't verify that an existing lease is expired because its liveness gossiped info was stale. This sounds interesting and if the test fails again this improved message should help.

[1] #57932 (comment)

Release note: None

64239: kv: bump kv.range_merge.queue_interval to 5s r=nvanbenschoten a=nvanbenschoten

Informs #62700.

This commit bumps the default value for the `kv.range_merge.queue_interval` cluster setting from 1s to 5s. This setting serves as a per-store rate limit on the frequency at which range merges will be initiated. We've seen in a few issues, including #62700, that excessive range merge traffic can cause instability in a cluster. There's very little reason to be aggressive about range merging, as range merges are rarely needed with any urgency. However, there are good reasons to be conservative about them.

This change can also be justified as a (late) reaction to the increased max range size from 64MB to 512MB. A range merge may need to rebalance replicas in a range, so its cost can be a function of the sizes of ranges. This means that if range merges are now more expensive, we should be running them less frequently.

64260: roachpb: remove EndTxn's DeprecatedCanCommitAtHigherTimestamp field r=nvanbenschoten a=nvanbenschoten

The field was replaced with the more general BatchRequest.CanForwardReadTimestamp in 972915d. This commit completes the migration to remove the old flag.

We attempted this before, in 93d5eb9, but had to back that out in 4189938 because we still needed compatibility with v20.1 nodes at the time. That is no longer the case.
Co-authored-by: Andrei Matei
Co-authored-by: Nathan VanBenschoten
kv/kvserver.TestRequestsOnLaggingReplica failed with artifacts on master @ cfc525fe19e848f324baaf9937ce12a4d029fec2:
Reproduce
To reproduce, try:
make stressrace TESTS=TestRequestsOnLaggingReplica PKG=./pkg/kv/kvserver TESTTIMEOUT=5m STRESSFLAGS='-timeout 5m' 2>&1
Parameters in this failure:
Did 3000
Closing as stale
(kv/kvserver).TestRequestsOnLaggingReplica failed on master@93615a6b16071749c8d412b29978f99e57944081:
Jira issue: CRDB-3459