kvserver,rac2: turn off raftReceiveQueue.maxLen enforcement in apply_to_all mode #136969
Conversation
🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf. It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR?
I'll add tests if the approach looks ok.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @kvoli and @pav-kv)
Force-pushed from 061e7c9 to 9654320
The approach looks ok.
Previously: "The approach looks ok."
Likewise from me.
Reviewed 3 of 3 files at r1, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @sumeerbhola)
Force-pushed from 9654320 to da7f553
TFTRs!
Tests are ready.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @kvoli and @pav-kv)
Force-pushed from da7f553 to 49fbd0e
Reviewed 4 of 4 files at r2, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @pav-kv and @sumeerbhola)
pkg/kv/kvserver/store_raft_test.go
line 233 at r2 (raw file):
defer leaktest.AfterTest(t)()
skip.UnderStress(t, "slow test")
nit: duress will also skip under stress:
cockroach/pkg/testutils/skip/skip.go, line 185 at 46e8e0f:
return util.RaceEnabled || Stress() || syncutil.DeadlockEnabled
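In other words, something like the following sketch (assuming skip.UnderDuress takes the same (t, reason) arguments as skip.UnderStress):

// Sketch of the suggested change: UnderDuress skips under race, stress,
// and deadlock builds, so it subsumes the UnderStress call here.
defer leaktest.AfterTest(t)()
skip.UnderDuress(t, "slow test")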
pkg/kv/kvserver/store_raft_test.go
line 284 at r2 (raw file):
checkingMu.Unlock()
enforceMaxLen = !enforceMaxLen
time.Sleep(time.Millisecond)
What's the sleep for?
Force-pushed from 49fbd0e to 69f06ee
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @kvoli and @pav-kv)
pkg/kv/kvserver/store_raft_test.go
line 233 at r2 (raw file):
Previously, kvoli (Austen) wrote…
nit: duress will also skip under stress:
cockroach/pkg/testutils/skip/skip.go, line 185 at 46e8e0f:
return util.RaceEnabled || Stress() || syncutil.DeadlockEnabled
Done
pkg/kv/kvserver/store_raft_test.go
line 284 at r2 (raw file):
Previously, kvoli (Austen) wrote…
What's the sleep for?
I was trying various ways to make this fail by introducing bugs in the real code. Found that doing a sleep here gives more opportunity to various goroutines to enter their read critical section and call LoadOrCreate, which then ends up being concurrent with the call to SetEnforceMaxLen.
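For illustration, a minimal sketch of that pattern (numWorkers, iters, doneCh, qs, rangeID, and maxLen are hypothetical stand-ins, not the exact test code):

// Hypothetical sketch: worker goroutines spin on LoadOrCreate while the
// main goroutine toggles enforcement; the sleep widens the window in which
// LoadOrCreate runs concurrently with SetEnforceMaxLen.
doneCh := make(chan struct{})
var wg sync.WaitGroup
for i := 0; i < numWorkers; i++ {
	wg.Add(1)
	go func() {
		defer wg.Done()
		for {
			select {
			case <-doneCh:
				return
			default:
			}
			qs.LoadOrCreate(rangeID, maxLen) // read critical section
		}
	}()
}
enforceMaxLen := false
for i := 0; i < iters; i++ {
	enforceMaxLen = !enforceMaxLen
	qs.SetEnforceMaxLen(enforceMaxLen)
	time.Sleep(time.Millisecond) // let the workers interleave with the toggle
}
close(doneCh)
wg.Wait()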
Reviewed 1 of 1 files at r3, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @pav-kv)
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @pav-kv and @sumeerbhola)
pkg/kv/kvserver/store_raft_test.go
line 284 at r2 (raw file):
Previously, sumeerbhola wrote…
I was trying various ways to make this fail by introducing bugs in the real code. Found that doing a sleep here gives more opportunity to various goroutines to enter their read critical section and call LoadOrCreate, which then ends up being concurrent with the call to SetEnforceMaxLen.
Ack, that makes sense then. I actually thought it might have been for the opposite reason.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @kvoli and @sumeerbhola)
pkg/kv/kvserver/store_raft_test.go
line 284 at r2 (raw file):
Previously, kvoli (Austen) wrote…
Ack, that makes sense then. I actually thought it might have been for the opposite reason.
Add a comment to this effect?
pkg/kv/kvserver/store_raft.go
line 158 at r3 (raw file):
q.acc.Init(context.Background(), qs.mon)
q, loaded = qs.m.LoadOrStore(rangeID, q)
if !loaded {
style nit: don't indent large blocks
if loaded {
	return q, true
}
// The sampling of ...
for {
	...
}
return q, false
pkg/kv/kvserver/store_raft_test.go
line 255 at r3 (raw file):
rng, _ := randutil.NewTestRand()
defer wg.Done()
for {
Add a limit to how much this can spin?
kvserver,rac2: turn off raftReceiveQueue.maxLen enforcement in apply_to_all mode

The existing maxLen enforcement is already dubious:

- Length does not equal bytes, so offers limited protection from OOMs.
- The limit is per replica and not an aggregate.
- We run a cooperative system, and historically the sender has respected RaftConfig.RaftMaxInflightBytes, which is a byte limit.

The only reason for additional protection on the receiver is when there are rapid repeated leader changes for a large number of ranges for which the receiver has replicas. Even in this case, the behavior is surprising since the receive queue overflows even though the sender has done nothing wrong -- and it is very unlikely that this overflow is actually protecting against an OOM.

With RACv2 in apply_to_all mode, the senders have a 16MiB regular token pool that applies to a whole (store, tenant) pair. This is stricter than the per range defaultRaftMaxInflightBytes (32MiB), both in value, and because it is an aggregate limit. The aforementioned "only reason" is even more unnecessary. So we remove this in the case of apply_to_all.

An alternative would be to have replicaSendStream respect the receiver limit in apply_to_all mode. The complexity there is that replicaSendStream grabs a bunch of tokens equal to the byte size of the send-queue, and expects to send all those messages. To respect a count limit, it will need to quickly return tokens it can't use (since they are a shared resource), which adds complexity to the already complex token management logic.

Fixes cockroachdb#135851
Epic: none
Release note: None
Force-pushed from 69f06ee to ae3299b
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @kvoli and @pav-kv)
pkg/kv/kvserver/store_raft.go
line 158 at r3 (raw file):
Previously, pav-kv (Pavel Kalinnikov) wrote…
style nit: don't indent large blocks
if loaded {
	return q, true
}
// The sampling of ...
for {
	...
}
return q, false
Done
pkg/kv/kvserver/store_raft_test.go
line 284 at r2 (raw file):
Previously, pav-kv (Pavel Kalinnikov) wrote…
Add a comment to this effect?
Done
pkg/kv/kvserver/store_raft_test.go
line 255 at r3 (raw file):
Previously, pav-kv (Pavel Kalinnikov) wrote…
Add a limit to how much this can spin?
We want these goroutines to keep working while the code below is changing enforceMaxLen. There shouldn't be two places that decide when to stop, since then there would be some time interval where the test is ineffective but still running.
I've added the following comment to make it clear that this loop will stop.
// Loop until the doneCh is closed.
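For concreteness, the loop shape that comment describes (a sketch with a hypothetical doneCh, not the exact test code):

for {
	select {
	case <-doneCh:
		return // the main goroutine alone decides when the workers stop
	default:
	}
	// ... keep exercising LoadOrCreate ...
}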
bors r=kvoli,pav-kv