kvserver: handle range-level rebalancing of non-voting replicas #61239
Conversation
Apologies for taking so damn long to get this out.
Force-pushed from 4bd8c05 to 4b1c5ca
Nice! It's exciting to see the tests you're adding alongside the logic improvements which demonstrate the growing intelligence of the allocator and replicateQueue.
Reviewed 9 of 9 files at r2.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @aayushshah15)
pkg/kv/kvserver/allocator.go, line 576 at r2 (raw file):
```go
}

func getReplicaSetForDiversityCalc(
```
Let's give this function a comment.
pkg/kv/kvserver/allocator.go, line 606 at r2 (raw file):
```go
	panic(fmt.Sprintf("unsupported targetReplicaType: %v", t))
}
// Unreachable.
```
nit: you may not even need this. Go can detect when a switch statement is exhaustive in some cases. I think it's able to detect it for `panic`s, but not for `log.Fatal`s.
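A minimal, self-contained sketch of the pattern being discussed (the type and constants are redefined here for illustration; in the PR they live in `pkg/kv/kvserver`):

```go
package main

import "fmt"

type targetReplicaType int

const (
	voterTarget targetReplicaType = iota
	nonVoterTarget
)

// Because every case of the switch terminates (returns or panics), the
// compiler accepts the function without a trailing return, so no
// "// Unreachable." marker is needed. A log.Fatal in the default case
// would not count as a terminating statement.
func describe(t targetReplicaType) string {
	switch t {
	case voterTarget:
		return "voter"
	case nonVoterTarget:
		return "non-voter"
	default:
		panic(fmt.Sprintf("unsupported targetReplicaType: %v", t))
	}
}

func main() {
	fmt.Println(describe(nonVoterTarget))
}
```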
pkg/kv/kvserver/allocator.go, line 689 at r2 (raw file):
```go
var constraintsChecker constraintsCheckFn
existingReplicaSet := getReplicaSetForDiversityCalc(targetType, existingVoters, existingReplicas)
```
nit: move this down, right above `candidates :=`, since it's no longer computed in the switch block.
pkg/kv/kvserver/allocator.go, line 706 at r2 (raw file):
```go
	candidateStores,
	constraintsChecker,
	// We'll consider the targets that have a non-voter as feasible
```
This comment checks out, but I don't understand its relation to this code. Would we be passing in a different replica set if this was not true? Does this comment belong somewhere else? For instance, above the call to `getReplicaSetForDiversityCalc`?
pkg/kv/kvserver/allocator.go, line 749 at r2 (raw file):
```go
case voterTarget:
	a.storePool.updateLocalStoreAfterRebalance(targetStore, rangeUsageInfo, roachpb.ADD_VOTER)
	defer func() {
```
nit here and below: can this just be `defer a.storePool.updateLocalStoreAfterRebalance(targetStore, rangeUsageInfo, roachpb.REMOVE_VOTER)`? We're not capturing any state that will change between the `defer` registration and the `defer` execution, right?
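A runnable sketch of the Go semantics this nit relies on: arguments to a deferred call are evaluated at the point the `defer` statement executes, not when the deferred function runs, so the closure buys nothing unless the arguments change afterwards.

```go
package main

import "fmt"

func main() {
	x := 1
	// x is evaluated now, so this deferred call prints 1 even though x
	// changes before the function returns.
	defer fmt.Println("deferred (args evaluated at registration):", x)
	// A closure, by contrast, reads x at execution time and prints 2.
	defer func() { fmt.Println("closure (reads x at execution):", x) }()
	x = 2
}
```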
pkg/kv/kvserver/allocator.go, line 769 at r2 (raw file):
```go
}
// Unreachable.
```
Similar comment. Let's panic in the default case and then get rid of this. The less code, the less chance we got something wrong and accidentally fall through into here.
pkg/kv/kvserver/allocator.go, line 1095 at r2 (raw file):
```go
// replicas. This is in contrast to how we compute the diversity scores for
// voting replicas, which are computed relative to just the set of voting
// replicas.
```
Non-voters also only adhere to the overall constraints and not the voter-constraints, right?
pkg/kv/kvserver/allocator_scorer.go, line 642 at r2 (raw file):
```go
}
if !cand.less(existing) {
	// If `cand` is not worse than `existing`, add it to the list.
```
This code looks familiar.
pkg/kv/kvserver/allocator_test.go, line 3251 at r2 (raw file):
```go
// 1. Non-voter rebalancing obeys the allocator's capacity based heuristics.
// 2. Non-voter rebalancing tries to ensure constraints conformance.
func TestAllocatorRebalanceNonVoters(t *testing.T) {
```
Nice test!
pkg/kv/kvserver/allocator_test.go, line 3279 at r2 (raw file):
name: "diversity among non-voters", stores: multiDiversityDCStores, zone: zonepb.EmptyCompleteZoneConfig(),
I'm confused why non-voters aren't being removed if we're using an empty zone config that does not specify num_voters. Is this test case going to continue working after your future changes?
pkg/kv/kvserver/allocator_test.go, line 3301 at r2 (raw file):
```go
existingNonVoters: replicas(1),
// NB: Store 1 has a 97.5% full disk.
stores: oneStoreWithFullDisk,
```
nit: keep these fields in order in each test case.
pkg/kv/kvserver/client_replica_test.go, line 2329 at r2 (raw file):
```go
// Ensure that these swaps do not work without atomic replication changes
// enabled.
_, err = tc.ServerConn(0).Exec(`SET CLUSTER SETTING kv.atomic_replication_changes.enabled = false`)
```
Do we need this anymore? Now that you've removed the setting?
pkg/kv/kvserver/replicate_queue.go, line 520 at r2 (raw file):
```go
}
rq.metrics.AddReplicaCount.Inc(1)
ops := roachpb.MakeReplicationChanges(roachpb.ADD_VOTER, newReplica)
```
We can't use `roachpb.MakeReplicationChanges` anymore?
pkg/kv/kvserver/replicate_queue.go, line 534 at r2 (raw file):
```go
rq.metrics.AddReplicaCount.Inc(1)
var ops []roachpb.ReplicationChange
```
Give this block of code a comment. Something about determining whether the allocator is instructing us to promote an existing non_voter or to add an entirely new replica.
pkg/kv/kvserver/replicate_queue.go, line 557 at r2 (raw file):
```go
log.VEventf(ctx, 1, "replacing replica %s with %+v: %s",
	removeReplica, newReplica, rangeRaftProgress(repl.RaftStatus(), existingVoters))
// NB: We may have performed a promotion of a non-voter above, but we will
```
Could you be a bit more explicit in this comment? Maybe just "but we will not perform a demotion here and will instead remove the voter entirely" to make sure readers are on the same page about what the code is doing.
pkg/kv/kvserver/replicate_queue.go, line 926 at r2 (raw file):
```go
	storeFilterThrottled)
if !ok {
	log.VEventf(ctx, 1, "no suitable rebalance target")
```
Why were we falling through and continuing to call `changeReplicas` in these error cases? I'm surprised we weren't just returning `false, nil`.
pkg/kv/kvserver/replicate_queue.go, line 1103 at r2 (raw file):
```go
	chgs = append(promo, demo...)
} else if found {
	log.Errorf(ctx, "programming error: store being rebalanced to(%s) already has a"+
```
Should we return an AssertionFailed error here?
pkg/kv/kvserver/replicate_queue.go, line 1123 at r2 (raw file):
```go
	{ChangeType: roachpb.ADD_NON_VOTER, Target: addTarget},
	{ChangeType: roachpb.REMOVE_NON_VOTER, Target: removeTarget},
}
```
Will there ever be a case where we demote a voter to a non-voter without a corresponding promotion?
pkg/kv/kvserver/replicate_queue_test.go, line 365 at r2 (raw file):
```go
func TestReplicateQueueSwapVotersWithNonVoters(t *testing.T) {
	defer leaktest.AfterTest(t)()
	skip.UnderRace(t)
```
I'm a little concerned that we're skipping under race by default these days. The race detector can be very helpful and has uncovered numerous bugs! We should skip only as a last resort. How long does this test take under race? Maybe we can drop the `numIterations` down under race instead.
pkg/kv/kvserver/replicate_queue_test.go, line 376 at r2 (raw file):
```go
Tiers: []roachpb.Tier{
	{
		Key:   "rack",
		Value: fmt.Sprintf("%d", i),
```
nit: You can use `strconv.Itoa` here. Also, should we shift this up using `for i := 1; i <= 5; i++` so that we don't need the `store-1` shift down below?
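A tiny self-contained sketch of the suggested shape (a stand-in `Tier` struct replaces `roachpb.Tier` here purely for illustration):

```go
package main

import (
	"fmt"
	"strconv"
)

// Tier is a stand-in for roachpb.Tier, used only for illustration.
type Tier struct {
	Key, Value string
}

func main() {
	// 1-based store indices mean the rack tier for store i is simply
	// strconv.Itoa(i), and no store-1 adjustment is needed later.
	for i := 1; i <= 5; i++ {
		fmt.Println(Tier{Key: "rack", Value: strconv.Itoa(i)})
	}
}
```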
pkg/kv/kvserver/replicate_queue_test.go, line 417 at r2 (raw file):
```go
scratchKey := tc.ScratchRange(t)
scratchRange := tc.LookupRangeOrFatal(t, scratchKey)
```
Doesn't this need to go inside of the `SucceedsSoon`?
pkg/kv/kvserver/replicate_queue_test.go, line 422 at r2 (raw file):
```go
	return err
}
if voters := scratchRange.Replicas().VoterDescriptors(); len(voters) != 5 {
```
Why would we expect 5x replication without setting zone configs? Does scratchRange end up being a meta range?
pkg/kv/kvserver/replicate_queue_test.go, line 430 at r2 (raw file):
```go
checkRelocated := func(t *testing.T, voterStores, nonVoterStores []roachpb.StoreID) {
	testutils.SucceedsSoon(t, func() error {
		scratchRange := tc.LookupRangeOrFatal(t, scratchKey)
```
Should this come after the `forceScanOnAllReplicationQueues`?
pkg/kv/kvserver/replicate_queue_test.go, line 478 at r2 (raw file):
go through non-voter promotions and voter demotions.
"go through atomic non-voter promotions and voter demotions."
Give yourself some credit here 😄
pkg/kv/kvserver/store_pool.go, line 386 at r2 (raw file):
```go
	return
}
switch changeType {
```
Not your code, but this is kind of strange. Do we just rely on all other changeTypes being no-ops? Do you think we should make that explicit then with `default: return`?
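A hedged, self-contained sketch of the explicit form being suggested (the change-type constants and case bodies are stand-ins, not the actual store-pool accounting):

```go
package main

import "fmt"

type ReplicaChangeType int

const (
	AddVoter ReplicaChangeType = iota
	RemoveVoter
	AddNonVoter
	RemoveNonVoter
	OtherChange
)

func updateAfterRebalance(changeType ReplicaChangeType) {
	switch changeType {
	case AddVoter, AddNonVoter:
		fmt.Println("adjust capacity estimates for an added replica")
	case RemoveVoter, RemoveNonVoter:
		fmt.Println("adjust capacity estimates for a removed replica")
	default:
		// All other change types are deliberate no-ops; an explicit
		// return documents that instead of silently falling through.
		return
	}
}

func main() {
	updateAfterRebalance(OtherChange)
}
```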
pkg/roachpb/metadata_replicas.go, line 548 at r2 (raw file):
```go
}

// ReplicationChangesForPromotion returns the replication changes that
```
Should these live right below `MakeReplicationChanges`?
Force-pushed from be26a0e to bdd7f8f
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @aayushshah15 and @nvanbenschoten)
pkg/kv/kvserver/allocator.go, line 576 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Let's give this function a comment.
Done.
pkg/kv/kvserver/allocator.go, line 606 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
nit: you may not even need this. Go can detect when a switch statement is exhaustive in some cases. I think it's able to detect it for `panic`s, but not for `log.Fatal`s.
Done.
pkg/kv/kvserver/allocator.go, line 689 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
nit: move this down, right above `candidates :=`, since it's no longer computed in the switch block.
Done.
pkg/kv/kvserver/allocator.go, line 706 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
This comment checks out, but I don't understand its relation to this code. Would we be passing in a different replica set if this was not true? Does this comment belong somewhere else? For instance, above the call to `getReplicaSetForDiversityCalc`?
Moved it above the call to `getReplicasForDiversityCalc`.
pkg/kv/kvserver/allocator.go, line 749 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
nit here and below: can this just be `defer a.storePool.updateLocalStoreAfterRebalance(targetStore, rangeUsageInfo, roachpb.REMOVE_VOTER)`? We're not capturing any state that will change between the `defer` registration and the `defer` execution, right?
Done.
pkg/kv/kvserver/allocator.go, line 769 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Similar comment. Let's panic in the default case and then get rid of this. The less code, the less chance we got something wrong and accidentally fall through into here.
Done.
pkg/kv/kvserver/allocator.go, line 1095 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Non-voters also only adhere to the overall constraints and not the voter-constraints, right?
Done.
pkg/kv/kvserver/allocator_scorer.go, line 642 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
This code looks familiar.
🥘
pkg/kv/kvserver/allocator_test.go, line 3279 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
I'm confused why non-voters aren't being removed if we're using an empty zone config that does not specify num_voters. Is this test case going to continue working after your future changes?
The decision of whether a replica needs to be added/removed/rebalanced will be made by `Allocator.ComputeAction()`. In the scenarios presented by these tests, we wouldn't expect `ComputeAction` to emit a rebalance action, but `Rebalance{Non}Voter` is only concerned about the `Constraints` and `VoterConstraints` from the zone config.
pkg/kv/kvserver/client_replica_test.go, line 2329 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Do we need this anymore? Now that you've removed the setting?
I hadn't rebased. Fixed now.
pkg/kv/kvserver/replicate_queue.go, line 520 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
We can't use `roachpb.MakeReplicationChanges` anymore?
Fixed.
pkg/kv/kvserver/replicate_queue.go, line 534 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Give this block of code a comment. Something about determining whether the allocator is instructing us to promote an existing non_voter or to add an entirely new replica.
Done, lmk if it's lacking.
pkg/kv/kvserver/replicate_queue.go, line 557 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Could you be a bit more explicit in this comment? Maybe just "but we will not perform a demotion here and will instead remove the voter entirely" to make sure readers are on the same page about what the code is doing.
Done.
pkg/kv/kvserver/replicate_queue.go, line 926 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Why were we falling through and continuing to call `changeReplicas` in these error cases? I'm surprised we weren't just returning `false, nil`.
I confusedly stared at the code for 15 mins before realizing that we're not actually falling through to call `changeReplicas` unless `ok == true`. Notice that that call is in the `else` block. It's also not clear to me why we want to fall through to the call to `shedLease` below, but I'm hesitant to mess with this stuff. What do you think?
pkg/kv/kvserver/replicate_queue.go, line 1103 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Should we return an AssertionFailed error here?
not sure what I was thinking there. Fixed.
pkg/kv/kvserver/replicate_queue.go, line 1123 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Will there ever be a case where we demote a voter to a non-voter without a corresponding promotion?
In the current implementation, nope. For us to do a demotion without a corresponding promotion, we'd have to know that we want to add a non-voter to a store that should lose a voter. Currently, the replicateQueue will just remove the voter first and then try to add a new non-voter in its next pass through. Of course, there's no reason this situation can't be improved. If we can determine that we're 1 non-voter short and that the removal target is a valid candidate for a non-voter, we can just do the demotion. I added a `TODO` for this.
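For readers following along, a self-contained sketch of how such a swap is expressed as one atomic change set, using stand-in types modeled on `roachpb.ReplicationChange` (the names and shapes here are illustrative assumptions, not the actual API):

```go
package main

import "fmt"

// Stand-ins modeled on roachpb's replication-change types; illustrative only.
type ChangeType string

const (
	AddVoter       ChangeType = "ADD_VOTER"
	RemoveVoter    ChangeType = "REMOVE_VOTER"
	AddNonVoter    ChangeType = "ADD_NON_VOTER"
	RemoveNonVoter ChangeType = "REMOVE_NON_VOTER"
)

type ReplicationChange struct {
	ChangeType ChangeType
	Target     int // stand-in for a replication target (node/store pair)
}

func main() {
	promotionTarget, demotionTarget := 1, 2
	// Each leg of the swap pairs an add with a remove on the same target:
	// the non-voter on target 1 becomes a voter, the voter on target 2
	// becomes a non-voter, all in one atomic change set.
	chgs := []ReplicationChange{
		{ChangeType: AddVoter, Target: promotionTarget},
		{ChangeType: RemoveNonVoter, Target: promotionTarget},
		{ChangeType: AddNonVoter, Target: demotionTarget},
		{ChangeType: RemoveVoter, Target: demotionTarget},
	}
	fmt.Println(chgs)
}
```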
pkg/kv/kvserver/replicate_queue_test.go, line 365 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
I'm a little concerned that we're skipping under race by default these days. The race detector can be very helpful and has uncovered numerous bugs! We should skip only as a last resort. How long does this test take under race? Maybe we can drop the `numIterations` down under race instead.
The frustrating thing about enabling these under race is that they often seem to be too slow on our overloaded CI machines (and often fail because these test servers cannot sustain node liveness), even though they're running with low parallelism. Could we possibly organize a separate pool of TC agents to run our `kvserver` tests under the race detector?

I made the change you suggested, let's see how it goes. If this doesn't fly, I'll try with `skip.UnderStressRace`.
pkg/kv/kvserver/replicate_queue_test.go, line 376 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
nit: You can use `strconv.Itoa` here. Also, should we shift this up using `for i := 1; i <= 5; i++` so that we don't need the `store-1` shift down below?
Done.
pkg/kv/kvserver/replicate_queue_test.go, line 417 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Doesn't this need to go inside of the `SucceedsSoon`?
Yes, I was running into flakes without it. Fixed.
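A sketch of the shape of the fix, assuming the surrounding test context (`tc`, `scratchKey`) from this file; the key point is that the descriptor lookup moves inside the retry closure:

```go
testutils.SucceedsSoon(t, func() error {
	// Re-resolve the range descriptor on every attempt; a copy captured
	// outside the closure would never reflect the rebalances we await.
	scratchRange := tc.LookupRangeOrFatal(t, scratchKey)
	if voters := scratchRange.Replicas().VoterDescriptors(); len(voters) != 3 {
		return errors.Errorf("expected 3 voters, found %d", len(voters))
	}
	return nil
})
```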
pkg/kv/kvserver/replicate_queue_test.go, line 422 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Why would we expect 5x replication without setting zone configs? Does scratchRange end up being a meta range?
Yeah, I had mistakenly removed a line of code before pushing it out; the test fails as of the revision you reviewed. See now.

I changed it such that we now start with 1 voter and 4 non-voters and then move to 3 voters and 2 non-voters. This is so that we're also tickling the swapping behavior on the allocation side.
pkg/kv/kvserver/replicate_queue_test.go, line 430 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Should this come after the `forceScanOnAllReplicationQueues`?
Yes, done.
pkg/kv/kvserver/replicate_queue_test.go, line 478 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
go through non-voter promotions and voter demotions.
"go through atomic non-voter promotions and voter demotions."
Give yourself some credit here 😄
Done.
pkg/kv/kvserver/store_pool.go, line 386 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Not your code, but this is kind of strange. Do we just rely on all other changeTypes being no-ops? Do you think we should make that explicit then with `default: return`?
Done.
pkg/roachpb/metadata_replicas.go, line 548 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Should these live right below `MakeReplicationChanges`?
Done.
Reviewed 10 of 10 files at r3.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @aayushshah15)
pkg/kv/kvserver/allocator.go, line 799 at r3 (raw file):
```go
options := a.scorerOptions()
replicaSetForDiversityCalc := getReplicasForDiversityCalc(targetType, existingVoters, existingReplicas)
```
nit: move this down above `rankedCandidateListForRemoval` like you did up above.
pkg/kv/kvserver/replicate_queue.go, line 926 at r2 (raw file):
Previously, aayushshah15 (Aayush Shah) wrote…
I confusedly stared at the code for 15 mins before realizing that we're not actually falling through to call `changeReplicas` unless `ok == true`. Notice that that call is in the `else` block. It's also not clear to me why we want to fall through to the call to `shedLease` below, but I'm hesitant to mess with this stuff. What do you think?
Oh, I see what you're saying. I read it wrong. Feel free to remove the TODO.
Force-pushed from bdd7f8f to f42f340
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @aayushshah15 and @nvanbenschoten)
pkg/kv/kvserver/allocator.go, line 799 at r3 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
nit: move this down above `rankedCandidateListForRemoval` like you did up above.
Done, here and elsewhere.
pkg/kv/kvserver/allocator_test.go, line 3301 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
nit: keep these fields in order in each test case.
Done.
Force-pushed from f42f340 to b594e44
This commit adds the necessary machinery within the allocator and the plumbing within the `replicateQueue` to be able to rebalance non-voting replicas. A few things to note are:

1. Voting replicas are allowed to rebalance to nodes that already have a non-voting replica. This will trigger what is, essentially, a metadata operation: flipping the replica type of the corresponding non-voter to a voter and the type of the voter to a non_voter. Notably, this PR changes voter allocation code to also be able to do this.
2. Computation of diversity scores works slightly differently for voting replicas and non-voting replicas. Non-voting replicas compute candidate diversity scores based off of all existing replicas, whereas voting replicas compute candidate diversity scores off of just the set of voting replicas.

We don't yet support removal/replacement of dead and decommissioning non-voters; that is coming in a follow-up PR.

Release justification: fix major limitation of non-voting replicas

Release note: None
Force-pushed from b594e44 to a75a287
Thanks for the review!

bors r+
Build succeeded:
kvserver: update `AdminRelocateRange` to leverage explicit swaps of voters to non-voters

This commit updates `AdminRelocateRange` to use explicit atomic swaps of voting replicas with non-voting replicas, which cockroachdb#58627 initially added support for. The patch does so by generalizing behavior that's already exercised by the `replicateQueue` when it decides to rebalance replicas. See cockroachdb#61239.

This allows us, in the next commit, to remove bespoke relocation logic that's used by the `mergeQueue` to align replica sets for the sake of a range merge.

Release note: None
62631: kvserver: get rid of bespoke relocation logic used by the mergeQueue r=aayushshah15 a=aayushshah15

This PR contains two main commits:

**kvserver: update `AdminRelocateRange` to leverage explicit swaps of voters to non-voters**

This commit updates `AdminRelocateRange` to use explicit atomic swaps of voting replicas with non-voting replicas, which #58627 initially added support for. The patch does so by generalizing behavior that's already exercised by the `replicateQueue` when it decides to rebalance replicas. See #61239.

This allows us, in the next commit, to remove bespoke relocation logic that's used by the `mergeQueue` to align replica sets for the sake of a range merge.

Release note: None

**kvserver: get rid of bespoke relocation logic used by the mergeQueue**

This commit removes the relocation logic used by the `mergeQueue` thus far to align replica sets (added in #56197). This logic previously existed in order to allow us to align the replica sets of a pair of ranges (which is required for the range merge to proceed), while avoiding redundant data movement.

Before #58627 and the previous commit in this PR, `AdminRelocateRange` couldn't be directly used by the mergeQueue under various configurations of the LHS and RHS ranges. Furthermore, even when it could be used, it would involve redundant data movement. This all required us to compute relocation targets for the RHS of a merge separately, above the call to `AdminRelocateRange`, for the range merge to proceed.

All these limitations have been resolved by the previous commit, which teaches `AdminRelocateRange` to promote non-voters and demote voters when needed, and the aforementioned bespoke relocation logic is no longer needed.

Resolves #62370

Release note: None

Co-authored-by: aayush <[email protected]>
Co-authored-by: Aayush Shah <[email protected]>
kvserver: fix bug that could lead to invalid non-voter rebalance attempts

A bug was introduced in cockroachdb#61239 which made it such that non-voters could, in certain edge-case scenarios, attempt to rebalance to stores that already had a voting replica for the same range. However, this bug wouldn't actually materialize into user-observable symptoms because such rebalance decisions were rejected before the `replicateQueue` ever tried to execute them. See: https://github.com/cockroachdb/cockroach/blob/8bcd038ab6b76f6bf5894f2ffbbdbf10108b4c32/pkg/kv/kvserver/replicate_queue.go#L1257-L1271.

This commit cleans this up and adds a test demonstrating one such edge case.

Release note: None
73657: ci: fix `teamcity-trigger` when running under `bazel` r=jlinder a=rickystewart

`gotool.ImportPaths` can't find all the go packages when you query the typical way in the Bazel build container for whatever reason, but I've validated that `./...` works as a workaround.

Closes #73643.

Release note: None

73670: kvserver: fix bug that could lead to invalid non-voter rebalance attempts r=aayushshah15 a=aayushshah15

A bug was introduced in #61239 which made it such that non-voters could, in certain edge-case scenarios, attempt to rebalance to stores that already had a voting replica for the same range. However, this bug wouldn't actually materialize into user-observable symptoms because such rebalance decisions were rejected before the `replicateQueue` ever tried to execute them. See: https://github.com/cockroachdb/cockroach/blob/8bcd038ab6b76f6bf5894f2ffbbdbf10108b4c32/pkg/kv/kvserver/replicate_queue.go#L1257-L1271.

This commit cleans this up and adds a test demonstrating one such edge case.

Release note: None

73689: bazel: introduce a new `nonogo` config, update `dev` script r=jlinder a=rickystewart

The introduction of `nogo` actually makes building the full `dev` binary rather heavyweight, which vastly increases the surface area of possible build failures. Add a new lightweight `nonogo` config and reference it in an error message so that people can debug their way around the issue.

Close #73656.

Release note: None

Co-authored-by: Ricky Stewart <[email protected]>
Co-authored-by: Aayush Shah <[email protected]>
First commit from #60029
This PR adds the necessary machinery within the allocator and the
plumbing within the replicateQueue to be able to rebalance non-voting
replicas.
A few things to note are:

1. Voting replicas are allowed to rebalance to nodes that already have a non-voting replica. This will trigger what is, essentially, a metadata operation: flipping the replica type of the corresponding non-voter to a voter and the type of the voter to a non_voter. Notably, this PR changes voter allocation code to also be able to do this.
2. Computation of diversity scores works slightly differently for voting replicas and non-voting replicas. Non-voting replicas compute candidate diversity scores based off of all existing replicas, whereas voting replicas compute candidate diversity scores off of just the set of voting replicas. (A minimal sketch of this distinction follows this list.)
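A minimal sketch of that distinction, modeled on the `getReplicasForDiversityCalc` helper discussed in the review above (the signature is assumed here for illustration):

```go
// getReplicasForDiversityCalc returns the replica set that candidate
// diversity scores are computed against: just the voters when placing a
// voter, and the full replica set when placing a non-voter.
func getReplicasForDiversityCalc(
	targetType targetReplicaType, existingVoters, allExistingReplicas []roachpb.ReplicaDescriptor,
) []roachpb.ReplicaDescriptor {
	switch targetType {
	case voterTarget:
		return existingVoters
	case nonVoterTarget:
		return allExistingReplicas
	default:
		panic(fmt.Sprintf("unsupported targetReplicaType: %v", targetType))
	}
}
```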
This PR does not yet add support in the store rebalancer to make rebalancing
decisions for non-voting replicas and we don't yet support removal/
replacement of dead and decommissioning non-voters. These things are
coming in follow-up PRs.
Release justification: necessary for non-voting replicas
Release note: None