
kvserver: handle range-level rebalancing of non-voting replicas #61239

Merged

Conversation

aayushshah15
Contributor

First commit from #60029

This PR adds the necessary machinery within the allocator and the
plumbing within the replicateQueue to be able to rebalance non-voting
replicas.

A few things to note are:

Voting replicas are allowed to rebalance to nodes that already have a
non-voting replica. This triggers what is essentially a metadata
operation: the replica type of the corresponding non-voter is flipped to a
voter and the type of the outgoing voter is flipped to a non-voter (a sketch
of such a swap follows these two notes). Notably, this PR also changes the
voter allocation code to be able to do this.

Computation of diversity scores works slightly differently for voting and
non-voting replicas. Non-voting replicas compute candidate diversity
scores against all existing replicas, whereas voting replicas compute
candidate diversity scores against just the set of voting replicas.
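
For readers unfamiliar with how such a swap is expressed, here is a minimal, illustrative sketch (not the code in this PR; the helper name and targets are made up) of a voter/non-voter swap written as a single set of `roachpb.ReplicationChange`s, using the change types discussed later in this review. Issuing the four changes together lets them be applied as one atomic replication change.

```go
package main

import (
	"fmt"

	"github.com/cockroachdb/cockroach/pkg/roachpb"
)

// swapVoterAndNonVoter is a hypothetical helper: it builds the promotion and
// demotion that turn the non-voter on promoTarget into a voter and the voter
// on demoTarget into a non-voter. Issued together, these are applied as a
// single atomic (joint-consensus) replication change.
func swapVoterAndNonVoter(promoTarget, demoTarget roachpb.ReplicationTarget) []roachpb.ReplicationChange {
	return []roachpb.ReplicationChange{
		// Promote the existing non-voter on promoTarget to a voter.
		{ChangeType: roachpb.ADD_VOTER, Target: promoTarget},
		{ChangeType: roachpb.REMOVE_NON_VOTER, Target: promoTarget},
		// Demote the existing voter on demoTarget to a non-voter.
		{ChangeType: roachpb.ADD_NON_VOTER, Target: demoTarget},
		{ChangeType: roachpb.REMOVE_VOTER, Target: demoTarget},
	}
}

func main() {
	chgs := swapVoterAndNonVoter(
		roachpb.ReplicationTarget{NodeID: 2, StoreID: 2},
		roachpb.ReplicationTarget{NodeID: 1, StoreID: 1},
	)
	fmt.Printf("%+v\n", chgs)
}
```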

This PR does not yet teach the store rebalancer to make rebalancing
decisions for non-voting replicas, nor does it support removal/replacement
of dead and decommissioning non-voters. Both are coming in follow-up PRs.

Release justification: necessary for non-voting replicas

Release note: None

@cockroach-teamcity
Member

This change is Reviewable

@aayushshah15
Contributor Author

Apologies for taking so damn long to get this out.

@aayushshah15 aayushshah15 force-pushed the allocator-rebalance-swap-non-voters branch 5 times, most recently from 4bd8c05 to 4b1c5ca Compare February 28, 2021 23:32
Member

@nvanbenschoten nvanbenschoten left a comment


Nice! It's exciting to see the tests you're adding alongside the logic improvements which demonstrate the growing intelligence of the allocator and replicateQueue.

Reviewed 9 of 9 files at r2.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @aayushshah15)


pkg/kv/kvserver/allocator.go, line 576 at r2 (raw file):

}

func getReplicaSetForDiversityCalc(

Let's give this function a comment.


pkg/kv/kvserver/allocator.go, line 606 at r2 (raw file):

		panic(fmt.Sprintf("unsupported targetReplicaType: %v", t))
	}
	// Unreachable.

nit: you may not even need this. Go can detect when a switch statement is exhaustive in some cases. I think it's able to detect it for panics, but not for log.Fatals.
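
As a side note for readers, here is a standalone illustration of the Go behavior being referenced (the type and constant names are made up to mirror the allocator code): a `switch` whose cases all end in `return` or `panic` and which has a `default` case is itself a terminating statement, so no trailing return or `// Unreachable` marker is needed; `log.Fatal` is an ordinary function call, so the compiler still demands one.

```go
package main

import "log"

type targetType int

const (
	voterTarget targetType = iota
	nonVoterTarget
)

// withPanic compiles without a trailing return: every case ends in a
// terminating statement (return or panic) and a default case is present,
// so the switch itself is terminating.
func withPanic(t targetType) string {
	switch t {
	case voterTarget:
		return "voter"
	case nonVoterTarget:
		return "non-voter"
	default:
		panic("unsupported targetType")
	}
}

// withLogFatal does not compile without the final return ("missing return"):
// log.Fatal terminates the process at runtime, but the compiler does not
// treat the call as a terminating statement.
func withLogFatal(t targetType) string {
	switch t {
	case voterTarget:
		return "voter"
	case nonVoterTarget:
		return "non-voter"
	default:
		log.Fatal("unsupported targetType")
	}
	return "" // unreachable at runtime, but required by the compiler
}

func main() {
	_ = withPanic(voterTarget)
	_ = withLogFatal(nonVoterTarget)
}
```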


pkg/kv/kvserver/allocator.go, line 689 at r2 (raw file):

	var constraintsChecker constraintsCheckFn
	existingReplicaSet := getReplicaSetForDiversityCalc(targetType, existingVoters, existingReplicas)

nit: move this down, right above candidates := , since it's no longer computed in the switch block.


pkg/kv/kvserver/allocator.go, line 706 at r2 (raw file):

		candidateStores,
		constraintsChecker,
		// We'll consider the targets that have a non-voter as feasible

This comment checks out, but I don't understand its relation to this code. Would we be passing in a different replica set if this was not true? Does this comment belong somewhere else? For instance, above the call to getReplicaSetForDiversityCalc?


pkg/kv/kvserver/allocator.go, line 749 at r2 (raw file):

	case voterTarget:
		a.storePool.updateLocalStoreAfterRebalance(targetStore, rangeUsageInfo, roachpb.ADD_VOTER)
		defer func() {

nit here and below: can this just be defer a.storePool.updateLocalStoreAfterRebalance(targetStore, rangeUsageInfo, roachpb.REMOVE_VOTER)? We're not capturing any state that will change between the defer registration and the defer execution, right?
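
For context, a tiny self-contained example of the Go rule behind this nit (nothing CockroachDB-specific): arguments to a deferred call are evaluated when the `defer` statement runs, not when the deferred function executes, so wrapping the call in a closure only matters if some captured value changes in between.

```go
package main

import "fmt"

func main() {
	x := 1

	// Arguments are evaluated now: this prints "direct defer: 1".
	defer fmt.Println("direct defer:", x)

	// The closure reads x when it runs: this prints "closure defer: 2".
	defer func() {
		fmt.Println("closure defer:", x)
	}()

	x = 2
}
```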


pkg/kv/kvserver/allocator.go, line 769 at r2 (raw file):

	}

	// Unreachable.

Similar comment. Let's panic in the default case and then get rid of this. The less code, the less chance we got something wrong and accidentally fall through into here.


pkg/kv/kvserver/allocator.go, line 1095 at r2 (raw file):

// replicas. This is in contrast to how we compute the diversity scores for
// voting replicas, which are computed relative to just the set of voting
// replicas.

Non-voters also only adhere to the overall constraints and not the voter-constraints, right?


pkg/kv/kvserver/allocator_scorer.go, line 642 at r2 (raw file):

			}
			if !cand.less(existing) {
				// If `cand` is not worse than `existing`, add it to the list.

This code looks familiar.


pkg/kv/kvserver/allocator_test.go, line 3251 at r2 (raw file):

// 1. Non-voter rebalancing obeys the allocator's capacity based heuristics.
// 2. Non-voter rebalancing tries to ensure constraints conformance.
func TestAllocatorRebalanceNonVoters(t *testing.T) {

Nice test!


pkg/kv/kvserver/allocator_test.go, line 3279 at r2 (raw file):

			name:                  "diversity among non-voters",
			stores:                multiDiversityDCStores,
			zone:                  zonepb.EmptyCompleteZoneConfig(),

I'm confused why non-voters aren't being removed if we're using an empty zone config that does not specify num_voters. Is this test case going to continue working after your future changes?


pkg/kv/kvserver/allocator_test.go, line 3301 at r2 (raw file):

			existingNonVoters: replicas(1),
			// NB: Store 1 has a 97.5% full disk.
			stores:                oneStoreWithFullDisk,

nit: keep these fields in order in each test case.


pkg/kv/kvserver/client_replica_test.go, line 2329 at r2 (raw file):

	// Ensure that these swaps do not work without atomic replication changes
	// enabled.
	_, err = tc.ServerConn(0).Exec(`SET CLUSTER SETTING kv.atomic_replication_changes.enabled = false`)

Do we need this anymore? Now that you've removed the setting?


pkg/kv/kvserver/replicate_queue.go, line 520 at r2 (raw file):

	}
	rq.metrics.AddReplicaCount.Inc(1)
	ops := roachpb.MakeReplicationChanges(roachpb.ADD_VOTER, newReplica)

We can't use roachpb.MakeReplicationChanges anymore?


pkg/kv/kvserver/replicate_queue.go, line 534 at r2 (raw file):

	rq.metrics.AddReplicaCount.Inc(1)

	var ops []roachpb.ReplicationChange

Give this block of code a comment. Something about determining whether the allocator is instructing us to promote an existing non-voter or to add an entirely new replica.
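
To make the suggested comment concrete, here is a hedged sketch of the decision being described (not the PR's code; `buildAddVoterChanges` and its arguments are illustrative): if the allocator's target store already holds a non-voter, the voter is added by promoting that non-voter; otherwise a brand-new voting replica is added via `roachpb.MakeReplicationChanges`.

```go
package main

import (
	"fmt"

	"github.com/cockroachdb/cockroach/pkg/roachpb"
)

// buildAddVoterChanges is a hypothetical stand-in for the replicateQueue
// logic under discussion: it decides whether adding a voter on newTarget
// means promoting an existing non-voter there or adding a new replica.
func buildAddVoterChanges(
	newTarget roachpb.ReplicationTarget, existingNonVoters []roachpb.ReplicaDescriptor,
) []roachpb.ReplicationChange {
	for _, nv := range existingNonVoters {
		if nv.StoreID == newTarget.StoreID {
			// The target store already has a non-voter: promote it by adding
			// the voter and removing the non-voter on the same store, applied
			// as one atomic change.
			return []roachpb.ReplicationChange{
				{ChangeType: roachpb.ADD_VOTER, Target: newTarget},
				{ChangeType: roachpb.REMOVE_NON_VOTER, Target: newTarget},
			}
		}
	}
	// No non-voter on the target store: add an entirely new voting replica.
	return roachpb.MakeReplicationChanges(roachpb.ADD_VOTER, newTarget)
}

func main() {
	target := roachpb.ReplicationTarget{NodeID: 3, StoreID: 3}
	nonVoters := []roachpb.ReplicaDescriptor{{NodeID: 3, StoreID: 3, ReplicaID: 7}}
	fmt.Printf("%+v\n", buildAddVoterChanges(target, nonVoters))
}
```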


pkg/kv/kvserver/replicate_queue.go, line 557 at r2 (raw file):

		log.VEventf(ctx, 1, "replacing replica %s with %+v: %s",
			removeReplica, newReplica, rangeRaftProgress(repl.RaftStatus(), existingVoters))
		// NB: We may have performed a promotion of a non-voter above, but we will

Could you be a bit more explicit in this comment? Maybe just "but we will not perform a demotion here and will instead remove the voter entirely" to make sure readers are on the same page about what the code is doing.


pkg/kv/kvserver/replicate_queue.go, line 926 at r2 (raw file):

			storeFilterThrottled)
		if !ok {
			log.VEventf(ctx, 1, "no suitable rebalance target")

Why were we falling through and continuing to call changeReplicas in these error cases? I'm surprised we weren't just returning false, nil.


pkg/kv/kvserver/replicate_queue.go, line 1103 at r2 (raw file):

			chgs = append(promo, demo...)
		} else if found {
			log.Errorf(ctx, "programming error: store being rebalanced to(%s) already has a"+

Should we return an AssertionFailed error here?


pkg/kv/kvserver/replicate_queue.go, line 1123 at r2 (raw file):

			{ChangeType: roachpb.ADD_NON_VOTER, Target: addTarget},
			{ChangeType: roachpb.REMOVE_NON_VOTER, Target: removeTarget},
		}

Will there ever be a case where we demote a voter to a non-voter without a corresponding promotion?


pkg/kv/kvserver/replicate_queue_test.go, line 365 at r2 (raw file):

func TestReplicateQueueSwapVotersWithNonVoters(t *testing.T) {
	defer leaktest.AfterTest(t)()
	skip.UnderRace(t)

I'm a little concerned that we're skipping under race by default these days. The race detector can be very helpful and has uncovered numerous bugs! We should skip only as a last resort. How long does this test take under race?

Maybe we can drop the numIterations down under race instead.
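
A possible shape for that suggestion, assuming the `util.RaceEnabled` constant exposed by the repo's `pkg/util` package (the iteration counts here are made up):

```go
package main

import (
	"fmt"

	"github.com/cockroachdb/cockroach/pkg/util"
)

// numIterations returns how many rebalance iterations a test might drive:
// far fewer under the race detector, where everything runs much slower,
// instead of skipping the test outright.
func numIterations() int {
	if util.RaceEnabled {
		return 5
	}
	return 100
}

func main() {
	fmt.Println("iterations:", numIterations())
}
```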


pkg/kv/kvserver/replicate_queue_test.go, line 376 at r2 (raw file):

				Tiers: []roachpb.Tier{
					{
						Key: "rack", Value: fmt.Sprintf("%d", i),

nit: You can use strconv.Itoa here. Also, should we shift this up using for i := 1; i <= 5; i++ so that we don't need the store-1 shift down below?
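
For illustration, the nit applied to a made-up slice of locality tiers (field names follow `roachpb.Tier`; the rack count is arbitrary):

```go
package main

import (
	"fmt"
	"strconv"

	"github.com/cockroachdb/cockroach/pkg/roachpb"
)

func main() {
	// Number racks 1..5 directly so later lookups don't need a store-1 shift.
	var tiers []roachpb.Tier
	for i := 1; i <= 5; i++ {
		tiers = append(tiers, roachpb.Tier{Key: "rack", Value: strconv.Itoa(i)})
	}
	fmt.Println(tiers)
}
```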


pkg/kv/kvserver/replicate_queue_test.go, line 417 at r2 (raw file):

	scratchKey := tc.ScratchRange(t)
	scratchRange := tc.LookupRangeOrFatal(t, scratchKey)

Doesn't this need to go inside of the SucceedsSoon?
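
A sketch of what moving the lookup inside the retry loop could look like (hedged: the helper and `wantVoters` are illustrative, while `testutils.SucceedsSoon` and `LookupRangeOrFatal` are the helpers this test already uses):

```go
package kvserver_test

import (
	"testing"

	"github.com/cockroachdb/cockroach/pkg/roachpb"
	"github.com/cockroachdb/cockroach/pkg/testutils"
	"github.com/cockroachdb/cockroach/pkg/testutils/testcluster"
	"github.com/cockroachdb/errors"
)

// waitForVoterCount re-fetches the range descriptor on every retry so the
// check observes the latest replica set, rather than a copy captured once
// outside the SucceedsSoon loop.
func waitForVoterCount(
	t *testing.T, tc *testcluster.TestCluster, scratchKey roachpb.Key, wantVoters int,
) {
	testutils.SucceedsSoon(t, func() error {
		desc := tc.LookupRangeOrFatal(t, scratchKey)
		if voters := desc.Replicas().VoterDescriptors(); len(voters) != wantVoters {
			return errors.Errorf("expected %d voters, found %d", wantVoters, len(voters))
		}
		return nil
	})
}
```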


pkg/kv/kvserver/replicate_queue_test.go, line 422 at r2 (raw file):

			return err
		}
		if voters := scratchRange.Replicas().VoterDescriptors(); len(voters) != 5 {

Why would we expect 5x replication without setting zone configs? Does scratchRange end up being a meta range?


pkg/kv/kvserver/replicate_queue_test.go, line 430 at r2 (raw file):

	checkRelocated := func(t *testing.T, voterStores, nonVoterStores []roachpb.StoreID) {
		testutils.SucceedsSoon(t, func() error {
			scratchRange := tc.LookupRangeOrFatal(t, scratchKey)

Should this come after the forceScanOnAllReplicationQueues?


pkg/kv/kvserver/replicate_queue_test.go, line 478 at r2 (raw file):

go through non-voter promotions and voter demotions.

"go through atomic non-voter promotions and voter demotions."

Give yourself some credit here 😄


pkg/kv/kvserver/store_pool.go, line 386 at r2 (raw file):

		return
	}
	switch changeType {

Not your code, but this is kind of strange. Do we just rely on all other changeType's being no-ops? Do you think we should make that explicit then and default: return?


pkg/roachpb/metadata_replicas.go, line 548 at r2 (raw file):

}

// ReplicationChangesForPromotion returns the replication changes that

Should these live right below MakeReplicationChanges?

@aayushshah15 aayushshah15 force-pushed the allocator-rebalance-swap-non-voters branch 3 times, most recently from be26a0e to bdd7f8f Compare March 3, 2021 17:58
Contributor Author

@aayushshah15 aayushshah15 left a comment


Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @aayushshah15 and @nvanbenschoten)


pkg/kv/kvserver/allocator.go, line 576 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Let's give this function a comment.

Done.


pkg/kv/kvserver/allocator.go, line 606 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

nit: you may not even need this. Go can detect when a switch statement is exhaustive in some cases. I think it's able to detect it for panics, but not for log.Fatals.

Done.


pkg/kv/kvserver/allocator.go, line 689 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

nit: move this down, right above candidates := , since it's no longer computed in the switch block.

Done.


pkg/kv/kvserver/allocator.go, line 706 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

This comment checks out, but I don't understand its relation to this code. Would we be passing in a different replica set if this was not true? Does this comment belong somewhere else? For instance, above the call to getReplicaSetForDiversityCalc?

Moved it above the call to getReplicasForDiversityCalc.


pkg/kv/kvserver/allocator.go, line 749 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

nit here and below: can this just be defer a.storePool.updateLocalStoreAfterRebalance(targetStore, rangeUsageInfo, roachpb.REMOVE_VOTER)? We're not capturing any state that will change between the defer registration and the defer execution, right?

Done.


pkg/kv/kvserver/allocator.go, line 769 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Similar comment. Let's panic in the default case and then get rid of this. The less code, the less chance we got something wrong and accidentally fall through into here.

Done.


pkg/kv/kvserver/allocator.go, line 1095 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Non-voters also only adhere to the overall constraints and not the voter-constraints, right?

Done.


pkg/kv/kvserver/allocator_scorer.go, line 642 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

This code looks familiar.

🥘


pkg/kv/kvserver/allocator_test.go, line 3279 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

I'm confused why non-voters aren't being removed if we're using an empty zone config that does not specify num_voters. Is this test case going to continue working after your future changes?

The decision of whether a replica needs to be added/removed/rebalanced is made by Allocator.ComputeAction(). In the scenarios presented by these tests, we wouldn't expect ComputeAction to emit a rebalance action, but Rebalance{Non}Voter is only concerned with the Constraints and VoterConstraints from the zone config.


pkg/kv/kvserver/client_replica_test.go, line 2329 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Do we need this anymore? Now that you've removed the setting?

I hadn't rebased. Fixed now.


pkg/kv/kvserver/replicate_queue.go, line 520 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

We cant use roachpb.MakeReplicationChanges anymore?

Fixed.


pkg/kv/kvserver/replicate_queue.go, line 534 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Give this block of code a comment. Something about determining whether the allocator is instructing us to promote an existing non_voter or to add an entirely new replica.

Done, lmk if it's lacking.


pkg/kv/kvserver/replicate_queue.go, line 557 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Could you be a bit more explicit in this comment? Maybe just "but we will not perform a demotion here and will instead remove the voter entirely" to make sure readers are on the same page about what the code is doing.

Done.


pkg/kv/kvserver/replicate_queue.go, line 926 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Why were we falling through and continuing to call changeReplicas in these error cases? I'm surprised we weren't just returning false, nil.

I confusedly stared at the code for 15 minutes before realizing that we're not actually falling through to call changeReplicas unless ok == true. Notice that that call is in the else block. It's also not clear to me why we want to fall through to the call to shedLease below, but I'm hesitant to mess with this stuff. What do you think?


pkg/kv/kvserver/replicate_queue.go, line 1103 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Should we return an AssertionFailed error here?

not sure what I was thinking there. Fixed.


pkg/kv/kvserver/replicate_queue.go, line 1123 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Will there ever be a case where we demote a voter to a non-voter without a corresponding promotion?

In the current implementation, nope. For us to do a demotion without a corresponding promotion, we'd have to know that we want to add a non-voter to a store that should lose a voter. Currently, the replicateQueue will just remove the voter first and then try to add a new non-voter in its next pass. Of course, there's no reason this situation can't be improved. If we can determine that we're 1 non-voter short and that the removal target is a valid candidate for a non-voter, we can just do the demotion. I added a TODO for this.


pkg/kv/kvserver/replicate_queue_test.go, line 365 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

I'm a little concerned that we're skipping under race by default these days. The race detected can be very helpful and has uncovered numerous bugs! We should skip only as a last resort. How long does this test take under race?

Maybe we can drop the numIterations down under race instead.

The frustrating thing about enabling these under race is that they often seem to be too slow on our overloaded CI machines (and often fail because these test servers cannot sustain node liveness), even though they're running with low parallelism. Could we possibly organize a separate pool of TC agents to run our kvserver tests under the race detector?

I made the change you suggested, let's see how it goes. If this doesn't fly, I'll try with skip.UnderStressRace.


pkg/kv/kvserver/replicate_queue_test.go, line 376 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

nit: You can use strconv.Itoa here. Also, should we shift this up using for i := 1; i <= 5; i++ so that we don't need to store-1 shift down below?

Done.


pkg/kv/kvserver/replicate_queue_test.go, line 417 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Doesn't this need to go inside of the SucceedsSoon?

Yes, I was running into flakes without it. Fixed.


pkg/kv/kvserver/replicate_queue_test.go, line 422 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Why would we expect 5x replication without setting zone configs? Does scratchRange end up being a meta range?

Yeah, I had mistakenly removed a line of code before pushing it out, so the test fails as of the revision you reviewed. See now.

I changed it such that we now start with 1 voter and 4 non-voters and then move to 3 voters and 2 non-voters. This is so that we're also tickling the swapping behavior on the allocation side.


pkg/kv/kvserver/replicate_queue_test.go, line 430 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Should this come after the forceScanOnAllReplicationQueues?

Yes, done.


pkg/kv/kvserver/replicate_queue_test.go, line 478 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
go through non-voter promotions and voter demotions.

"go through atomic non-voter promotions and voter demotions."

Give yourself some credit here 😄

Done.


pkg/kv/kvserver/store_pool.go, line 386 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Not your code, but this is kind of strange. Do we just rely on all other changeType's being no-ops? Do you think we should make that explicit then and default: return?

Done.


pkg/roachpb/metadata_replicas.go, line 548 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Should these live right below MakeReplicationChanges?

Done.

Member

@nvanbenschoten nvanbenschoten left a comment


:lgtm:

Reviewed 10 of 10 files at r3.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @aayushshah15)


pkg/kv/kvserver/allocator.go, line 799 at r3 (raw file):

	options := a.scorerOptions()

	replicaSetForDiversityCalc := getReplicasForDiversityCalc(targetType, existingVoters, existingReplicas)

nit: move this down above rankedCandidateListForRemoval like you did up above.


pkg/kv/kvserver/replicate_queue.go, line 926 at r2 (raw file):

Previously, aayushshah15 (Aayush Shah) wrote…

I confusedly stared at the code for 15 mins before realizing that we're not actually falling through to call changeReplicas unless ok == true. Notice that that call is in the else block. It's also not clear to me why we want to fall through to the call to shedLease below, but I'm hesitating to mess with this stuff. What do you think?

Oh, I see what you're saying. I read it wrong. Feel free to remove the TODO.

@aayushshah15 aayushshah15 force-pushed the allocator-rebalance-swap-non-voters branch from bdd7f8f to f42f340 Compare March 3, 2021 21:56
Contributor Author

@aayushshah15 aayushshah15 left a comment


Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @aayushshah15 and @nvanbenschoten)


pkg/kv/kvserver/allocator.go, line 799 at r3 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

nit: move this down above rankedCandidateListForRemoval like you did up above.

Done, here and elsewhere.


pkg/kv/kvserver/allocator_test.go, line 3301 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

nit: keep these fields in order in each test case.

Done.

@aayushshah15 aayushshah15 force-pushed the allocator-rebalance-swap-non-voters branch from f42f340 to b594e44 Compare March 3, 2021 22:28
This commit adds the necessary machinery within the allocator and the
plumbing within the `replicateQueue` to be able to rebalance non-voting
replicas.

A few things to note are:
1. Voting replicas are allowed to rebalance to nodes that already have a
non-voting replica. This triggers what is essentially a metadata
operation: flipping the replica type of the corresponding non-voter to a
voter and the type of the voter to a non-voter. Notably, this commit changes
the voter allocation code to also be able to do this.

2. Computation of diversity scores works slightly differently for voting
replicas and non-voting replicas. Non-voting replicas compute candidate
diversity scores based off of all existing replicas, whereas voting
replicas compute candidate diversity scores off of just the set of
voting replicas.

We don't yet support removal/replacement of dead and decommissioning
non-voters; that is coming in a follow-up PR.

Release justification: fix major limitation of non-voting replicas

Release note: None
@aayushshah15 aayushshah15 force-pushed the allocator-rebalance-swap-non-voters branch from b594e44 to a75a287 Compare March 3, 2021 23:33
@aayushshah15
Contributor Author

Thanks for the review!

bors r+

@craig
Contributor

craig bot commented Mar 4, 2021

Build succeeded:

@craig craig bot merged commit 37cdf01 into cockroachdb:master Mar 4, 2021
aayushshah15 added a commit to aayushshah15/cockroach that referenced this pull request Mar 26, 2021
kvserver: update `AdminRelocateRange` to leverage explicit swaps of voters to non-voters

This commit updates `AdminRelocateRange` to use explicit atomic swaps of
voting replicas with non-voting replicas, that cockroachdb#58627 initially added
support for. The patch does so by generalizing behavior that's already
exercised by the `replicateQueue` when it decides to rebalance replicas.
See cockroachdb#61239.

This allows us, in the next commit, to remove bespoke relocation logic
that's used by the `mergeQueue` to align replica sets for the sake of a
range merge.

Release note: None
craig bot pushed a commit that referenced this pull request Mar 29, 2021
62631: kvserver: get rid of bespoke relocation logic used by the mergeQueue r=aayushshah15 a=aayushshah15

This PR contains two main commits:

**kvserver: update `AdminRelocateRange` to leverage explicit swaps of
voters to non-voters**

This commit updates `AdminRelocateRange` to use explicit atomic swaps of
voting replicas with non-voting replicas, that #58627 initially added
support for. The patch does so by generalizing behavior that's already
exercised by the `replicateQueue` when it decides to rebalance replicas.
See #61239.

This allows us, in the next commit, to remove bespoke relocation logic
that's used by the `mergeQueue` to align replica sets for the sake of a
range merge.

Release note: None

**kvserver: get rid of bespoke relocation logic used by the mergeQueue**

This commit removes the relocation logic used by the `mergeQueue` thus
far to align replica sets (added in #56197). This logic previously
existed in order to allow us to align the replica sets of a pair of
ranges (which is required for the range merge to proceed), while
avoiding redundant data movement.

Before #58627 and the previous commit in this PR, `AdminRelocateRange`
couldn't be directly used by the mergeQueue under various configurations
of the LHS and RHS ranges. Furthermore, even when it could be used, it
would involve redundant data movement. This all required us to compute
relocation targets for the RHS of a merge separately, above the call to
`AdminRelocateRange`, for the range merge to proceed.

All these limitations have been resolved by the previous commit which
teaches `AdminRelocateRange` to promote non-voters and demote voters
when needed, and the aforementioned bespoke relocation logic is no
longer needed.

Resolves #62370

Release note: None


Co-authored-by: aayush <[email protected]>
Co-authored-by: Aayush Shah <[email protected]>
aayushshah15 added a commit to aayushshah15/cockroach that referenced this pull request Dec 10, 2021
kvserver: fix bug that could lead to invalid non-voter rebalance attempts

A bug was introduced in cockroachdb#61239
which made it such that non-voters could, in certain edge case scenarios,
attempt to rebalance to stores that already had a voting replica for the same
range. However, this bug wouldn't actually materialize into user observable
symptoms because such rebalance decisions were rejected before the
`replicateQueue` ever tried to execute them. See:
https://github.com/cockroachdb/cockroach/blob/8bcd038ab6b76f6bf5894f2ffbbdbf10108b4c32/pkg/kv/kvserver/replicate_queue.go#L1257-L1271.
This commit cleans this up and adds a test demonstrating one such edge case.

Release note: None
craig bot pushed a commit that referenced this pull request Dec 10, 2021
73657: ci: fix `teamcity-trigger` when running under `bazel` r=jlinder a=rickystewart

`gotool.ImportPaths` can't find all the go packages when you query the
typical way in the Bazel build container for whatever reason, but I've
validated that `./...` works as a workaround.

Closes #73643.

Release note: None

73670: kvserver: fix bug that could lead to invalid non-voter rebalance attempts r=aayushshah15 a=aayushshah15

A bug was introduced in #61239
which made it such that non-voters could, in certain edge case scenarios,
attempt to rebalance to stores that already had a voting replica for the same
range. However, this bug wouldn't actually materialize into user observable
symptoms because such rebalance decisions were rejected before the
`replicateQueue` ever tried to execute them. See:
https://github.com/cockroachdb/cockroach/blob/8bcd038ab6b76f6bf5894f2ffbbdbf10108b4c32/pkg/kv/kvserver/replicate_queue.go#L1257-L1271.
This commit cleans this up and adds a test demonstrating one such edge case.

Release note: None

73689: bazel: introduce a new `nonogo` config, update `dev` script r=jlinder a=rickystewart

The introduction of `nogo` actually makes building the full `dev` binary
rather heavyweight, which vastly increases the surface area of possible
build failures. Add a new lightweight `nonogo` config and reference it
in an error message so that people can debug their way around the issue.

Close #73656.

Release note: None

Co-authored-by: Ricky Stewart <[email protected]>
Co-authored-by: Aayush Shah <[email protected]>