Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kvserver: fix up/downreplication status of non-voters in range report #60029

Merged
merged 1 commit into from
Mar 1, 2021

Conversation

aayushshah15
Copy link
Contributor

@aayushshah15 aayushshah15 commented Feb 9, 2021

This commit updates the calcRangeCounter() method to consider non-voting
replicas when determining whether a range is over or under-replicated.
Without this change, ranges that have any non-voting replicas would show
up as under-replicated on the DB console range report page.

Release justification: necessary for observability into non-voting replicas

Release note: None

@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Member

@nvanbenschoten nvanbenschoten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 3 of 3 files at r1.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @aayushshah15)


pkg/kv/kvserver/allocator.go, line 371 at r1 (raw file):

	}
	if need < 0 {
		return 0

minor nit: need = 0 to straighten out control flow and give this a comment like // Must be non-negative.

Or:

return max(
    min(
        zoneConfigNonVoterCount, 
        clusterNodes - numVoters, // cannot place on nodes with voters
    ), 
    0, // must be non-negative
)

pkg/kv/kvserver/replica_metrics.go, line 209 at r1 (raw file):

// replica is determined by checking its node in the provided liveness map.
func calcLiveNonVoterReplicas(desc *roachpb.RangeDescriptor, livenessMap liveness.IsLiveMap) int {
	var live int

Could we eliminate the duplication here with a helper:

func calcLiveVoterReplicas(desc *roachpb.RangeDescriptor, livenessMap liveness.IsLiveMap) int {
    return calcLiveReplicas(desc.Replicas().VoterDescriptors(), livenessMap)
}

func calcLiveNonVoterReplicas(desc *roachpb.RangeDescriptor, livenessMap liveness.IsLiveMap) int {
    ...
}

func calcLiveReplicas(replicas []roachpb.ReplicaDescriptor, livenessMap liveness.IsLiveMap) int {
	var live int
	for _, rd := range replicas {
		if livenessMap[rd.NodeID].IsLive {
			live++
		}
	}
	return live
}

pkg/kv/kvserver/replica_metrics_test.go, line 63 at r1 (raw file):

	{
		ctr, down, under, over := calcRangeCounter(11, desc, liveness.IsLiveMap{

Do we need tests where non-voting replicas result in underreplication and overreplication?

This commit updates the `calcRangeCounter()` method to consider non-voting
replicas when determining whether a range is over or under-replicated.
Without this change, ranges that have any non-voting replicas would show
up as under-replicated on the DB console range report page.

Release note: None

Release justification: fixes misreporting of replication status of a range
Copy link
Contributor Author

@aayushshah15 aayushshah15 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTR!

bors r+

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @nvanbenschoten)


pkg/kv/kvserver/allocator.go, line 371 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

minor nit: need = 0 to straighten out control flow and give this a comment like // Must be non-negative.

Or:

return max(
    min(
        zoneConfigNonVoterCount, 
        clusterNodes - numVoters, // cannot place on nodes with voters
    ), 
    0, // must be non-negative
)

done.


pkg/kv/kvserver/replica_metrics.go, line 209 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Could we eliminate the duplication here with a helper:

func calcLiveVoterReplicas(desc *roachpb.RangeDescriptor, livenessMap liveness.IsLiveMap) int {
    return calcLiveReplicas(desc.Replicas().VoterDescriptors(), livenessMap)
}

func calcLiveNonVoterReplicas(desc *roachpb.RangeDescriptor, livenessMap liveness.IsLiveMap) int {
    ...
}

func calcLiveReplicas(replicas []roachpb.ReplicaDescriptor, livenessMap liveness.IsLiveMap) int {
	var live int
	for _, rd := range replicas {
		if livenessMap[rd.NodeID].IsLive {
			live++
		}
	}
	return live
}

done.


pkg/kv/kvserver/replica_metrics_test.go, line 63 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Do we need tests where non-voting replicas result in underreplication and overreplication?

done.

craig bot pushed a commit that referenced this pull request Mar 1, 2021
60029: kvserver: fix up/downreplication status of non-voters in range report r=aayushshah15 a=aayushshah15

This commit updates the `calcRangeCounter()` method to consider non-voting
replicas when determining whether a range is over or under-replicated.
Without this change, ranges that have any non-voting replicas would show
up as under-replicated on the DB console range report page.

Release note: None

Co-authored-by: Aayush Shah <[email protected]>
@craig
Copy link
Contributor

craig bot commented Mar 1, 2021

Build failed:

@aayushshah15
Copy link
Contributor Author

It was missing a release justification.

bors r+

@craig
Copy link
Contributor

craig bot commented Mar 1, 2021

Build failed (retrying...):

@craig
Copy link
Contributor

craig bot commented Mar 1, 2021

Build succeeded:

@craig craig bot merged commit 61859f7 into cockroachdb:master Mar 1, 2021
craig bot pushed a commit that referenced this pull request Mar 4, 2021
61239: kvserver: handle range-level rebalancing of non-voting replicas r=aayushshah15 a=aayushshah15

First commit from #60029

This PR adds the necessary machinery within the allocator and the
plumbing within the replicateQueue to be able to rebalance non-voting
replicas.

A few things to note are:

Voting replicas are allowed to rebalance to nodes that already have a
non-voting replica. This will trigger what is, essentially, a metadata
operation: flipping the replica type of the corresponding non-voter to a
voter and the type of the voter to a non_voter. Notably, this PR changes
voter allocation code to also be able to do this.

Computation of diversity scores works slightly differently for voting
replicas and non-voting replicas. Non-voting replicas compute candidate
diversity scores based off of all existing replicas, whereas voting
replicas compute candidate diversity scores off of just the set of
voting replicas.

This PR does not yet add support in the store rebalancer to make rebalancing
decisions for non-voting replicas and we don't yet support removal/
replacement of dead and decommissioning non-voters. These things are
coming in follow-up PRs.

Release justification: necessary for non-voting replicas

Release note: None


61373: kvserver, timeutil: fix some Timer user-after-Stops r=andreimatei a=andreimatei

Two guys were continuing to use a Timer after Stop()ing it, which is
illegal.

Release note: None
Release justification: Bug fix.

Co-authored-by: Aayush Shah <[email protected]>
Co-authored-by: Andrei Matei <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants