storage: rebalancing only considers range leader store #5737

Closed
petermattis opened this issue Mar 30, 2016 · 3 comments

@petermattis
Collaborator

Rebalancing is performed by the "replicate queue" (storage/replicate_queue.go), which periodically checks every range the local store holds the leader replica for and calls Allocator.ShouldRebalance. ShouldRebalance retrieves the store descriptor for the leader replica and calls balancer.improve, which compares that store's capacity/range-count against the cluster averages. If rebalancing should be performed, a new replica is added, and a subsequent pass of the replicate queue takes care of removing the "worst" replica from the now over-replicated range.

There is a glitch in the above, though. If the leader replica for a range is on a "good" store while one of the other replicas is on a "bad" store, we won't perform rebalancing. balancer.improve should consider the worst replica store when making the rebalancing decision.
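
A minimal sketch of the difference, using made-up types, names, and thresholds rather than the actual allocator code (storeDesc, shouldRebalanceFrom, and worstReplicaStore below are illustrative only):

```go
package main

import "fmt"

// storeDesc is an illustrative stand-in for a store descriptor.
type storeDesc struct {
	storeID    int
	rangeCount int
}

// shouldRebalanceFrom reports whether a store looks overfull relative to the
// cluster-wide mean range count (a simplified version of the
// capacity/range-count comparison described above).
func shouldRebalanceFrom(s storeDesc, meanRanges float64) bool {
	const threshold = 1.05 // hypothetical 5% tolerance
	return float64(s.rangeCount) > meanRanges*threshold
}

// worstReplicaStore returns the replica store with the highest range count;
// considering this store (not just the leader's) is the fix proposed above.
func worstReplicaStore(replicaStores []storeDesc) storeDesc {
	worst := replicaStores[0]
	for _, s := range replicaStores[1:] {
		if s.rangeCount > worst.rangeCount {
			worst = s
		}
	}
	return worst
}

func main() {
	leader := storeDesc{storeID: 1, rangeCount: 90}
	follower := storeDesc{storeID: 2, rangeCount: 140} // the "bad" store
	mean := 100.0

	// Checking only the leader's store misses the rebalancing opportunity...
	fmt.Println("leader only:", shouldRebalanceFrom(leader, mean)) // false
	// ...while checking the worst replica store catches it.
	worst := worstReplicaStore([]storeDesc{leader, follower})
	fmt.Println("worst replica:", shouldRebalanceFrom(worst, mean)) // true
}
```

With the numbers above, the leader-only check finds nothing to do, while the worst-replica check correctly flags the range for rebalancing.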

@mrtracy
Contributor

mrtracy commented Apr 5, 2016

I don't think there's any reason for the replicate queue to run exclusively on leaders. Everything it does is already safe against races.

In fact, I had thought this was already the case (that replicate_queue was considering non-leader replicas).

@bdarnell
Contributor

bdarnell commented Apr 5, 2016

It's safe to run replicate_queue on followers, but it's a bit wasteful for two nodes to try to make ChangeReplicas transactions simultaneously. I think it would be best for the replication queue to continue to run on the leader but to consider the health of all replicas (assuming that information is available at the leader).

Allocator.ShouldRebalance is called for each replica in shouldQueue, so as long as the store is the leader of some ranges it will find opportunities for rebalancing. Even though this helps avoid the problem pointed out in this issue, it's not ideal, because it's more disruptive to rebalance a range away from its leader than away from a follower. So it would be better to do the opposite and prefer to find rebalancing opportunities in follower replicas.
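
A rough sketch of the "prefer followers" idea under the same caveat: the replica type and the pickRebalanceSource helper below are hypothetical, not the real shouldQueue/allocator API.

```go
package main

import "fmt"

// replica is an illustrative stand-in for a replica descriptor.
type replica struct {
	storeID  int
	isLeader bool
	overfull bool // this replica's store is above the cluster mean
}

// pickRebalanceSource chooses which replica of a range to move away:
// followers on overfull stores are preferred, and the leader is used only as
// a last resort because moving it is more disruptive.
func pickRebalanceSource(replicas []replica) (replica, bool) {
	var leaderCandidate *replica
	for i := range replicas {
		if !replicas[i].overfull {
			continue
		}
		if !replicas[i].isLeader {
			return replicas[i], true // prefer a follower
		}
		leaderCandidate = &replicas[i]
	}
	if leaderCandidate != nil {
		return *leaderCandidate, true // disruptive, but still worth doing
	}
	return replica{}, false
}

func main() {
	replicas := []replica{
		{storeID: 1, isLeader: true, overfull: true},
		{storeID: 2, isLeader: false, overfull: true},
		{storeID: 3, isLeader: false, overfull: false},
	}
	src, ok := pickRebalanceSource(replicas)
	fmt.Println(ok, src.storeID) // true 2: the overfull follower is moved
}
```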

@petermattis petermattis added this to the Q2 milestone Apr 11, 2016
@petermattis petermattis modified the milestones: Q2, Q3 Jul 11, 2016
@bdarnell bdarnell assigned BramGruneir and unassigned cuongdo Jul 18, 2016
BramGruneir added a commit to BramGruneir/cockroach that referenced this issue Jul 19, 2016
This is the preventative measure for stopping a down-replication of the range
lease holder's replica. But I'd like to add either a mechanism to suggest
giving up the lease when this occurs, or return the 2nd worst replica as well
so that one can be removed instead.

Part of cockroachdb#5737
BramGruneir added a commit to BramGruneir/cockroach that referenced this issue Jul 19, 2016
This is the preventative measure for stopping a down-replication of the range
lease holder's replica.  RemoveTarget will pick the worst replica from the list
that does not contain the lease holder's replica.

Part of cockroachdb#5737
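
A small sketch of the behavior the commit message describes, again with hypothetical names (removeTarget below is not the actual function or signature): the lease holder's replica is filtered out of the candidates and the most loaded remaining store is picked for removal.

```go
package main

import "fmt"

// replicaDesc is an illustrative stand-in for a replica descriptor.
type replicaDesc struct {
	storeID    int
	rangeCount int
}

// removeTarget returns the store ID of the replica to remove, skipping the
// lease holder's replica, and false if no other candidate exists.
func removeTarget(replicas []replicaDesc, leaseHolderStoreID int) (int, bool) {
	best := -1
	bestCount := -1
	for _, r := range replicas {
		if r.storeID == leaseHolderStoreID {
			continue // never down-replicate the lease holder's replica
		}
		if r.rangeCount > bestCount {
			best, bestCount = r.storeID, r.rangeCount
		}
	}
	return best, best != -1
}

func main() {
	replicas := []replicaDesc{
		{storeID: 1, rangeCount: 150}, // lease holder, most loaded
		{storeID: 2, rangeCount: 120},
		{storeID: 3, rangeCount: 90},
	}
	target, ok := removeTarget(replicas, 1)
	fmt.Println(ok, target) // true 2: worst non-lease-holder replica
}
```
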
@BramGruneir
Member

There are no plans right now to consider moving the range lease in this situation or to have the replicate queue process all replicas rather than just lease holders. I think we can consider this closed for now.

Closing this issue.
