storage: rebalancing only considers range leader store #5737

Closed
petermattis opened this issue Mar 30, 2016 · 3 comments

@petermattis
Collaborator

Rebalancing is performed by the "replicate queue" (storage/replicate_queue.go), which periodically checks every range the local store holds the leader replica for and calls Allocator.ShouldRebalance. ShouldRebalance retrieves the store descriptor for the leader replica and calls balancer.improve, which compares that store's capacity/range-count against the cluster averages. If rebalancing should be performed, a new replica is added, and a subsequent pass of the replicate queue takes care of removing the "worst" replica from the now over-replicated range.

There is a glitch in the above, though. If the leader replica for a range is on a "good" store while one of the other replicas is on a "bad" store, we won't perform rebalancing. balancer.improve should consider the worst replica store when making the rebalancing decision.
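
A minimal sketch of the difference, using made-up types, names, and thresholds rather than the actual allocator code (storeDesc, shouldRebalanceFrom, and worstReplicaStore below are illustrative only):

```go
package main

import "fmt"

// storeDesc is an illustrative stand-in for a store descriptor.
type storeDesc struct {
	storeID    int
	rangeCount int
}

// shouldRebalanceFrom reports whether a store looks overfull relative to the
// cluster-wide mean range count (a simplified version of the
// capacity/range-count comparison described above).
func shouldRebalanceFrom(s storeDesc, meanRanges float64) bool {
	const threshold = 1.05 // hypothetical 5% tolerance
	return float64(s.rangeCount) > meanRanges*threshold
}

// worstReplicaStore returns the replica store with the highest range count;
// considering this store (not just the leader's) is the fix proposed above.
func worstReplicaStore(replicaStores []storeDesc) storeDesc {
	worst := replicaStores[0]
	for _, s := range replicaStores[1:] {
		if s.rangeCount > worst.rangeCount {
			worst = s
		}
	}
	return worst
}

func main() {
	leader := storeDesc{storeID: 1, rangeCount: 90}
	follower := storeDesc{storeID: 2, rangeCount: 140} // the "bad" store
	mean := 100.0

	// Checking only the leader's store misses the rebalancing opportunity...
	fmt.Println("leader only:", shouldRebalanceFrom(leader, mean)) // false
	// ...while checking the worst replica store catches it.
	worst := worstReplicaStore([]storeDesc{leader, follower})
	fmt.Println("worst replica:", shouldRebalanceFrom(worst, mean)) // true
}
```

With the numbers above, the leader-only check finds nothing to do, while the worst-replica check correctly flags the range for rebalancing.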

@mrtracy
Contributor

mrtracy commented Apr 5, 2016

I don't think there's any reason for the replicate queue to run exclusively on leaders. Everything it does is already safe against races.

In fact, I had thought this was already the case (that replicate_queue was considering non-leader replicas).

@bdarnell
Contributor

bdarnell commented Apr 5, 2016

It's safe to run replicate_queue on followers, but it's a bit wasteful for two nodes to try to make ChangeReplicas transactions simultaneously. I think it would be best for the replication queue to continue to run on the leader but to consider the health of all replicas (assuming that information is available at the leader).

Allocator.ShouldRebalance is called for each replica in shouldQueue, so as long as the store is the leader of some ranges it will find opportunities for rebalancing. Even though this helps avoid the problem pointed out in this issue, it's not ideal, because it's more disruptive to rebalance a range away from its leader than away from a follower. So it would be better to do the opposite and prefer to find rebalancing opportunities in follower replicas.
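
A rough sketch of the "prefer followers" idea under the same caveat: the replica type and the pickRebalanceSource helper below are hypothetical, not the real shouldQueue/allocator API.

```go
package main

import "fmt"

// replica is an illustrative stand-in for a replica descriptor.
type replica struct {
	storeID  int
	isLeader bool
	overfull bool // this replica's store is above the cluster mean
}

// pickRebalanceSource chooses which replica of a range to move away:
// followers on overfull stores are preferred, and the leader is used only as
// a last resort because moving it is more disruptive.
func pickRebalanceSource(replicas []replica) (replica, bool) {
	var leaderCandidate *replica
	for i := range replicas {
		if !replicas[i].overfull {
			continue
		}
		if !replicas[i].isLeader {
			return replicas[i], true // prefer a follower
		}
		leaderCandidate = &replicas[i]
	}
	if leaderCandidate != nil {
		return *leaderCandidate, true // disruptive, but still worth doing
	}
	return replica{}, false
}

func main() {
	replicas := []replica{
		{storeID: 1, isLeader: true, overfull: true},
		{storeID: 2, isLeader: false, overfull: true},
		{storeID: 3, isLeader: false, overfull: false},
	}
	src, ok := pickRebalanceSource(replicas)
	fmt.Println(ok, src.storeID) // true 2: the overfull follower is moved
}
```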

@petermattis petermattis added this to the Q2 milestone Apr 11, 2016
@petermattis petermattis modified the milestones: Q2, Q3 Jul 11, 2016
@bdarnell bdarnell assigned BramGruneir and unassigned cuongdo Jul 18, 2016
BramGruneir added a commit to BramGruneir/cockroach that referenced this issue Jul 19, 2016
This is the preventative measure for stopping a down-replication of the range
lease holder's replica. But I'd like to add either a mechanism to suggest
giving up the lease when this occurs, or return the 2nd worst replica as well
so that one can be removed instead.

Part of cockroachdb#5737
BramGruneir added a commit to BramGruneir/cockroach that referenced this issue Jul 19, 2016
This is the preventative measure for stopping a down-replication of the range
lease holder's replica.  RemoveTarget will pick the worst replica from the list
that does not contain the lease holder's replica.

Part of cockroachdb#5737
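
A small sketch of the behavior the commit message describes, again with hypothetical names (removeTarget below is not the actual function or signature): the lease holder's replica is filtered out of the candidates and the most loaded remaining store is picked for removal.

```go
package main

import "fmt"

// replicaDesc is an illustrative stand-in for a replica descriptor.
type replicaDesc struct {
	storeID    int
	rangeCount int
}

// removeTarget returns the store ID of the replica to remove, skipping the
// lease holder's replica, and false if no other candidate exists.
func removeTarget(replicas []replicaDesc, leaseHolderStoreID int) (int, bool) {
	best := -1
	bestCount := -1
	for _, r := range replicas {
		if r.storeID == leaseHolderStoreID {
			continue // never down-replicate the lease holder's replica
		}
		if r.rangeCount > bestCount {
			best, bestCount = r.storeID, r.rangeCount
		}
	}
	return best, best != -1
}

func main() {
	replicas := []replicaDesc{
		{storeID: 1, rangeCount: 150}, // lease holder, most loaded
		{storeID: 2, rangeCount: 120},
		{storeID: 3, rangeCount: 90},
	}
	target, ok := removeTarget(replicas, 1)
	fmt.Println(ok, target) // true 2: worst non-lease-holder replica
}
```
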
@BramGruneir
Member

There are no plans right now to consider moving the range lease in this situation or to have the replicate queue process all replicas rather than just lease holders. I think we can consider this closed for now.

Closing this issue.
