[Cluster] Auto rebalance #4052

Closed
shaharmor opened this issue Jun 13, 2017 · 4 comments

Comments

@shaharmor
Contributor

Copied from #3009

On node join/leave/fail, the cluster should automatically reallocate unallocated slots to other masters in the cluster.

Specifically:

  • On join, the new master should get an even share of the available slots.
  • On leave, the old master should rebalance its own slots evenly among all other masters before actually leaving the cluster.
  • On fail, there are two things to consider:
    • If the master has any slaves: a slave will take over its slots, and the master/slave roles across the cluster might be reshuffled (see next section).
    • If the master has no slaves: the now-missing slots should be automatically reallocated evenly among the remaining masters in the cluster.
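
As a rough illustration of the three cases above, here is a minimal Python sketch of how an even reallocation plan could be computed. It is not Redis code: `even_targets` and `plan_rebalance` are hypothetical helper names, and the only Redis fact relied on is that a cluster has 16384 hash slots. It assumes each slot has at most one owner.

```python
TOTAL_SLOTS = 16384  # number of hash slots in a Redis Cluster


def even_targets(masters):
    """Target slot count per master; the remainder goes to the first names in sorted order."""
    base, extra = divmod(TOTAL_SLOTS, len(masters))
    return {m: base + (1 if i < extra else 0) for i, m in enumerate(sorted(masters))}


def plan_rebalance(owned, masters):
    """Return a list of (slot, source, destination) moves.

    owned:   dict mapping node name -> set of slots it currently holds.
             Nodes missing from `masters` are treated as leaving.
    masters: names of the masters that should hold slots afterwards.
    A source of None means the slot was unallocated (e.g. a failed,
    slave-less master).
    """
    targets = even_targets(masters)
    assigned = set()
    pool = []  # slots that have to move, paired with their current owner
    for node, slots in owned.items():
        assigned |= slots
        keep = targets.get(node, 0)  # leaving nodes keep nothing
        pool.extend((slot, node) for slot in sorted(slots)[keep:])
    pool.extend((slot, None) for slot in range(TOTAL_SLOTS) if slot not in assigned)

    moves = []
    remaining = iter(pool)
    for node in sorted(masters):
        deficit = targets[node] - len(owned.get(node, ()))
        for _ in range(max(deficit, 0)):
            slot, source = next(remaining)
            moves.append((slot, source, node))
    return moves
```

A join is modelled by listing the new master with an empty slot set, a leave by omitting the departing master from `masters`, and a slave-less failure by omitting the failed master from `owned` so that its slots count as unallocated.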
@jcstanaway

> On leave, the old master should rebalance its own slots evenly among all other masters before actually leaving the cluster.

This seems to imply a controlled exit. What about unplanned scenarios, where the old master won't be able to initiate a rebalance? A scenario of concern is that the server failed and the master is gone. In the event of a slave-less master, the remaining masters should, after a configurable time period, trigger a rebalance. The time period is important because, depending on the deployment environment (e.g., Kubernetes), the master could recover quickly enough that a rebalance shouldn't be performed (and "quickly" is subjective, hence configurable).
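
To make the configurable time period concrete, here is a small Python sketch of the bookkeeping the remaining masters (or an external manager) might do. `FailureTracker` and its methods are hypothetical names for illustration, not part of Redis; how failures are actually detected is outside the sketch.

```python
import time


class FailureTracker:
    """Tracks failed slave-less masters and reports which ones have been
    down longer than a configurable grace period."""

    def __init__(self, grace_seconds, clock=time.monotonic):
        self.grace_seconds = grace_seconds
        self._clock = clock          # injectable so the logic can be tested
        self._down_since = {}        # node name -> time the failure was first seen

    def mark_down(self, node):
        # Keep the earliest failure time if the node is reported down repeatedly.
        self._down_since.setdefault(node, self._clock())

    def mark_up(self, node):
        # The node recovered in time, so no rebalance is needed for it.
        self._down_since.pop(node, None)

    def due_for_rebalance(self):
        """Names of nodes that have been down for at least the grace period."""
        now = self._clock()
        return sorted(
            node
            for node, since in self._down_since.items()
            if now - since >= self.grace_seconds
        )
```

In an environment where failed pods are typically rescheduled quickly, `grace_seconds` would be set above the expected recovery time, so that a master that comes back promptly never triggers slot movement.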

@BarthV

BarthV commented Jun 29, 2018

+1 for this feature!
Regarding @ccs018's post: the leave operation should be announced by the leaving node itself and handled specifically by the rest of the cluster (like Cassandra's nodetool decommission CLI command).

Another approach would be to not implement this feature at all and to let only the cluster manager handle slot rebalancing and data reshuffling. Currently the cluster manager (redis-trib.rb or any third-party cluster manager) is a "one shot" CLI command, but we can imagine that in the future it will be a long-running stateless application that exposes a REST API to handle operations. This is a more Kubernetes-compliant vision, as this cluster manager could be integrated into a CRD (a.k.a. Operator).

@shaharmor
Contributor Author

I think this is something that can now be implemented using Redis Modules, with the new timers and cluster module support.

@madolson
Contributor

Closing as duplicate in favor of #3009.

4 participants