force new cluster, learner to leader #13213
Thanks for the PR! Do you mind explaining what you meant by the following:
when the learner passes new_force_cluster, it happens.
What are some of the downsides of adding this?
@@ -646,5 +643,50 @@ func createConfigChangeEnts(lg *zap.Logger, ids []uint64, self uint64, term, ind
next++
}

promoteNodeFunc := func(id uint64) {
This seems like a big change to me; I would expect at least some tests for it.
I want to promote the learner to leader when all the other nodes are down, so I pass --force-new-cluster when restarting the learner. I expect the node's role to change from learner to leader and the node to run normally. Instead, when I restart the node, I get a panic.
A follower can be restarted with the same command without problems, so I think this is a bug.
I have 4 shell scripts that may help you test it.
3.learner_force_new_cluster.sh
This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 21 days if no further activity occurs. Thank you for your contributions.
When a business needs cross-data-center disaster recovery for etcd, we want to build an etcd cluster in one data center and set up a separate learner node in another data center as the disaster recovery node. When the primary data center fails, we can manually force-promote the learner to a leader so that it keeps serving the business, which keeps the cluster available through a data-center-level failure. Of course, the reduced data consistency is acceptable for the business.
But when we tried this with etcd 3.5, we found that once the leader and follower nodes are down, when the learner passes new_force_cluster, it happens: the learner panics on restart.
This panic prevents me from forcibly promoting the learner to leader to complete cross-data-center disaster recovery.
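To make the intended behavior concrete, here is a minimal, self-contained sketch of the idea. It is not the code in this PR and does not use etcd's or raft's real types: `member`, `change`, `changeType`, and `forceNewClusterChanges` are all hypothetical names invented for illustration. It assumes that forcing a new cluster amounts to synthesizing membership changes that remove every other member and, if the surviving node is a learner, promote it to a voter first.

```go
package main

import "fmt"

// changeType is a hypothetical stand-in for a membership change kind.
type changeType int

const (
	removeNode changeType = iota
	promoteLearner
)

// member is a simplified, illustrative view of a cluster member.
type member struct {
	ID        uint64
	IsLearner bool
}

// change is one synthetic membership change.
type change struct {
	Type   changeType
	NodeID uint64
}

// forceNewClusterChanges returns the changes that would leave `self` as the
// only voting member: promote self if it is a learner, then remove all others.
func forceNewClusterChanges(members []member, self uint64) []change {
	var out []change
	// Promote first so the sketch never models a state with no voter.
	for _, m := range members {
		if m.ID == self && m.IsLearner {
			out = append(out, change{Type: promoteLearner, NodeID: self})
		}
	}
	for _, m := range members {
		if m.ID != self {
			out = append(out, change{Type: removeNode, NodeID: m.ID})
		}
	}
	return out
}

func main() {
	// Three voters (1, 2, 3) and one learner (4); node 4 is the survivor.
	members := []member{{ID: 1}, {ID: 2}, {ID: 3}, {ID: 4, IsLearner: true}}
	for _, c := range forceNewClusterChanges(members, 4) {
		fmt.Printf("type=%d node=%d\n", c.Type, c.NodeID)
	}
}
```

For a follower, the same function yields only removals, which matches the observation that a follower already restarts fine with --force-new-cluster. The ordering (promotion before removals) is a choice made for this sketch; the excerpt of the diff does not show how the PR orders these entries.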