Documentation needed for cloud-cluster case #5418
After you bootstrap an etcd cluster, any new etcd member (one without etcd data) that wants to join the bootstrapped cluster should always set this to existing.
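For illustration, a minimal sketch of what that might look like as environment configuration (the member name, hostnames, and port below are placeholders, and the new member would first be registered through the reconfiguration API):

```sh
# Hypothetical new member "etcd3" joining an already-bootstrapped cluster.
# It has no data on disk, so it joins with state "existing", not "new".
ETCD_NAME=etcd3
ETCD_INITIAL_CLUSTER="etcd0=http://etcd0.example.com:2380,etcd1=http://etcd1.example.com:2380,etcd2=http://etcd2.example.com:2380,etcd3=http://etcd3.example.com:2380"
ETCD_INITIAL_CLUSTER_STATE=existing
```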
It only matters for the bootstrap case, so both ways you mentioned work. After bootstrap, the cluster size is controlled explicitly by external tools or by a human via the etcd reconfiguration API.
I do not follow this question. Can you explain more?
If you can use DNS and can easily manage it, then great! Use DNS for the peer URL. Then, as long as the data still exists, machine replacement will not involve any reconfiguration; you only need to change the DNS record.
In most cases (when you do not lose data), human intervention is not required. If you lose your data, then the etcd member is lost and additional work is required. The etcd reconfiguration API is programmable, so you can write a program against it.
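Since the reply mentions programming against the reconfiguration API, here is a rough sketch of member replacement using etcdctl (v2-era syntax; the member ID, name, and URL are placeholders, and a program could drive the same steps through the members API):

```sh
# Replace a member whose data was lost (ID and URL are hypothetical).
etcdctl member list                                      # find the ID of the dead member
etcdctl member remove 8211f1d0f64f3269                   # remove the lost member
etcdctl member add etcd1 http://etcd1.example.com:2380   # register its replacement
# Then start the replacement with ETCD_INITIAL_CLUSTER_STATE=existing, as noted above.
```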
@justinsb Does this answer your questions?
Sorry for the delay in replying, and thanks for confirming that repointing DNS works when we replace nodes.

On the first two questions, I'm not entirely clear on how to bootstrap a cluster then. What I will do is have 3 EBS volumes (in 3 AZs) and 3 DNS names (etcd0, etcd1, etcd2). I can statically configure each etcd with ETCD_INITIAL_CLUSTER for the 3 nodes (ETCD_INITIAL_CLUSTER=etcd0,etcd1,etcd2). Then I will arrange to attach the EBS volumes to the nodes and repoint the DNS names (I likely won't be able to use k8s PetSets, but you can imagine that we are using k8s PetSets).

So how do I set ETCD_INITIAL_CLUSTER_STATE in this scenario? And does setting ETCD_INITIAL_CLUSTER=etcd0,etcd1,etcd2 mean that the cluster will only initialize once a quorum of members (2 nodes) comes online? (I'm pretty sure that's what I want.)
For static bootstrapping, always set this to new (the default is new, so you do not need to set it).
Yes. No writes will go through until a quorum is up.
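Putting those two answers together, the static bootstrap described above might look roughly like this on each node (a sketch only; the domain and ports are assumptions, and note that ETCD_INITIAL_CLUSTER takes name=peerURL pairs rather than bare names):

```sh
# Hypothetical static bootstrap on node etcd0 (etcd1/etcd2 differ only in name and advertised URLs).
ETCD_NAME=etcd0
ETCD_INITIAL_CLUSTER="etcd0=http://etcd0.example.com:2380,etcd1=http://etcd1.example.com:2380,etcd2=http://etcd2.example.com:2380"
ETCD_INITIAL_CLUSTER_STATE=new
ETCD_INITIAL_ADVERTISE_PEER_URLS=http://etcd0.example.com:2380
ETCD_LISTEN_PEER_URLS=http://0.0.0.0:2380
ETCD_ADVERTISE_CLIENT_URLS=http://etcd0.example.com:2379
ETCD_LISTEN_CLIENT_URLS=http://0.0.0.0:2379
```

With three members, the cluster starts serving writes once two of them are up and have elected a leader.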
@justinsb Does my reply answer your question?
@justinsb I am closing this due to low activity. Reopen if you have a follow-up.
Sorry for the delay & thanks! I can confirm that this setup does work perfectly (so far). I'm working on more testing and documentation, but thanks for the pointers.
It isn't entirely clear how team-etcd would recommend running a cluster on AWS, GCE, or other clouds, where we have things like easily programmable DNS and persistent volumes. See for example kubernetes/kubernetes#19443
@philips suggested that it would be possible to run N instances with N persistent volumes, and to repoint DNS instead of performing cluster replacement (for normal operation).
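As a hedged sketch of that DNS-repointing step on AWS (the hosted zone ID, record name, TTL, and IP below are placeholders, not a recommendation from this thread):

```sh
# Hypothetical Route 53 update: point etcd1's peer name at the replacement instance.
aws route53 change-resource-record-sets \
  --hosted-zone-id Z123EXAMPLE \
  --change-batch '{
    "Changes": [{
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "etcd1.example.com",
        "Type": "A",
        "TTL": 60,
        "ResourceRecords": [{"Value": "10.0.1.42"}]
      }
    }]
  }'
```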
It would be great to get some documentation for the "recommended" way of doing things; in particular, I had these 4 questions on the DNS approach: kubernetes/kubernetes#19443 (comment)
Copying those 4 questions here:
In any case, the docs are great for the bare-metal case, where operator intervention is required to replace cluster members, but it would be great if they also covered the programmable-infrastructure case where we can hopefully auto-recover from most/many failure scenarios.