Implement Leader Election for HA Mode #135
Comments
Thanks @aminmr. I am not sure about this. k8s-cleaner originally had leader election, but I removed it. The reason is that I believe with cleaner we are more likely to hit scaling limits than availability limits (and in the end it does not have to respond to other services). When we do hit scaling limits, my plan is to introduce sharding, so different cleaner instances can process different Cleaner resources in parallel based on some annotation. Let's keep this open though. If we don't take that path, we will add leader election back. Thank you again!
Thanks, @gianlucam76! I appreciate your explanation, but I have a couple of questions regarding the future implementation and design decisions for the Cleaner. Could you clarify why leader election isn't a good solution for this project? Thanks again for your time, and I look forward to your thoughts on this!
Hi @aminmr, regarding sharding, I am planning on using the same approach I used in Sveltos here. It will require some manual configuration (as I don't have a shard controller like in Sveltos), but that is the idea. In general leader election is great (though it has the cost of running 3 pods instead of 1, with 2 pods doing nothing most of the time). But I see it as more valuable for a service that needs to respond to other services (where you cannot afford having it down for 30 seconds or so).
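For context, here is a minimal sketch of how annotation-based sharding could look in a controller-runtime operator. The annotation key, environment variable, and wiring are hypothetical placeholders, not the actual k8s-cleaner or Sveltos implementation:

```go
package main

import (
	"os"

	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/predicate"
)

// shardPredicate keeps only the objects whose shard annotation matches this
// instance's shard key, so multiple cleaner replicas can split the Cleaner
// resources among themselves instead of competing for all of them.
func shardPredicate(annotationKey, myShard string) predicate.Predicate {
	return predicate.NewPredicateFuncs(func(obj client.Object) bool {
		shard, ok := obj.GetAnnotations()[annotationKey]
		if !ok {
			// Unannotated objects fall to the default (empty-shard) instance.
			return myShard == ""
		}
		return shard == myShard
	})
}

func main() {
	// Each replica would be deployed with its own shard key, e.g. via an env var.
	pred := shardPredicate("cleaner.example.io/shard", os.Getenv("CLEANER_SHARD"))
	_ = pred
	// The predicate would then be passed to the controller builder, e.g.
	// ctrl.NewControllerManagedBy(mgr).For(&Cleaner{}, builder.WithPredicates(pred)).Complete(r)
}
```

Each instance only reconciles its own slice of Cleaner objects, so no coordination (and no idle standby pods) is needed; the trade-off is the manual assignment of shard keys mentioned above.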
Description
Currently, the k8s-cleaner operator is deployed with a single replica by default and does not support high availability (HA). If the number of replicas is increased, multiple pods may attempt to take actions simultaneously, which could lead to conflicts or redundant operations on the cluster.
To address this, I suggest implementing leader election for HA mode. Many open-source projects utilize Kubernetes' Lease mechanism for this purpose, allowing only one pod to act as the leader at any given time. This would prevent multiple instances from interfering with each other when multiple replicas are running.
Here are the relevant Kubernetes docs: Kubernetes Lease Mechanism.
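If the operator is built with controller-runtime (the typical kubebuilder-style setup), Lease-based leader election can be switched on through the manager options. A minimal sketch, assuming that framework; the lock name and namespace are placeholders:

```go
package main

import (
	"os"

	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/log/zap"
)

func main() {
	ctrl.SetLogger(zap.New())

	// Enabling leader election makes the manager acquire a Lease object before
	// starting its controllers; standby replicas block until the current
	// holder gives up or loses the lease.
	mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{
		LeaderElection:          true,
		LeaderElectionID:        "k8s-cleaner-leader-election", // hypothetical lock name
		LeaderElectionNamespace: "projectsveltos",              // assumed install namespace
	})
	if err != nil {
		os.Exit(1)
	}

	// Controllers registered with mgr only run on the pod holding the lease.
	if err := mgr.Start(ctrl.SetupSignalHandler()); err != nil {
		os.Exit(1)
	}
}
```

With this in place, extra replicas stay idle until the current leader loses the lease, which is the standby behavior described in the proposal below.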
Proposed Solution:
Introduce leader election logic using Kubernetes Leases.
Ensure that only the pod holding the active lease performs actions on the cluster, while other replicas remain in a standby mode.
I am happy to volunteer to implement this feature for the project.
I am looking forward to your thoughts! Thanks!