Ping-Pong service management in canary updates #1453
Comments
Found another window in which the Rollout's Pods never become ready when readiness gates are involved:
Hi @jessesuen, here is an example of the experiment with canary:
We just talked with @harikrongali about this and see no issue here.
Summary
Spawning this from #1283.
The AWS Load Balancer Controller (and possibly other ingress controllers) has a problem when the selectors of Services behind an AWS Ingress are modified: changing Service selectors prevents readiness gates from being injected properly. The behavior is highly dependent on the ingress controller implementation, but with AWS Load Balancer Controller v2.x, the controller only injects pod readiness gates if services are reachable by an AWS Ingress. Pods are considered reachable by an ingress if they match the labels of an Ingress/Service at the time of pod creation. For a more in-depth description of the issue, see:
https://argoproj.github.io/argo-rollouts/features/traffic-management/alb/#zero-downtime-updates-with-aws-targetgroup-verification.
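To make the timing problem concrete, here is a minimal sketch of the "reachable at pod creation" rule described above. It is a simplified model, not the controller's actual code: the function names, the `app: demo` label, and the hash values `old`/`new` are all hypothetical.

```python
def matches(selector, labels):
    """Return True if every key/value in the selector is present in the labels.

    An empty selector is treated as selecting nothing.
    """
    return bool(selector) and all(labels.get(k) == v for k, v in selector.items())


def gate_injected_at_creation(pod_labels, service_selectors):
    """Simplified model: a gate is injected only if, at the moment the pod is
    created, some Service behind the Ingress already selects the pod."""
    return any(matches(sel, pod_labels) for sel in service_selectors)


# Hypothetical labels for a pod from the new ReplicaSet.
new_pod = {"app": "demo", "rollouts-pod-template-hash": "new"}

# Selector still points at the old hash when the pod is created: no gate.
stale = [{"app": "demo", "rollouts-pod-template-hash": "old"}]
# Selector was switched to the new hash before the pod was created: gate injected.
fresh = [{"app": "demo", "rollouts-pod-template-hash": "new"}]

print(gate_injected_at_creation(new_pod, stale))
print(gate_injected_at_creation(new_pod, fresh))
```

In this model, updating the selector after the pod exists does not help, because the check runs only once, at creation.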
The v1.1 Target Group verification feature was implemented to provide zero-downtime guarantees in the absence of proper readiness gate injection, and it should address any zero-downtime concerns. For cases where that feature is not used, Argo Rollouts could offer a model in which Service selectors are modified in a way that works well with the AWS Load Balancer Controller's implementation of readiness gate injection.
This proposal is for Rollouts to offer a second way of canarying, which alternates traffic between a "ping" service and a "pong" service (both managed and deployed by the user). On every update, the rollout controller would leverage weighted target groups, updating the Ingress annotations to shift production traffic from the ping service to the pong service (and vice versa on the next update). With this approach, readiness gates would be injected properly, because the ping/pong service selectors would only ever be modified before the pods are created.
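The alternation described above could be sketched roughly as follows. This is an illustrative model only, not the Rollouts controller's implementation: the `Service` and `PingPongRollout` classes, their method names, and the hash values are all hypothetical, and the weights stand in for the weighted target groups set through Ingress annotations.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Service:
    name: str
    selector_hash: Optional[str] = None  # pod-template hash this Service selects


@dataclass
class PingPongRollout:
    ping: Service
    pong: Service
    stable: str = "ping"  # name of the Service currently carrying stable traffic
    weights: Dict[str, int] = field(default_factory=lambda: {"ping": 100, "pong": 0})

    def _canary(self) -> Service:
        # The canary is whichever Service is not currently stable.
        return self.pong if self.stable == "ping" else self.ping

    def begin_update(self, new_hash: str) -> None:
        # Point the idle Service at the new hash BEFORE any new pods exist,
        # so the pods already match a Service when they are created.
        self._canary().selector_hash = new_hash

    def set_canary_weight(self, weight: int) -> None:
        # Stand-in for adjusting the weighted target groups on the Ingress.
        canary = self._canary().name
        self.weights = {self.stable: 100 - weight, canary: weight}

    def promote(self) -> None:
        # Send all traffic to the canary, then swap roles. The old stable
        # Service keeps its selector until the next update begins.
        self.set_canary_weight(100)
        self.stable = self._canary().name


rollout = PingPongRollout(Service("ping", "v1"), Service("pong"))
rollout.begin_update("v2")     # pong now selects v2; v2 pods are created afterwards
rollout.set_canary_weight(20)  # 80% ping, 20% pong
rollout.promote()              # pong becomes stable
rollout.begin_update("v3")     # next update reuses ping as the canary
```

The point of the sketch is that neither Service's selector changes while pods matching it are being created, which is what readiness gate injection depends on.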
Some slides that detail the problem and approach to solving it:
https://docs.google.com/presentation/d/1JnvlE-oKL7HPErwFnBBhH2pfWUf0kSoFRLUDt2Glc6E/edit#slide=id.ge7a629063e_1_451
Proposed spec:
I am open to a better name than ping/pong service
Use Cases
When would you use this?
I use AWS Load Balancer Controller with Rollouts' AWS integration, and I want readiness gates to be injected properly.
Message from the maintainers:
Impacted by this bug? Give it a 👍. We prioritize the issues with the most 👍.