connect: deployments should wait for Connect sidecar checks #19334
When a Connect service is registered with Consul, Nomad includes the nested `Connect.SidecarService` field, which carries the health checks for the Envoy proxy. Because these checks are not part of the job spec, the alloc health tracker created by `health_hook` doesn't know to read their status.

In many circumstances this won't be noticed. But if the Envoy health check takes longer to pass than `update.min_healthy_time` (perhaps because that value has been set low), a deployment can progress too early, so that briefly no healthy instances of the service are available in Consul.

Update the Consul service client to find the nested sidecar service in the service catalog and attach it to the results provided to the tracker. The tracker can then check the sidecar health checks.

Fixes: #19269
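As a rough illustration of the approach described above (not the actual Nomad code), the sketch below shows a tracker-side health evaluation that also consults checks from an attached sidecar registration. The types `Check` and `ServiceRegistration`, the `Sidecar` field, and the `Healthy` method are hypothetical names chosen for this example; they do not mirror Nomad's or Consul's real structs.

```go
package main

import "fmt"

// Check is a hypothetical, simplified health check result.
type Check struct {
	Name   string
	Status string // assumed values: "passing", "warning", "critical"
}

// ServiceRegistration is a hypothetical stand-in for what a service
// client might hand to the alloc health tracker. Sidecar is nil when
// the service has no Connect sidecar attached.
type ServiceRegistration struct {
	ID      string
	Checks  []Check
	Sidecar *ServiceRegistration
}

// checksPassing reports whether every check in the slice is passing.
// An empty slice counts as passing.
func checksPassing(checks []Check) bool {
	for _, c := range checks {
		if c.Status != "passing" {
			return false
		}
	}
	return true
}

// Healthy reports whether the service's own checks and, if present,
// the nested sidecar's checks are all passing.
func (s *ServiceRegistration) Healthy() bool {
	if !checksPassing(s.Checks) {
		return false
	}
	if s.Sidecar != nil && !checksPassing(s.Sidecar.Checks) {
		return false
	}
	return true
}

func main() {
	svc := &ServiceRegistration{
		ID:     "web",
		Checks: []Check{{Name: "http", Status: "passing"}},
		Sidecar: &ServiceRegistration{
			ID:     "web-sidecar-proxy",
			Checks: []Check{{Name: "envoy-listener", Status: "critical"}},
		},
	}

	// The service's own check passes, but the sidecar check does not,
	// so the allocation is not yet considered healthy.
	fmt.Println("healthy before sidecar is ready:", svc.Healthy())

	svc.Sidecar.Checks[0].Status = "passing"
	fmt.Println("healthy after sidecar is ready:", svc.Healthy())
}
```

In this sketch, a tracker that ignored `Sidecar` would report the allocation healthy as soon as the `http` check passed, which is the early-progress case the description warns about.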
one lil suggestion, but either way, LGTM!
I'll add the backport labels on this once 1.7.0 GA goes out.
I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions.