Is this a BUG REPORT or FEATURE REQUEST? (choose one): BUG REPORT
NGINX Ingress controller version: 0.26.1
Kubernetes version (use kubectl version): v1.13.11-gke.9
Environment:
Cloud provider or hardware configuration: GKE
What happened:
When scaling the Nginx pods from one to 3 or more replicas, the last pod would get stuck in a crash loop (3 replicas would have 2 working pods, and 4 replicas would have 3 working pods). The logs of the faulty pod would show repeated dynamic reconfiguration failures (Post http://127.0.0.1:10246/configuration/backends: connection refused, as in the excerpt further down), followed by the SIGTERM.
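A scale-up like the one described can be done with something along these lines (deployment name taken from the pod listing further down, replica count as described above):
# scale the ingress controller deployment from 1 replica to 3
kubectl scale deployment nginx-ingress-controller-services --replicas=3
# watch the new pods come up; the last one went into a crash loop
kubectl get pods -w | grep nginx-ingress-controller-services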
What you expected to happen:
When scaling the pods, if Nginx fails to pick up /configuration/backends, I would expect either all of the pods to fail or none of them.
How to reproduce it (as minimally and precisely as possible):
I was unable to reproduce the issue; before I could troubleshoot it in detail, the pods started correctly the next morning.
NAME READY STATUS RESTARTS AGE
nginx-ingress-controller-services-5f8777589b-g7fql 1/1 Running 0 80m
nginx-ingress-controller-services-5f8777589b-pdntg 1/1 Running 0 81m
nginx-ingress-controller-services-5f8777589b-xqg2j 1/1 Running 287 18h
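For a pod with a restart count like that, the logs of the previous (crashed) container instance and the recorded restart reason can be pulled with something like:
# logs of the container instance that crashed, not the current one
kubectl logs nginx-ingress-controller-services-5f8777589b-xqg2j --previous
# last state, exit code and restart events for the pod
kubectl describe pod nginx-ingress-controller-services-5f8777589b-xqg2j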
This is the moment the faulty pod (287 restarts) became healthy:
W 2019-11-06T08:55:09.792863Z Dynamic reconfiguration failed: Post http://127.0.0.1:10246/configuration/backends: dial tcp 127.0.0.1:10246: connect: connection refused
E 2019-11-06T08:55:09.792902Z Unexpected failure reconfiguring NGINX:
W 2019-11-06T08:55:09.792912Z requeuing initial-sync, err Post http://127.0.0.1:10246/configuration/backends: dial tcp 127.0.0.1:10246: connect: connection refused
I 2019-11-06T08:55:09.793143Z Configuration changes detected, backend reload required.
I 2019-11-06T08:55:12.791405Z Backend successfully reloaded.
I 2019-11-06T08:55:12.791461Z Initial sync, sleeping for 1 second.
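The failing POST in those lines goes to the controller's local endpoint on 127.0.0.1:10246. A quick way to check whether anything is listening there is to exec into the affected pod and hit the same URL (this assumes curl is available in the controller image; a connection refused here mirrors the error above):
# probe the local configuration endpoint from inside the affected pod
kubectl exec nginx-ingress-controller-services-5f8777589b-xqg2j -- curl -sv http://127.0.0.1:10246/configuration/backends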
No changes had been made to the Nginx deployment during this period, but one of the service backends Nginx was picking up had a faulty docker tag. The time we fixed that tag corresponds to the moment the Nginx pod became healthy.
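A faulty docker tag on a backend usually shows up as image-pull errors on that backend's pods rather than in Nginx itself; something like the following would surface it (the namespace and pod placeholders are hypothetical, substitute the backend's own):
# backend pods stuck pulling the bad tag
kubectl get pods -n <backend-namespace> | grep -Ei 'errimagepull|imagepullbackoff'
# the events show the exact image reference that cannot be pulled
kubectl describe pod <backend-pod> -n <backend-namespace>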
I tried redeploying it with the bad docker tag, but got the Backend successfully reloaded. message and was unable to reproduce the issue.
Anything else we need to know:
Anything else we can view to further diagnose or reproduce this type of issue?