bug: istio injected access pods fail to pass istio-validation #27

Closed
diranged opened this issue Nov 24, 2022 · 0 comments · Fixed by #35
Labels
bug Something isn't working

Comments

@diranged
Owner

https://istio.slack.com/archives/C37A4KAAD/p1669246550291849

Hey... We're seeing odd behavior when we use a custom in-house controller to spin up a Pod in a Namespace that has Istio injection turned on. Fundamentally, our controller takes a working Deployment, copies out its spec.template.spec, and launches a fresh Pod with that PodSpec. We aren't setting any labels or annotations on the fresh Pod (right now). This works totally fine for plain pods ... but when we try this in istio-injection=enabled namespaces, we see the istio-validation container fail. The errors imply there is something wrong with the node, but we know that isn't the case because plenty of other workloads on those nodes are working fine:

2022-11-23T23:30:24.217367Z	info	Starting iptables validation. This check verifies that iptables rules are properly established for the network.
2022-11-23T23:30:24.217468Z	info	Listening on 127.0.0.1:15001
2022-11-23T23:30:24.217662Z	info	Listening on 127.0.0.1:15006
2022-11-23T23:30:24.217819Z	error	Error connecting to 127.0.0.6:15002: dial tcp 127.0.0.1:0->127.0.0.6:15002: connect: connection refused
2022-11-23T23:30:25.218219Z	error	Error connecting to 127.0.0.6:15002: dial tcp 127.0.0.1:0->127.0.0.6:15002: connect: connection refused
2022-11-23T23:30:26.219418Z	error	Error connecting to 127.0.0.6:15002: dial tcp 127.0.0.1:0->127.0.0.6:15002: connect: connection refused
2022-11-23T23:30:27.219751Z	error	Error connecting to 127.0.0.6:15002: dial tcp 127.0.0.1:0->127.0.0.6:15002: connect: connection refused
2022-11-23T23:30:28.219994Z	error	Error connecting to 127.0.0.6:15002: dial tcp 127.0.0.1:0->127.0.0.6:15002: connect: connection refused
2022-11-23T23:30:29.217968Z	error	iptables validation failed; workload is not ready for Istio.
When using Istio CNI, this can occur if a pod is scheduled before the node is ready.
If installed with 'cni.repair.deletePods=true', this pod should automatically be deleted and retry.
Otherwise, this pod will need to be manually removed so that it is scheduled on a node with istio-cni running, allowing iptables rules to be established.

The istio-cni pod logs look strange too... they claim we don't have the annotation in place, but we have the annotation on the namespace itself:

2022-11-23T23:30:23.837295Z	info	cni	istio-cni cmdAdd with k8s args: {CommonArgs:{IgnoreUnknown:true} IP:<nil> K8S_POD_NAME:diranged-v5njn-9474d53d K8S_POD_NAMESPACE:myns K8S_POD_INFRA_CONTAINER_ID:d30d3de86542b6dfc2a9ff4b32477c9412079c235779b47290685811bafc3f71}
2022-11-23T23:30:23.837349Z	info	cni	Pod myns/diranged-v5njn-9474d53d excluded due to not containing sidecar annotation
diranged added the bug label Nov 24, 2022
diranged added a commit that referenced this issue Nov 28, 2022
Closes #27.

The original code would createOrUpdate the `Pod` resource. The problem
is that we were then overwriting the `metadata.annotations` field on
updates.

The issue we ran into was this...

1. Oz creates the Pod
2. Istio's Webhook Endpoint mutates the Pod Labels and Annotations
3. Oz's secondary reconcile loop immediately comes in and replaces the
   metadata.annotations with the original empty annotations
4. Istio doesn't re-apply the annotations because the metadata.labels
   were mutated and indicate that the webhook has already happened.
5. Istio-validation container won't start up
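
The sequence above comes down to how annotations are written on update. As a rough, stdlib-only sketch (not the actual change merged for this issue), the two behaviors can be compared with plain maps. The helper names `overwriteAnnotations` and `mergeAnnotations` are hypothetical, and `sidecar.istio.io/status` is used only as an example of an annotation added by the injection webhook:

```go
package main

// overwriteAnnotations mirrors the behavior described in step 3: on update,
// the live annotations are discarded and replaced by the controller's own
// (originally empty) set. Hypothetical helper, for illustration only.
func overwriteAnnotations(_ map[string]string, desired map[string]string) map[string]string {
	out := make(map[string]string, len(desired))
	for k, v := range desired {
		out[k] = v
	}
	return out
}

// mergeAnnotations starts from what is currently on the live object
// (including keys added by a mutating webhook) and overlays only the keys
// the controller itself wants to set. Hypothetical helper, for illustration.
func mergeAnnotations(live, desired map[string]string) map[string]string {
	merged := make(map[string]string, len(live)+len(desired))
	for k, v := range live {
		merged[k] = v
	}
	for k, v := range desired {
		merged[k] = v // controller-managed keys win on conflict
	}
	return merged
}
```

With a live Pod carrying a webhook-added annotation and an empty desired set, `overwriteAnnotations` returns an empty map (the annotation is lost), while `mergeAnnotations` keeps it. In a real controller the same idea would apply inside whatever function mutates the Pod before the update call.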