Windows container networking is not stable #6093
Comments
Do you have network policies?
We don't have any customized policies. I ran the command and the results are given below; they look like the defaults to me. Please share your thoughts.
NAMESPACE - NAME - POD-SELECTOR
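For anyone following along, a minimal sketch of how such a listing is usually obtained (the exact command was not shown in the thread, so this is an assumption):

```sh
# List all NetworkPolicy objects across namespaces; the output columns
# correspond to the NAMESPACE / NAME / POD-SELECTOR header quoted above.
kubectl get networkpolicy -A
```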
It does not matter if they are default. Any creation or deletion of a pod matching those policies will reset the ACL object in Windows. As a consequence, any TCP session established by any pod included in that ACL will be reset (TCP RST).
But the above-mentioned policies are not selecting the Windows pods. Each policy selects only one pod, and those pods are running on Linux nodes.
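A minimal sketch of how to confirm which pods a given policy actually selects (the policy name, namespace, and the `app=example` label here are placeholders, not values from this cluster):

```sh
# Show the podSelector of a specific NetworkPolicy
kubectl get networkpolicy <policy-name> -n <namespace> -o jsonpath='{.spec.podSelector}'

# List the pods matching that selector, including the node each one runs on
kubectl get pods -n <namespace> -l app=example -o wide
```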
Note that the Windows Air-Gap Install docs say:
Note that the endpoint that the Windows pod communicates with does not ALSO have to be on Windows; the Windows pod may experience a disruption if a pod on a completely different node is recreated and the endpoint IP changes.
This is a defect in the Windows HNS subsystem, not in canal, containerd, kubernetes, or RKE2. Please try flannel as @manuelbuil suggested.
Due to the network policy problem, we introduced Flannel recently.
I missed that line in the docs! Thanks, I'll change it :)
Flannel can be used in production, yes.
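For anyone landing here, a minimal sketch of selecting flannel as the CNI on an RKE2 server, assuming your RKE2 release ships the flannel option (as the comments above indicate) and that this is a fresh install; changing the CNI on an existing cluster is generally not supported:

```sh
# Select flannel instead of the default canal on the server node(s)
echo 'cni: flannel' >> /etc/rancher/rke2/config.yaml

# Restart the server so the CNI choice takes effect (fresh install assumed)
systemctl restart rke2-server
```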
This issue was moved to a discussion.
You can continue the conversation there. Go to discussion →
Environmental Info:
RKE2 Version:
1.29.2
Node(s) CPU architecture, OS, and Version:
4 nodes: 3 Linux (Red Hat 8), 1 Windows (Windows Server 2022)
Cluster Configuration:
3 Linux servers, 1 Windows agent
Describe the bug:
I have multiple deployments, each with an init container and normal containers. The init container calls some Kubernetes APIs to get resource information and also updates some annotations on its own deployment. Each deployment also has a service, and the services communicate with one another. The Linux init containers and the communication between Linux services work fine.
But when the Windows init container calls the Kubernetes API to get resource information, it gets a "connection closed by remote host" error, then crashes and restarts. Sometimes it starts without error; sometimes it hits this error 2 or 3 times and then starts fine.
In the same way, when the normal Windows containers try to access a Linux service, they get a timeout; after 1 or 2 retries the call succeeds.
It looks to me like RKE2 does not have stable Kubernetes connectivity on the Windows nodes.
On Linux we are using Calico.
Steps To Reproduce:
Run an application that updates annotations from an init container on its own running deployment.
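A minimal sketch of the kind of call the init container makes (the deployment name, namespace, and annotation key/value below are hypothetical placeholders, not the reporter's actual workload):

```sh
# From inside the init container, update an annotation on the owning deployment.
# my-deployment, my-namespace, and example.com/state=initialized are placeholders.
kubectl annotate deployment my-deployment example.com/state=initialized \
  -n my-namespace --overwrite
```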
Expected behavior:
Should not see any communication failures in Windows containers.
Actual behavior:
Lots of communication failures in Windows containers.
Additional context / logs: