Create network bridge and iptables chains at startup #6618
Comments
I think I am running into this (I hope the logs and the job files prove helpful; sorry if this is a completely different issue):
Nomad version
Nomad v0.11.1 (b434570)
Operating system and Environment details
Debian 10
Issue
After rapid submission of two jobs that use groups (and, in one instance, connect), setup fails for the job with the connect stanza (though that might have been pure luck due to ordering etc.):
Reproduction steps
Not sure, I just submitted two job files (see below).
Job file (if appropriate)
File 1:
File 2:
@nickethier Did you make any progress in this area? I rebooted a node today (without draining first) and it resulted in different weird errors (up to segfaults in nftables/iptables). Sadly I do not have logs of the previous reboots -- will see if I can gather more the next time.
Related #12103
In #6567 (comment) we encountered a case where concurrency issues in the CNI plugins caused allocation failures for Connect-enabled jobs. There was a similar one fixed in containernetworking/plugins#366
While we should and will help patch upstream, it might improve the user experience and reduce Nomad bug reports if we were to create the network bridge and iptables chains we need for Connect-enabled jobs at client startup, rather than waiting for a job allocation. This includes:
- the nomad bridge network
- the CNI-HOSTPORT-SETMARK chain
- the CNI-HOSTPORT-SETMASQ chain
- the CNI-HOSTPORT-DNAT chain
- the CNI-FORWARD chain
- the NOMAD-ADMIN chain
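As a rough illustration of the chain half of this proposal (bridge creation is left out), here is a minimal Go sketch of an idempotent "ensure these chains exist" step that a client could run once at startup. Everything in it is hypothetical and not taken from Nomad's code: the `chainManager` interface, the in-memory `memChains` stand-in, the `ensureChains` helper, and the table assigned to each chain are all assumptions made for the example. A real implementation would back the interface with the iptables CLI or a library, and would need root privileges.

```go
package main

import "fmt"

// chainManager is a hypothetical abstraction over the two iptables
// operations this sketch needs. It is not a Nomad or CNI interface.
type chainManager interface {
	ChainExists(table, chain string) (bool, error)
	NewChain(table, chain string) error
}

// chainSpec names a chain and the table it lives in.
type chainSpec struct {
	Table string
	Chain string
}

// connectChains lists the chains named in this issue. The table for
// each chain is an assumption made for illustration only.
var connectChains = []chainSpec{
	{"nat", "CNI-HOSTPORT-SETMARK"},
	{"nat", "CNI-HOSTPORT-SETMASQ"},
	{"nat", "CNI-HOSTPORT-DNAT"},
	{"filter", "CNI-FORWARD"},
	{"filter", "NOMAD-ADMIN"},
}

// ensureChains creates every chain in specs that does not exist yet and
// returns the names of the chains it created. Running it again is a no-op.
func ensureChains(m chainManager, specs []chainSpec) ([]string, error) {
	var created []string
	for _, s := range specs {
		exists, err := m.ChainExists(s.Table, s.Chain)
		if err != nil {
			return created, fmt.Errorf("checking %s/%s: %w", s.Table, s.Chain, err)
		}
		if exists {
			continue
		}
		if err := m.NewChain(s.Table, s.Chain); err != nil {
			return created, fmt.Errorf("creating %s/%s: %w", s.Table, s.Chain, err)
		}
		created = append(created, s.Chain)
	}
	return created, nil
}

// memChains is an in-memory stand-in for iptables so the sketch can run
// without root. Keys have the form "table/chain".
type memChains map[string]bool

func (m memChains) ChainExists(table, chain string) (bool, error) {
	return m[table+"/"+chain], nil
}

func (m memChains) NewChain(table, chain string) error {
	k := table + "/" + chain
	if m[k] {
		return fmt.Errorf("chain %s already exists", k)
	}
	m[k] = true
	return nil
}

func main() {
	state := memChains{}
	created, err := ensureChains(state, connectChains)
	if err != nil {
		fmt.Println("setup failed:", err)
		return
	}
	fmt.Println("created on first run:", created)

	created, _ = ensureChains(state, connectChains)
	fmt.Println("created on second run:", created)
}
```

The check-then-create sequence is not safe against other processes touching the same chains concurrently. The point of running it at client startup, before any allocation invokes the CNI plugins, is that nothing else should be racing with it at that time.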
I'm not sure we have a great place to do this work on startup, but maybe @nickethier @shoenig or @schmichael have an idea?