Tight container limits may cause "read init-p: connection reset by peer" #1914
Comments
[#159069922] Submodule src/code.cloudfoundry.org/garden-integration-tests 289117c..e2d8121: > Increase pids limits to fix connection reset flake See opencontainers/runc#1914
Hi @danail-branekov, I'm trying to get a better understanding of the series of actions that leads to this error. Is this happening because goroutines are assigned their own PIDs?
Hi @kkallday, as noted above, if you artificially wait some time for the Go runtime goroutines to finish their initialisation and complete, then the issue is gone, as the user process no longer races with the runtime's thread creation.
@danail-branekov gotcha - I didn't know that TIDs count towards the pids limit. One more question: when you … I'm new here - trying to get a better understanding of the project. Thanks in advance! 😄
Well, yes, sleeping is just a hack/workaround to prove that we get the error because the Go runtime's startup threads count against the tight pids limit.
Ideally we would know when we should contain the process with a liveness check of the Go runtime, but that's not really doable (you could try to do it by doing even more synchronisation -- but that's what #1916 will do implicitly).
Steps to reproduce:

```
runc create <id>
echo 1 > /sys/fs/cgroup/pids/.../<id>/pids.max
runc exec <id> /bin/echo hi
```

The following error occurs: `read init-p: connection reset by peer`. After some debugging we found out what causes this error: `runc init` is a Go binary, the Go runtime spawns several OS threads at startup, and those threads count towards `pids.max`, so with a limit of 1 the process fails when it is joined to the cgroup before the runtime has settled.
In order to prove that, we added a sleep of 100ms before the process is joined to the cgroup, and this significantly reduced the failure rate. Removing the code that joins the cgroup "fixed" it entirely.
We realise that such a tight limit has quite limited practical use, but we wanted to share the knowledge with the community. We believe that this error may also occur when exceeding any container cgroup limit (e.g. memory, CPU, pids).
Cheers, CF Garden Team