-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
acceptance: make docker more resilient to timeout in ContainerStart #17800
Conversation
Seems worthwhile to me; thanks for doing this. Reviewed 1 of 1 files at r1. pkg/acceptance/cluster/docker.go, line 324 at r1 (raw file):
help me understand this unusual use of context:
after re-reading the code, I think maybe you just want to Comments from Reviewable |
Review status: all files reviewed at latest revision, 1 unresolved discussion, some commit checks failed. pkg/acceptance/cluster/docker.go, line 324 at r1 (raw file): Previously, tamird (Tamir Duberstein) wrote…
As you've pointed out this code was total 💩, rewritten! Comments from Reviewable |
553a3ff
to
7ffe1b4
Compare
Reviewed 1 of 1 files at r2. pkg/acceptance/cluster/docker.go, line 332 at r2 (raw file):
remove this empty line? pkg/acceptance/cluster/docker.go, line 335 at r2 (raw file):
i think you can remove this clause:
Comments from Reviewable |
Docker likes to never respond to us, and we do not usually have cancellations on the context (which would not help, after all, that would just fail the test right there). Instead, try a few times. The problem looks similar to golang/go#16060 golang/go#5103 Another possibility mentioned in usergroups is that some file descriptor limit is hit. Since I've never seen this locally, perhaps that's the case on our agent machines. Unfortunately, those are hard to SSH into. This may not be a good idea (after all, perhaps `Start()` succeeded) and we'd have to do something similar for `ContainerWait`. But, at least it should give us an additional data point: do the retries also just block? Is the container actually started when we retry?
TFTR! Review status: all files reviewed at latest revision, 2 unresolved discussions, all commit checks successful. pkg/acceptance/cluster/docker.go, line 332 at r2 (raw file): Previously, tamird (Tamir Duberstein) wrote…
Done. pkg/acceptance/cluster/docker.go, line 335 at r2 (raw file): Previously, tamird (Tamir Duberstein) wrote…
Done. Comments from Reviewable |
Reviewed 1 of 1 files at r3. Comments from Reviewable |
Docker likes to never respond to us, and we do not usually have cancellations
on the context (which would not help, after all, that would just fail the test
right there). Instead, try a few times.
The problem looks similar to
golang/go#16060
golang/go#5103
Another possibility mentioned in usergroups is that some file descriptor limit
is hit. Since I've never seen this locally, perhaps that's the case on our
agent machines. Unfortunately, those are hard to SSH into.
This may not be a good idea (after all, perhaps
Start()
succeeded) and we'dhave to do something similar for
ContainerWait
. But, at least it shouldgive us an additional data point: do the retries also just block? Is the
container actually started when we retry?