net/http: infinite hang on dial #16060
I think there is something wrong with your go install. GOOS is set to
GOHOSTOS != GOOS - it's cross compiled, so the embedded source paths will be Linux-like? EDIT: Ah, see comment below - looks like the OP didn't paste a clean go env.
I think that is just an artefact of the local environment from when we were cross-compiling for Windows - however, the binary in question was compiled for freebsd.
Can you please provide the source code to a program that demonstrates the problem?
We've not been able to create a simple test case for this as yet, unfortunately. It seems like there's either a missing error check or a race condition during dial. I'm not sure how easy it will be to create a test case, as I believe it needs a network failure to trigger.
It's possible we may be seeing something related to #14548. For reference, the connections will be TLS connections.
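(Aside: since the hang is in dial and the connections are TLS, one client-side mitigation is to bound both the dial and the TLS handshake so a wedged connection surfaces as an error instead of blocking forever. The sketch below is illustrative only; the helper name, timeout values, and URL are assumptions, not taken from the application in this report.)

```go
package main

import (
	"log"
	"net"
	"net/http"
	"time"
)

// newBoundedClient returns an http.Client whose dial and TLS handshake are
// both time-limited. Hypothetical helper; the durations are examples.
func newBoundedClient() *http.Client {
	return &http.Client{
		Transport: &http.Transport{
			// Bound the TCP connect; a dial that never completes
			// now fails instead of parking the goroutine forever.
			Dial: (&net.Dialer{
				Timeout:   30 * time.Second,
				KeepAlive: 30 * time.Second,
			}).Dial,
			// Bound the TLS handshake separately, since the
			// connections in question are TLS.
			TLSHandshakeTimeout: 10 * time.Second,
		},
		// Overall cap on the whole request as a last resort.
		Timeout: 2 * time.Minute,
	}
}

func main() {
	c := newBoundedClient()
	resp, err := c.Get("https://example.com/") // placeholder URL
	if err != nil {
		log.Fatal(err)
	}
	resp.Body.Close()
}
```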
The bug report says:
But that's not what the provided goroutines say. Those three goroutines could just be an idle TCP connection owned by an http.Transport. It's been 33 minutes since that connection was used. I don't see enough to work on here. There's not enough code (no code) or goroutine context to help debug. I'm going to close this until there's something more we can help with.
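(Aside: if those goroutines really are just an idle keep-alive connection held by the Transport, forcing the idle pool closed should make them exit, which is one way to tell an idle connection apart from a goroutine genuinely stuck in dial. Minimal sketch; the URL and timings are placeholders.)

```go
package main

import (
	"fmt"
	"io"
	"io/ioutil"
	"net/http"
	"runtime"
	"time"
)

func main() {
	tr := &http.Transport{}
	client := &http.Client{Transport: tr}

	// One successful request leaves a keep-alive connection (and its
	// reader/writer goroutines) cached in the Transport, provided the
	// body is drained before being closed.
	if resp, err := client.Get("https://example.com/"); err == nil {
		io.Copy(ioutil.Discard, resp.Body)
		resp.Body.Close()
	}
	fmt.Println("goroutines while idle conn is cached:", runtime.NumGoroutine())

	// Closing the idle pool lets those goroutines exit shortly after; a
	// goroutine actually wedged in a dial would not be affected.
	tr.CloseIdleConnections()
	time.Sleep(200 * time.Millisecond)
	fmt.Println("goroutines after CloseIdleConnections:", runtime.NumGoroutine())
}
```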
Unfortunately that's the problem though: I could have added the sockstat output for the app at the time, but it was empty, as there was no OS-level socket present, hence Go was waiting for something that it was never going to get. I totally admit that without a reproduction case this is likely going to be impossible to narrow down, though, so if we do manage to come up with that I'll reopen.
Can you reproduce it yourself? Only on FreeBSD? Only on Go 1.6.2, or also Go 1.7? If possible, post a system call trace of it happening. Probably best to open a new bug, since comments on closed bugs are often lost.
We've only seen it on FreeBSD, and we run the app on FreeBSD, Linux (a small subset) and Windows. We've only seen it once, and that was on 1.6.2; we'll raise a new bug when we reproduce it.
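(Aside: if it does reproduce again, one low-effort way to capture extra context for a new bug, alongside a system call trace of the process (e.g. truss on FreeBSD), is to dump every goroutine's stack on demand. Sketch under the assumption that a Unix signal is an acceptable trigger; SIGUSR1 is only an example and is not available on Windows.)

```go
package main

import (
	"os"
	"os/signal"
	"runtime/pprof"
	"syscall"
)

// dumpGoroutinesOnSignal writes every goroutine's stack to stderr whenever
// the process receives SIGUSR1, without stopping the program. Hypothetical
// helper; call it once early in main.
func dumpGoroutinesOnSignal() {
	ch := make(chan os.Signal, 1)
	signal.Notify(ch, syscall.SIGUSR1)
	go func() {
		for range ch {
			pprof.Lookup("goroutine").WriteTo(os.Stderr, 2)
		}
	}()
}

func main() {
	dumpGoroutinesOnSignal()
	select {} // stand-in for the real application's work
}
```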
@stevenh @bradfitz
Docker likes to never respond to us, and we do not usually have cancellations on the context (which would not help; after all, that would just fail the test right there). Instead, try a few times. The problem looks similar to golang/go#16060 and golang/go#5103. Another possibility mentioned in usergroups is that some file descriptor limit is hit. Since I've never seen this locally, perhaps that's the case on our agent machines. Unfortunately, those are hard to SSH into. This may not be a good idea (after all, perhaps `Start()` succeeded) and we'd have to do something similar for `ContainerWait`. But at least it should give us an additional data point: do the retries also just block? Is the container actually started when we retry?
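(Aside on the retry approach in that commit message: a generic shape for it is sketched below. This is an illustration, not the test harness's actual code; the helper name, attempt count, and per-attempt timeout are all assumptions, and the per-attempt deadline only helps if the call underneath actually honours its context.)

```go
package main

import (
	"context"
	"errors"
	"log"
	"time"
)

// startWithRetries runs start a few times, giving each attempt its own
// deadline so one hung round trip cannot block the caller forever.
// Hypothetical helper; 3 attempts and 30s per attempt are examples.
func startWithRetries(parent context.Context, start func(context.Context) error) error {
	var err error
	for attempt := 1; attempt <= 3; attempt++ {
		ctx, cancel := context.WithTimeout(parent, 30*time.Second)
		err = start(ctx)
		cancel()
		if err == nil {
			return nil
		}
		log.Printf("attempt %d failed: %v", attempt, err)
	}
	return err
}

func main() {
	err := startWithRetries(context.Background(), func(ctx context.Context) error {
		// Stand-in for the real call, e.g. a container start request
		// that accepts a context.
		return errors.New("placeholder failure")
	})
	log.Println("final result:", err)
}
```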
Please answer these questions before submitting your issue. Thanks!
What version of Go are you using (go version)?
1.6.2

What operating system and processor architecture are you using (go env)?
GOARCH="amd64"
GOBIN="/data/go/bin"
GOEXE=".exe"
GOHOSTARCH="amd64"
GOHOSTOS="freebsd"
GOOS="freebsd"
GOPATH="/data/go"
GORACE=""
GOROOT="/usr/local/go"
GOTOOLDIR="/usr/local/go/pkg/tool/freebsd_amd64"
GO15VENDOREXPERIMENT="1"
CC="clang"
GOGCCFLAGS="-m64 -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0"
CXX="clang++"
CGO_ENABLED="0"
What did you do?
Attempted an HTTP request using http.Client.

What did you expect to see?
Success or failure of the request.

What did you see instead?
The request hung indefinitely, still attempting to "dial", but with no active OS socket.
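(For reference, the request described above boils down to something like the sketch below; the URL is a placeholder. With no Client.Timeout set, a Get whose connection setup wedges without ever returning an error simply never returns, which matches the behaviour reported.)

```go
package main

import (
	"fmt"
	"io/ioutil"
	"net/http"
)

func main() {
	// Default client: no Client.Timeout; requests go through
	// http.DefaultTransport.
	client := &http.Client{}

	// If connection setup underneath this call wedges without returning
	// an error, Get blocks and the goroutine stays parked in dial, as
	// described above.
	resp, err := client.Get("https://example.com/") // placeholder URL
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	defer resp.Body.Close()

	body, err := ioutil.ReadAll(resp.Body)
	if err != nil {
		fmt.Println("read failed:", err)
		return
	}
	fmt.Println(len(body), "bytes")
}
```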
Here are traces from relevant goroutines: