premature "transport is closing" when using keepalive #3171
https://godoc.org/google.golang.org/grpc/keepalive#ClientParameters warns "Make sure these parameters are set in coordination with the keepalive policy on the server, as incompatible settings can result in closing of connection." Does that mean that merely the timeout parameters must match (and that the default ones are fine), or does it mean that the server also must enable https://godoc.org/google.golang.org/grpc#KeepaliveParams when the client does? I think it is the former as the latter doesn't make a difference in practice (at least not for this problem), but the wording IMHO is ambiguous.
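For reference, enabling keepalive on the client side means dialing with grpc.WithKeepaliveParams; a minimal sketch, with illustrative values rather than the exact settings used in this report:

```go
package main

import (
	"log"
	"time"

	"google.golang.org/grpc"
	"google.golang.org/grpc/keepalive"
)

func main() {
	// Illustrative values only, not the settings used in this issue.
	conn, err := grpc.Dial("localhost:50051",
		grpc.WithInsecure(),
		grpc.WithKeepaliveParams(keepalive.ClientParameters{
			Time:                10 * time.Second, // ping the server after 10s of inactivity
			Timeout:             20 * time.Second, // close the connection if a ping is not acknowledged within 20s
			PermitWithoutStream: true,             // also send pings while there are no active RPCs
		}),
	)
	if err != nil {
		log.Fatalf("dial failed: %v", err)
	}
	defer conn.Close()
}
```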
You need to set an EnforcementPolicy on the server side. The default EnforcementPolicy expects that the client sends no more than one keepalive ping every 5 minutes, which is clearly violated in the example you have. If you enable debugging on your server as mentioned here, you should see the corresponding error logged on your server.
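For example, a server-side enforcement policy along the following lines relaxes that default; the values are illustrative, and MinTime should be no larger than the keepalive Time configured on the client:

```go
package main

import (
	"log"
	"net"
	"time"

	"google.golang.org/grpc"
	"google.golang.org/grpc/keepalive"
)

func main() {
	lis, err := net.Listen("tcp", ":50051")
	if err != nil {
		log.Fatalf("listen failed: %v", err)
	}
	srv := grpc.NewServer(
		// Allow clients to ping as often as every 5 seconds, even when there are
		// no active RPCs. If MinTime is larger than the client's keepalive Time,
		// the server closes the connection for sending too many pings.
		grpc.KeepaliveEnforcementPolicy(keepalive.EnforcementPolicy{
			MinTime:             5 * time.Second,
			PermitWithoutStream: true,
		}),
	)
	if err := srv.Serve(lis); err != nil {
		log.Fatalf("serve failed: %v", err)
	}
}
```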
And once you set an enforcement policy on your server to match the expected rate of keepalive pings from your client, your stream should not be broken. HTH. Thanks for the detailed steps for reproduction.
I do indeed see that error. I found a combination of client and server keepalive settings that works.
I find it surprising that the defaults on the client and server side don't match, i.e. they lead to client behavior that the server doesn't accept. Perhaps it's worthwhile to enhance the warning in the documentation. Other than that, this issue can be closed as the code works as intended.
The gRPC defaults for a client do not work for a gRPC server that also uses the defaults (grpc/grpc-go#3171), which caused connection drops when a CSI driver is slow to respond and the client starts sending keepalive pings too quickly (kubernetes-csi#238). The sanity package no longer uses gRPC keepalive. Instead, custom test suites can enable that via some new configuration options if needed and when it is certain that the CSI driver under test can handle it.
The client and server keepalives are considered sufficiently different to deserve separate designs; you can find the corresponding proposals in the gRPC proposal repository. And the reason why you are seeing the transport being closed on the client side after 30 seconds is not because of the sum of the default keepalive timeout values.
What version of gRPC are you using?
Master = 51ac07f
What version of Go are you using (go version)?
go version go1.13.4 linux/amd64
What operating system (Linux, Windows, …) and version?
Linux 5.2.0-0.bpo.2-amd64 (Debian Buster)
What did you do?
When enabling keepalive on the client side and running that client against a server that has a very long-running operation (>200s), the client fails with "transport is closing". This can be avoided by not enabling keepalive on the client side.
The server ran without any special options. Adding KeepaliveParameters had no effect. This can be reproduced by patching the helloworld example:
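The original patch isn't included here; a hypothetical sketch of its shape is a helloworld handler that blocks for more than 200 seconds, with the client dialing with keepalive enabled as in the earlier sketch:

```go
// Hypothetical sketch, not the original patch. It assumes the generated types
// from google.golang.org/grpc/examples/helloworld/helloworld (imported as pb),
// the example's server struct, and the "time" and "context" imports.
func (s *server) SayHello(ctx context.Context, in *pb.HelloRequest) (*pb.HelloReply, error) {
	time.Sleep(210 * time.Second) // simulate a very long-running operation (>200s)
	return &pb.HelloReply{Message: "Hello " + in.GetName()}, nil
}
```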
What did you expect to see?
The client should have returned without error after 200s.
What did you see instead?
"transport is closing" error.