Is your feature request related to a problem? Please describe.
We use grpcio 1.35.0 in Python 3.7.7 on Macs and Linux.
At my company, we have a longstanding bug where a server will randomly lose its gRPC connection to an upstream server and then take a long time to reconnect. I had some time to look into it today. I experimented with the keepalive settings, but then I found the reconnect backoff settings. I discovered that the default reconnect timeout is 20 seconds - a very long time for our use case!
I found https://github.com/grpc/grpc/blob/master/doc/connection-backoff.md, which describes the algorithm you use for reconnect backoff. But I can't figure out how to configure all of the variables in the algorithm.
According to the docs, I can set the following channel options:
grpc.initial_reconnect_backoff_ms
grpc.min_reconnect_backoff_ms
grpc.max_reconnect_backoff_ms
But that doesn't let me set MIN_CONNECT_TIMEOUT, MULTIPLIER, or JITTER. These are all important, but MIN_CONNECT_TIMEOUT matters most: with the default of 20 seconds, the server is unavailable for 20 seconds, even though a retry would most likely succeed right away.
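For reference, a minimal sketch of passing these options when creating a channel (the target address and millisecond values are placeholders, not recommendations):

```python
import grpc

# Illustrative values only; the target below is a placeholder.
options = [
    ("grpc.initial_reconnect_backoff_ms", 100),
    ("grpc.min_reconnect_backoff_ms", 100),
    ("grpc.max_reconnect_backoff_ms", 5000),
]
channel = grpc.insecure_channel("upstream.example.com:50051", options=options)
```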
Describe the solution you'd like
It would be easier to fix both this bug and similar bugs in the future if you let us control all the variables used in the algorithm directly. For now, I think I found a solution (see below), but it isn't ideal.
Describe alternatives you've considered
After doing some testing (by connecting to a non-existent server in a loop and seeing how long requests blocked before failing), I found that if I set grpc.initial_reconnect_backoff_ms, that also sets the initial backoff timeout. This is unexpected, but I have a way forward for fixing the bug for now.
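A rough sketch of that kind of test (the port, method path, and option value here are placeholders; the loop just times how long each call blocks before failing):

```python
import time

import grpc

# Point a channel at a port where nothing is listening, set the initial reconnect
# backoff, and time how long each call blocks before it fails.
options = [("grpc.initial_reconnect_backoff_ms", 200)]
channel = grpc.insecure_channel("localhost:59999", options=options)
call = channel.unary_unary("/test.Service/Method")

for _ in range(5):
    start = time.monotonic()
    try:
        call(b"", timeout=30)
    except grpc.RpcError as err:
        print(f"call failed after {time.monotonic() - start:.2f}s: {err.code()}")
```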
I'm not sure what the issue here is. GRPC_ARG_INITIAL_RECONNECT_BACKOFF_MS, i.e. "grpc.initial_reconnect_backoff_ms", corresponds to INITIAL_BACKOFF from the algorithm.
The other configurable args are:
GRPC_ARG_MIN_RECONNECT_BACKOFF_MS ("grpc.min_reconnect_backoff_ms"), which corresponds to MIN_CONNECT_TIMEOUT
GRPC_ARG_MAX_RECONNECT_BACKOFF_MS ("grpc.max_reconnect_backoff_ms"), which corresponds to MAX_BACKOFF
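In other words, a sketch of that mapping as channel options (the values shown are meant to match the defaults listed in connection-backoff.md, expressed in milliseconds; adjust to taste):

```python
options = [
    ("grpc.initial_reconnect_backoff_ms", 1000),  # INITIAL_BACKOFF (default 1 second)
    ("grpc.min_reconnect_backoff_ms", 20000),     # MIN_CONNECT_TIMEOUT (default 20 seconds)
    ("grpc.max_reconnect_backoff_ms", 120000),    # MAX_BACKOFF (default 120 seconds)
]
```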