This repository has been archived by the owner on Sep 26, 2023. It is now read-only.
fix: non-retryable RPCs use totalTimeout #1149
Merged
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This PR essentially reverts #745, thus making non-retryable RPCs reference the
totalTimeout
as the single RPC timeout (which was done in #712), rather than choosing from (intial | max | total).The impetus for #745 was googleapis/google-cloud-java#5555, where a
gax-java
change to usingtotalTimeout
for non-retried RPCs resulted in an API returning validation errors for its specific-client, because that timeout value was too high. The Cloud Tasks library has since had its timeouts lowered to below the backend validation threshold.Prioritizing initial over max over total, based on if the value was set or not, we effectively lowered the timeouts for all non-retryable RPCs to the (historically) lowest of the three config values. This is because historically all three values are always set to something. Service teams might simply change an RPC from retryable to non-retryable not knowing that they have lowered the user-perceived timeout of the method.
Furthermore, in order for a user to change the timeout of a non-retryable RPC, they have to change all three timeout config values because gax-java asserts that
initialTimeout
<=maxTimeout
<=totalTimeout
. Generated documentation in clients shows an example of increasing a method's timeout by changing thetotalTimeout
value alone. This would do nothing for a non-retryable RPC, because theinitialTimeout
value would still be set and chosen. Similarly for service producers, if they wanted to increase the timeout for a non-retryable RPC in the generation config, they'd need to change all three values to satisfy the gax-java assertions, or setinitial
&&maxTimeout
to 0 and settingtotalTimeout
to the desired value.IMO, it is slightly more intuitive to change the
totalTimeout
when a user (at runtime via gax-java api) or service team (at generation time via config) wants to increase the timeout perceived by the GAPIC-caller, rather all three values. I believe this is generated as example documentation in Java GAPICs for the same reason. This also lines up better with other languages that use the method-leveltimeout_millis
config value to specify timeout for non-retryable RPCs.