fix: non-retryable RPCs use totalTimeout #1149

noahdietz · 2020-07-13T21:49:28Z

This PR essentially reverts #745, thus making non-retryable RPCs reference the totalTimeout as the single RPC timeout (which was done in #712), rather than choosing from (intial | max | total).

The impetus for #745 was googleapis/google-cloud-java#5555, where a gax-java change to using totalTimeout for non-retried RPCs resulted in an API returning validation errors for its specific-client, because that timeout value was too high. The Cloud Tasks library has since had its timeouts lowered to below the backend validation threshold.

Prioritizing initial over max over total, based on if the value was set or not, we effectively lowered the timeouts for all non-retryable RPCs to the (historically) lowest of the three config values. This is because historically all three values are always set to something. Service teams might simply change an RPC from retryable to non-retryable not knowing that they have lowered the user-perceived timeout of the method.

Furthermore, in order for a user to change the timeout of a non-retryable RPC, they have to change all three timeout config values because gax-java asserts that initialTimeout <= maxTimeout <= totalTimeout. Generated documentation in clients shows an example of increasing a method's timeout by changing the totalTimeout value alone. This would do nothing for a non-retryable RPC, because the initialTimeout value would still be set and chosen. Similarly for service producers, if they wanted to increase the timeout for a non-retryable RPC in the generation config, they'd need to change all three values to satisfy the gax-java assertions, or set initial && maxTimeout to 0 and setting totalTimeout to the desired value.

IMO, it is slightly more intuitive to change the totalTimeout when a user (at runtime via gax-java api) or service team (at generation time via config) wants to increase the timeout perceived by the GAPIC-caller, rather all three values. I believe this is generated as example documentation in Java GAPICs for the same reason. This also lines up better with other languages that use the method-level timeout_millis config value to specify timeout for non-retryable RPCs.

codecov · 2020-07-13T21:54:05Z

Codecov Report

Merging #1149 into master will increase coverage by 0.00%.
The diff coverage is 50.00%.

@@            Coverage Diff            @@
##             master    #1149   +/-   ##
=========================================
  Coverage     78.65%   78.66%           
+ Complexity     1170     1168    -2     
=========================================
  Files           204      204           
  Lines          5195     5188    -7     
  Branches        417      415    -2     
=========================================
- Hits           4086     4081    -5     
+ Misses          935      934    -1     
+ Partials        174      173    -1

Impacted Files	Coverage Δ	Complexity Δ
...rc/main/java/com/google/api/gax/rpc/Callables.java	`68.62% <50.00%> (-0.34%)`	`12.00 <0.00> (-2.00)`

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0512a15...4d1d891. Read the comment docs.

noahdietz · 2020-07-17T19:57:41Z

Friendly bump on this PR! Thanks folks.

noahdietz · 2020-07-23T23:18:02Z

Friendly bump here. If folks think this is the wrong approach, I'd appreciate some feedback.

igorbernstein2

LGTM for the overall change.
LGTM from the bigtable side.
My only reservation is that for services where rpc timeout != totaltimeout for nonidempotent rpcs. But I dont have enough knowledge to say if its a real risk

vam-google

LGTM

fix: non-retryable RPCs use totalTimeout

4d1d891

noahdietz requested review from chingor13, miraleung, igorbernstein2 and vam-google July 13, 2020 21:49

googlebot added the cla: yes This human has signed the Contributor License Agreement. label Jul 13, 2020

noahdietz mentioned this pull request Jul 16, 2020

WIP: change timeout-backoff to logical call timeout #1152

Closed

chingor13 approved these changes Jul 23, 2020

View reviewed changes

igorbernstein2 approved these changes Jul 24, 2020

View reviewed changes

vam-google approved these changes Jul 24, 2020

View reviewed changes

noahdietz merged commit b7646a3 into googleapis:master Jul 27, 2020

noahdietz deleted the non-retry-timeout branch July 27, 2020 15:41

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: non-retryable RPCs use totalTimeout #1149

fix: non-retryable RPCs use totalTimeout #1149

noahdietz commented Jul 13, 2020 •

edited

Loading

codecov bot commented Jul 13, 2020 •

edited

Loading

noahdietz commented Jul 17, 2020

noahdietz commented Jul 23, 2020

igorbernstein2 left a comment

vam-google left a comment

fix: non-retryable RPCs use totalTimeout #1149

fix: non-retryable RPCs use totalTimeout #1149

Conversation

noahdietz commented Jul 13, 2020 • edited Loading

codecov bot commented Jul 13, 2020 • edited Loading

Codecov Report

noahdietz commented Jul 17, 2020

noahdietz commented Jul 23, 2020

igorbernstein2 left a comment

Choose a reason for hiding this comment

vam-google left a comment

Choose a reason for hiding this comment

noahdietz commented Jul 13, 2020 •

edited

Loading

codecov bot commented Jul 13, 2020 •

edited

Loading