Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Expand OkHttp retry exception predicate #7047

Merged

Conversation

jack-berg
Copy link
Member

Reflects discussion here.

  • Expand to retry on all SocketTimeoutExceptions, instead of looking for specific messages.
  • Expand to retry on UnknownHostException

This brings OkHttp more in line with JdkHttpSender, which retries on any IOException except SSLException.

@jack-berg jack-berg requested a review from a team as a code owner January 27, 2025 16:55
Copy link

codecov bot commented Jan 27, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 89.95%. Comparing base (5e1a397) to head (91cc5ec).
Report is 1 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff              @@
##               main    #7047      +/-   ##
============================================
- Coverage     89.96%   89.95%   -0.01%     
+ Complexity     6663     6661       -2     
============================================
  Files           748      748              
  Lines         20085    20086       +1     
  Branches       1970     1970              
============================================
- Hits          18069    18068       -1     
- Misses         1423     1425       +2     
  Partials        593      593              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@jack-berg jack-berg merged commit 1c1d561 into open-telemetry:main Jan 28, 2025
25 checks passed
@YuriyHolinko
Copy link
Contributor

Nice PR 👍

I just want to add that UnknownHostException is not always transient. e.g if the hostname/DNS does not exist at all, there is no need for retry as it will not help

So could you tell the arguments why adding it to defaults is reasonable ?

@jack-berg
Copy link
Member Author

Yes you're right - the ability for a retry to succeed when encountering a DNS hostname error is situational. Reasons for retrying by default:

  • There are cases where DNS resolution is temporary, which retrying helps. If the export is never going to succeed, then retrying only temporarily delays the time before we log an error. Subsequent exports will also fail for the same reason the retry attempts will fail. And so the user has to see the log error and update configuration to fix. Put another way, retrying on DNS resolution will only help some users, but it won't hurt any users.
  • I chatted with my colleague @alanwest, who is a maintainer for opentelemetry-dotnet. He mentioned that dotnet's OTLP exporters do retry on DNS issues, based on an interpretation of of the OTLP spec. Their read is that DNS resolution issues qualify as the "the client cannot connect to the server", and thus should be retried:

If the client cannot connect to the server, the client SHOULD retry the connection using an exponential backoff strategy between retries. The interval between retries must have a random jitter.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants