Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't lie about retrying in when jitter is enabled #18

Merged
merged 1 commit into from
Jan 14, 2025

Conversation

moskyb
Copy link
Collaborator

@moskyb moskyb commented Jan 13, 2025

The roko.Retrier's String() method returns a message about the state of the retry loop, generally something of the form Attempt 3/19 Retrying in 6s. Previously, when jitter was enabled for a retrier, the call to NextInterval() within the String() method and the actual retry loop itself would be totally independent, and so if random jitter was enabled, they'd return different results - ie, String() would say Attempt 3/19 Retrying in 5.42038947s, but the retry loop would calculate a totally different value for jitter, and wait a longer or shorter amount of time.

This wasn't so bad when jitter was limited to one second - String() might say nest attempt in 5.01s, and then the retrier would actually sleep for 5.99s, but what's a second or two between friends, right?

Unfortunately, in #17, some jabroni (me) added the ability for users to customise the range that jitter falls within, so it's now possible for String() to lie fairly significantly about the amount of time it's going to wait till the next request. If we configured a jitter range of [-10s, 30s] for our jitter range, the next interval reported by String() could be off by as much as 40 seconds, which is pretty bad.

This PR fixes this issue by refactoring roko.Retrier a little bit - now, at the start of the retry loop - before the callback is called - and stored in the state of the retrier. This means that any accesses to the next interval time will be consistent, even with jitter enabled.

The `roko.Retrier`'s `String()` method returns a message about the state of the retry loop, generally something of the form `Attempt 3/19 Retrying in 6s`. Previously, when jitter was enabled for a retrier, the call to `NextInterval()` within the `String()` method and the actual retry loop itself would be totally independent, and so if random jitter was enabled, they'd return different results - ie, `String()` would say `Attempt 3/19 Retrying in 5.42038947s`, but the retry loop would calculate a totally different value for jitter, and wait a longer or shorter amount of time.

This wasn't so bad when jitter was limited to one second - `String()` might say `nest attempt in 5.01s`, and then the retrier would actually sleep for `5.99s`, but what's a second or two between friends, right?

Unfortunately, in [#17](#17), some jabroni (me) added the ability for users to customise the range that jitter falls within, so it's now possible for `String()` to lie fairly significantly about the amount of time it's going to wait till the next request. If we configured a jitter range of [-10s, 30s] for our jitter range, the next interval reported by `String()` could be off by as much as 40 seconds, which is pretty bad.

This PR fixes this issue by refactoring `roko.Retrier` a little bit - now, at the start of the retry loop - before the callback is called - and stored in the state of the retrier. This means that any accesses to the next interval time will be consistent, even with jitter enabled.
@moskyb moskyb requested a review from a team January 13, 2025 01:12
Copy link

@CerealBoy CerealBoy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@moskyb moskyb merged commit 0f84552 into main Jan 14, 2025
1 check passed
@moskyb moskyb deleted the dont-lie-about-retrying-in-time-jitter branch January 14, 2025 04:27
@moskyb moskyb mentioned this pull request Jan 14, 2025
2 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants