-
Notifications
You must be signed in to change notification settings - Fork 388
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: extremely rare race conditions in retry loop #7789
fix: extremely rare race conditions in retry loop #7789
Conversation
Google Cloud Build Logs
ℹ️ NOTE: Kokoro logs are linked from "Details" below. |
Codecov Report
@@ Coverage Diff @@
## main #7789 +/- ##
==========================================
- Coverage 95.16% 95.15% -0.01%
==========================================
Files 1278 1278
Lines 115102 115114 +12
==========================================
+ Hits 109536 109539 +3
- Misses 5566 5575 +9
Continue to review full report at Codecov.
|
Google Cloud Build Logs
ℹ️ NOTE: Kokoro logs are linked from "Details" below. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Logically, this LGTM. The documentation was helpful for understanding why we tag each operation with the iteration count.
* | ||
* - The initial value of `cancelled_` is false. | ||
* - `Cancel()` is the only operation that changes `cancelled_`, and it holds | ||
* the `mu_` mutex while doing son. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/son/so/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
* starts, as it is their callbacks who start the next step. | ||
* - `StartIteration()` is always sequenced-before calls to `SetPending()`. | ||
* - `SetPending()` never sets the `pending_operation_` to the future for | ||
* an operation that already completed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/that already/that has already/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
* - `StartIteration()` and `SetPending()` both lock the same mutex `mu_` as | ||
* `Cancel()`. | ||
* - It follows that if `Cancel()` is invoked, then the `true` value will be | ||
* visible any future calls to `StartIteration()` or `SetPending()`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/visible/visible to/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
bool cancelled_ = false; | ||
bool done_ = false; | ||
std::uint_fast32_t iteration_ = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TIL.
Do you want: s/fast32/fast64/ to match State::iteration
, SetPending
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, fixed.
* as both `StartAttempt()` and `StartBackoff()` return immediately in this | ||
* case. | ||
* - Note that if `cancelled_` is true, `StartIteration()` terminates the | ||
* retry loop by calling `SetDoneWithCancel()` if `cancelled_` is true. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/`SetDoneWithCancel()` if `cancelled_` is true./`SetDoneWithCancel()`./
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
Google Cloud Build Logs
ℹ️ NOTE: Kokoro logs are linked from "Details" below. |
Google Cloud Build Logs
ℹ️ NOTE: Kokoro logs are linked from "Details" below. |
Fixes #7788
I could not write a unit test that would reliably introduce the race condition.
This change isdata:image/s3,"s3://crabby-images/d0bb7/d0bb7f7625ca5bf5c3cf7a2b7a514cf841ab8395" alt="Reviewable"