feat: writeback rollout updates to informer to prevent stale data #726

Merged
merged 1 commit into argoproj:master
Sep 22, 2020

@jessesuen jessesuen commented Sep 20, 2020

Resolves #720

Currently, the rollout controller may operate on stale information in the informer cache. This can easily happen when a rollout is requeued into the workqueue while it is still in the middle of reconciliation. When this happens, the controller immediately re-reconciles the rollout, but this time it operates on a stale version of the rollout from the cache. The cached copy is stale because the changes saved during the first reconciliation have not yet made the round trip back into the cache.

At best, this simply duplicates work. At worst, it causes incorrect behavior, such as re-pausing after an unpause, as in #720.

Argo Workflows uses a technique that writes the updated resource back into the informer cache. With this technique, even if a rollout is immediately re-reconciled, the controller at least sees the updates saved in the previous reconciliation.
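
To make the race concrete, here is a minimal, self-contained sketch of the idea. It is a toy simulation and not the code in this PR: `Rollout`, `Cache`, `APIServer`, `reconcile`, and `simulate` are all hypothetical names, and the in-memory maps only stand in for the informer cache and the API server. In a real controller the writeback would presumably go through the informer's store (client-go's `cache.Store` interface has an `Update` method).

```go
package main

import "fmt"

// Rollout is a hypothetical, stripped-down stand-in for the real resource.
type Rollout struct {
	Name            string
	ResourceVersion int
	Paused          bool
}

// Cache is a toy stand-in for the informer's local store.
type Cache struct{ items map[string]Rollout }

func NewCache() *Cache { return &Cache{items: map[string]Rollout{}} }

func (c *Cache) Get(name string) Rollout { return c.items[name] }

// Update stores ro unless the cache already holds a newer version.
func (c *Cache) Update(ro Rollout) {
	if cur, ok := c.items[ro.Name]; ok && cur.ResourceVersion > ro.ResourceVersion {
		return
	}
	c.items[ro.Name] = ro
}

// APIServer is a toy stand-in for the cluster's source of truth.
type APIServer struct{ items map[string]Rollout }

func NewAPIServer() *APIServer { return &APIServer{items: map[string]Rollout{}} }

// Update persists ro, bumps its resource version, and returns the saved copy.
func (s *APIServer) Update(ro Rollout) Rollout {
	ro.ResourceVersion = s.items[ro.Name].ResourceVersion + 1
	s.items[ro.Name] = ro
	return ro
}

// reconcile reads the rollout from the cache and pauses it if the cached
// copy is not paused yet. It reports whether it applied a pause.
func reconcile(cache *Cache, server *APIServer, name string, writeback bool) bool {
	ro := cache.Get(name)
	if ro.Paused {
		return false // cache already reflects the pause, nothing to do
	}
	ro.Paused = true
	saved := server.Update(ro)
	if writeback {
		// Write the saved object straight back into the cache instead of
		// waiting for the watch event to make the round trip.
		cache.Update(saved)
	}
	return true
}

// simulate runs two back-to-back reconciliations, with no watch event
// arriving in between, and returns how many times the pause was applied.
func simulate(writeback bool) int {
	server, cache := NewAPIServer(), NewCache()
	cache.Update(server.Update(Rollout{Name: "demo"}))
	pauses := 0
	for i := 0; i < 2; i++ {
		if reconcile(cache, server, "demo", writeback) {
			pauses++
		}
	}
	return pauses
}

func main() {
	fmt.Println("pauses without writeback:", simulate(false)) // 2
	fmt.Println("pauses with writeback:", simulate(true))     // 1
}
```

The version check in `Cache.Update` is a design choice of this sketch: it keeps a late writeback from overwriting a newer object that the watch has already delivered.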

Before this change, TestCanarySetCanaryScale could reproduce the double-pause problem in #720 quite easily. After this fix, I am unable to reproduce the problem.

codecov-commenter commented Sep 20, 2020

Codecov Report

Merging #726 into master will decrease coverage by 0.04%.
The diff coverage is 66.66%.

@@            Coverage Diff             @@
##           master     #726      +/-   ##
==========================================
- Coverage   82.96%   82.91%   -0.05%     
==========================================
  Files          95       95              
  Lines        7912     7926      +14     
==========================================
+ Hits         6564     6572       +8     
- Misses        947      951       +4     
- Partials      401      403       +2     

Impacted Files          Coverage Δ
rollout/context.go      90.00% <ø> (ø)
rollout/controller.go   71.00% <60.00%> (-0.50%) ⬇️
rollout/sync.go         71.23% <100.00%> (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update fa3ddf0...c18ca32.

@khhirani khhirani left a comment

LGTM

@jessesuen jessesuen merged commit a96bbdb into argoproj:master Sep 22, 2020
@jessesuen jessesuen deleted the informer-writeback branch September 22, 2020 02:35

Successfully merging this pull request may close these issues.

Rollout can pause twice for same canary pause step