scaling: fix state store corruption bug for job scaling events #23673
When updating a `JobScalingEvent`, the state store function did not copy the existing object before mutating it. This corrupts the state store because it modifies the leaf node without committing it in a transaction. It can also cause the Nomad server to crash with a "fatal error: concurrent map read and map write" if its `ScalingEvents` map is read via the `ScaleStatus` RPC at the same time as it's being written.
This changeset also removes some mostly-unused public methods on the struct that dangerously encourage you to mutate it outside of a copy.
Ref: https://hashicorp.atlassian.net/browse/NET-10529
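For readers unfamiliar with the copy-before-mutate pattern described above, here is a minimal sketch of the idea. It is not the code from this changeset: the types below are simplified, hypothetical stand-ins, and `withEvent` is an illustrative helper name, not a Nomad function. The point is only that the object already held by the store is never written to; the update is applied to a copy, which a real state store would then insert inside its write transaction.

```go
package main

import "fmt"

// ScalingEvent is a simplified, hypothetical stand-in for one scaling event.
type ScalingEvent struct {
	Message string
}

// JobScalingEvents is a simplified, hypothetical stand-in for the object
// held in the state store, keyed by task group name.
type JobScalingEvents struct {
	JobID         string
	ScalingEvents map[string][]*ScalingEvent
}

// Copy returns a new object whose map and slices are independent of the
// original, so writes to the copy never touch memory a reader may hold.
func (j *JobScalingEvents) Copy() *JobScalingEvents {
	if j == nil {
		return nil
	}
	out := &JobScalingEvents{
		JobID:         j.JobID,
		ScalingEvents: make(map[string][]*ScalingEvent, len(j.ScalingEvents)),
	}
	for group, events := range j.ScalingEvents {
		out.ScalingEvents[group] = append([]*ScalingEvent(nil), events...)
	}
	return out
}

// withEvent (illustrative name) returns an updated copy with ev prepended
// to the group's events. The existing object is left unmodified.
func withEvent(existing *JobScalingEvents, group string, ev *ScalingEvent) *JobScalingEvents {
	updated := existing.Copy()
	if updated == nil {
		updated = &JobScalingEvents{ScalingEvents: map[string][]*ScalingEvent{}}
	}
	updated.ScalingEvents[group] = append([]*ScalingEvent{ev}, updated.ScalingEvents[group]...)
	return updated
}

func main() {
	stored := &JobScalingEvents{
		JobID: "example",
		ScalingEvents: map[string][]*ScalingEvent{
			"web": {{Message: "scaled to 2"}},
		},
	}
	updated := withEvent(stored, "web", &ScalingEvent{Message: "scaled to 3"})

	// The stored object still has one event; only the copy has two.
	fmt.Println(len(stored.ScalingEvents["web"]), len(updated.ScalingEvents["web"]))
}
```

Because concurrent readers only ever see the old object or the fully built copy, a read that overlaps with the update cannot hit a map that is being written.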
LGTM, thanks @tgross!