[noderesourcetopology] update NRT when attributes change #631
Conversation
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: ffromani
An upcoming change in the overreserve cache wants to consume the nodeconfig utilities, so in order to enable both packages to consume this code, we move it into its own package. Trivial code movement with only the necessary interface changes (renames, public <-> private symbols). Signed-off-by: Francesco Romani <[email protected]>
the integration test is pretty much done, but there are known issues I'm investigating.
/cc @Tal-or
Thank you for working on this.
This new functionality is opt-in in order to support backward compatibility.
But IIUC, the current behavior might cause wrong scheduling decisions; would it be wise to keep the default ("buggy") behavior as an option at all?
One of the key assumptions we took when designing the NodeResourceTopology (NRT) plugin is that the kubelet configuration of the worker node changes *very* rarely, if at all, during the cluster lifetime. As a rule of thumb, it was expected to change with a frequency of roughly once every quarter (3 months), and likely less often. So the chance of it changing during a scheduling cycle was deemed extremely low.

However, we fail to notice kubelet configuration changes (the bits we care about, reported in the NRT data) and only update them incidentally when resyncing the cache. These updates are expected to be rare, but failing to notice them is a much worse issue: with out-of-date configuration information, the scheduler plugin will make wrong decisions until restarted. Up until now, the mitigation was to restart the scheduler once the kubelet config changed; this works, but it is impractical and requires more orchestration.

We add the option to resync the NRT data when the attributes change (and nothing else did) to overcome this limitation. In the current framework, because of how the controller-runtime client works, this requires a new connection to the apiserver to watch the changes and react to them, adding the node to the set of those being resynced in the next resync iteration. Even considering HA scheduler scenarios, this will cause a very limited number of new connections and should not raise any scalability concern. Nevertheless, we make the feature opt-in. Signed-off-by: Francesco Romani <[email protected]>
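The watch-and-mark mechanism described above can be sketched roughly as below. This is a minimal, self-contained illustration, not the plugin's actual code: the `Attribute`/`NodeResourceTopology` types are simplified stand-ins for the real NRT CRD types, and `onNRTUpdate`/`dirtyNodes` are hypothetical names for the update handler and the set of nodes queued for the next resync iteration.

```go
package main

import "fmt"

// Attribute is a simplified stand-in for an NRT attribute (name/value pair).
type Attribute struct {
	Name  string
	Value string
}

// NodeResourceTopology is a simplified stand-in for the NRT object.
type NodeResourceTopology struct {
	Node       string
	Attributes []Attribute
}

// attributesChanged reports whether the attributes differ between two
// observed versions of the same NRT object.
func attributesChanged(old, cur NodeResourceTopology) bool {
	if len(old.Attributes) != len(cur.Attributes) {
		return true
	}
	seen := make(map[string]string, len(old.Attributes))
	for _, a := range old.Attributes {
		seen[a.Name] = a.Value
	}
	for _, a := range cur.Attributes {
		if v, ok := seen[a.Name]; !ok || v != a.Value {
			return true
		}
	}
	return false
}

// dirtyNodes collects nodes whose NRT data must be refreshed on the
// next cache resync iteration (hypothetical name).
var dirtyNodes = map[string]bool{}

// onNRTUpdate would be invoked by the watch on NRT update events; only
// attribute changes mark the node for resync.
func onNRTUpdate(old, cur NodeResourceTopology) {
	if attributesChanged(old, cur) {
		dirtyNodes[cur.Node] = true
	}
}

func main() {
	old := NodeResourceTopology{Node: "node-0", Attributes: []Attribute{{"topologyManagerPolicy", "single-numa-node"}}}
	cur := NodeResourceTopology{Node: "node-0", Attributes: []Attribute{{"topologyManagerPolicy", "restricted"}}}
	onNRTUpdate(old, cur)
	fmt.Println(dirtyNodes["node-0"]) // true
}
```

Because only attribute changes (and nothing else) trigger the marking, routine NRT updates that merely refresh resource counters do not cause extra resync work.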
Switch the default and prefer correctness over backward compatibility, because the previous behavior was buggy (see: kubernetes-sigs#621) Signed-off-by: Francesco Romani <[email protected]>
That's a good point. We add an extra watch, but the previous behavior was arguably buggy. Let me switch the defaults.
78e8c3e
to
bbadc64
Compare
/hold Holding to give other reviewers time to chime in
@PiotrProkop howdy! Do you have any comments by any chance?
/hold cancel
@ffromani sorry, I was on vacation. Looks good to me!
What type of PR is this?
/kind bug
What this PR does / why we need it:
One of the key assumptions we took when designing the NodeResourceTopology (NRT) plugin is that the kubelet config of the worker node changes VERY rarely, if at all, during the cluster lifetime. As a rule of thumb, it was expected to change with a frequency of roughly once every quarter (3 months), and likely less often. So the chance of it changing during a scheduling cycle was deemed extremely low.
However, we fail to notice kubelet configuration changes (the bits we care about, reported in the NRT data) and only update them incidentally when resyncing the cache.
These updates are expected to be rare, but failing to notice them is a different issue, because it can lead to unnecessarily bad scheduling decisions.
Simply put, failing to notice these updates is a gap, which we address with this PR.
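To close that gap, the watch-driven marking has to feed into the periodic cache resync. A minimal sketch of that hand-off is below; the `resyncQueue` type and its method names are illustrative assumptions, not the plugin's actual API.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// resyncQueue holds nodes marked dirty by the NRT attribute watch until
// the next resync iteration drains them (hypothetical type).
type resyncQueue struct {
	mu    sync.Mutex
	dirty map[string]bool
}

func newResyncQueue() *resyncQueue {
	return &resyncQueue{dirty: map[string]bool{}}
}

// MarkDirty is called from the watch handler when a node's NRT
// attributes change; it is safe for concurrent use.
func (q *resyncQueue) MarkDirty(node string) {
	q.mu.Lock()
	defer q.mu.Unlock()
	q.dirty[node] = true
}

// Drain returns and clears the set of nodes to refresh this iteration.
func (q *resyncQueue) Drain() []string {
	q.mu.Lock()
	defer q.mu.Unlock()
	nodes := make([]string, 0, len(q.dirty))
	for n := range q.dirty {
		nodes = append(nodes, n)
	}
	q.dirty = map[string]bool{}
	sort.Strings(nodes)
	return nodes
}

func main() {
	q := newResyncQueue()
	q.MarkDirty("node-1")
	q.MarkDirty("node-0")
	q.MarkDirty("node-1") // duplicate marks collapse into one entry
	fmt.Println(q.Drain())
	fmt.Println(len(q.Drain()))
}
```

Deduplicating marks in a set (rather than queuing every event) keeps the extra resync work bounded even if a node's attributes flap between iterations.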
Which issue(s) this PR fixes:
Fixes #621
Special notes for your reviewer:
still WIP, more tests ongoing and upcoming
Does this PR introduce a user-facing change?