Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[noderesourcetopology] update NRT when attributes change #631

Merged

Conversation

ffromani
Copy link
Contributor

What type of PR is this?

/kind bug

What this PR does / why we need it:

One of the key assumptions we took when designing the NodeResourceTopology (NRT) plugin is that the kubelet config of the worker node changes VERY rarely, if at all, during the cluster lifetime. As rule of thumb, it was expected to change with a frequency of like once every quarter (3 months) or so, and likely less often. So the event of changing during a scheduling cycle was deemed extremely low.

However, we fail to notice kubelet configuration changes (the bits we care reported by NRT data) and we update them incidentally when resyncing the cache.
These updates are expected to be rare, but still failing to noticing them is a different issue because it can lead to unnecessarily bad scheduling decision.

Simply put, failing to noticing these updates is a gap, which we address with this PR

Which issue(s) this PR fixes:

Fixes #621

Special notes for your reviewer:

still WIP, more test ongoing and upcoming

Does this PR introduce a user-facing change?

NONE

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. release-note-none Denotes a PR that doesn't merit a release note. kind/bug Categorizes issue or PR as related to a bug. labels Aug 31, 2023
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ffromani

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot requested a review from zwpaper August 31, 2023 11:18
@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Aug 31, 2023
@netlify
Copy link

netlify bot commented Aug 31, 2023

Deploy Preview for kubernetes-sigs-scheduler-plugins canceled.

Name Link
🔨 Latest commit bbadc64
🔍 Latest deploy log https://app.netlify.com/sites/kubernetes-sigs-scheduler-plugins/deploys/666708c99f05c00007acfa91

@k8s-ci-robot k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Aug 31, 2023
@ffromani ffromani force-pushed the handle-attribute-updates branch 2 times, most recently from 020fa0f to 30dd0d7 Compare September 5, 2023 10:45
@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Sep 5, 2023
@ffromani ffromani force-pushed the handle-attribute-updates branch 2 times, most recently from bb78532 to 8320fa2 Compare September 6, 2023 08:50
@ffromani ffromani force-pushed the handle-attribute-updates branch from 8320fa2 to c3390ce Compare October 18, 2023 09:27
@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Nov 15, 2023
@ffromani ffromani force-pushed the handle-attribute-updates branch from c3390ce to f2477f6 Compare November 30, 2023 07:36
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Nov 30, 2023
@ffromani ffromani force-pushed the handle-attribute-updates branch from f2477f6 to 53cd272 Compare November 30, 2023 14:00
@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 20, 2024
@ffromani ffromani force-pushed the handle-attribute-updates branch from 53cd272 to 0d6a82b Compare January 22, 2024 07:33
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 22, 2024
@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 19, 2024
@ffromani ffromani force-pushed the handle-attribute-updates branch from 0d6a82b to 0d4a769 Compare June 4, 2024 13:09
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 4, 2024
@ffromani ffromani force-pushed the handle-attribute-updates branch 2 times, most recently from b764751 to 46fd5c8 Compare June 4, 2024 15:00
A upcoming change in the overreserve cache wants to
consume nodeconfig utilities, so in order to enable both
packages to consume this code, we move it in its own
package. Trivial code movement and only necessary interface
changes (renames/public <-> private symbols).

Signed-off-by: Francesco Romani <[email protected]>
@ffromani ffromani force-pushed the handle-attribute-updates branch from 84394a7 to 0d31c28 Compare June 5, 2024 17:39
@k8s-ci-robot k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Jun 5, 2024
@ffromani ffromani force-pushed the handle-attribute-updates branch 2 times, most recently from 76531af to 910b12b Compare June 6, 2024 08:02
@ffromani
Copy link
Contributor Author

ffromani commented Jun 6, 2024

the integration test is pretty much done but thare are known issues I'm investigating.

@ffromani ffromani force-pushed the handle-attribute-updates branch 2 times, most recently from f16f925 to 8072fc4 Compare June 6, 2024 10:43
@ffromani ffromani changed the title WIP: update NRT when attributes change update NRT when attributes change Jun 6, 2024
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 6, 2024
@ffromani
Copy link
Contributor Author

ffromani commented Jun 6, 2024

/cc @Tal-or

@k8s-ci-robot k8s-ci-robot requested a review from Tal-or June 6, 2024 10:44
@ffromani ffromani force-pushed the handle-attribute-updates branch from 8072fc4 to 2afceb1 Compare June 6, 2024 10:57
@ffromani ffromani changed the title update NRT when attributes change [noderesourcetopology] update NRT when attributes change Jun 6, 2024
@ffromani ffromani force-pushed the handle-attribute-updates branch from 2afceb1 to 78e8c3e Compare June 6, 2024 12:14
Copy link
Contributor

@Tal-or Tal-or left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for working on this.

This new functionality is opt-in in order to support backward compatibility.
But IIUC, the current behavior might causes wrong scheduling decision, would it be wise to keep the default ("buggy") behavior as an option at all?

pkg/noderesourcetopology/cache/overreserve.go Outdated Show resolved Hide resolved
ffromani added 2 commits June 10, 2024 16:03
One of the key assumptions we took when designing the
NodeResourceTopology (NRT) plugin is that the kubelet
configuration of the worker node changes *very* rarely,
if at all, during the cluster lifetime.
As rule of thumb, it was expected to change with a
frequency of like once every quarter (3 months) or so,
and likely less often. So the event of changing during
a scheduling cycle was deemed extremely low.

However, we fail to notice kubelet configuration changes
(the bits we care reported by NRT data) and we update
them incidentally when resyncing the cache.

These updates are expected to be rare, but still failing
to noticing them is a  much worse issue: with out of
date configuration information, the scheduler plugin
will take wrong decisions until restarted.

Up until now, the mitigation was to restart the scheduler
once kubelet config changes; this works, but it is
unpractical and requires more orchestration.

We add the option to resync the NRT data if the attribute
change (and nothing else did) to overcome this limitation.
In the current framework, because of how controller-runtime/client
works, this will require a new connection to the apiserver to
watch the changes and react to them, adding the node
to the set of the one being resynced in the next resync iteration.

Even considering HA scheduler scenarios, this will cause
a very limited amount of new connections, and should not
cause any scalability concern. Nevertheless, we make
the feature opt-in.

Signed-off-by: Francesco Romani <[email protected]>
Switch the default and prefer correctness over
backward compatibility, because the previous
behavior was buggy (see: kubernetes-sigs#621)

Signed-off-by: Francesco Romani <[email protected]>
@ffromani
Copy link
Contributor Author

Thank you for working on this.

This new functionality is opt-in in order to support backward compatibility. But IIUC, the current behavior might causes wrong scheduling decision, would it be wise to keep the default ("buggy") behavior as an option at all?

That's a good point. We add a extra watch, but previously the behavior was arguably buggy. Let me switch the defaults.

@ffromani ffromani force-pushed the handle-attribute-updates branch from 78e8c3e to bbadc64 Compare June 10, 2024 14:08
@Tal-or
Copy link
Contributor

Tal-or commented Jun 19, 2024

/hold
/lgtm

Hold letting other reviews time to chime in

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 19, 2024
@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 19, 2024
@ffromani
Copy link
Contributor Author

@PiotrProkop howdy! Do you have any comments by any chance?

@ffromani
Copy link
Contributor Author

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 24, 2024
@k8s-ci-robot k8s-ci-robot merged commit 2c1c0cf into kubernetes-sigs:master Jun 24, 2024
10 checks passed
@ffromani ffromani deleted the handle-attribute-updates branch June 24, 2024 13:28
@PiotrProkop
Copy link
Contributor

@ffromani sorry I was on vacations. Looks good to me!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note-none Denotes a PR that doesn't merit a release note. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Kubelet configuration change without immediate cache updates
4 participants