KEP-2621: Add llc affinity to cpu manager. #2684
Conversation
Thanks for your pull request. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). 📝 Please follow instructions at https://git.k8s.io/community/CLA.md#the-contributor-license-agreement to sign the CLA. It may take a couple minutes for the CLA signature to be fully registered; after that, please reply here with a new comment and we'll verify. Thanks.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.
Welcome @ranchothu!
Hi @ranchothu. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test. Once the patch is verified, the new status will be reflected by the ok-to-test label. I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: ranchothu The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
(force-pushed from ba5f08d to ccf74f1)
@ranchothu: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Hi @ranchothu, this currently isn't being tracked for SIG Node planning for 1.22: https://docs.google.com/document/d/1U10J0WwgWXkdYrqWGGvO8iH2HKeerQAlygnqgDgWv4E/edit#
/hold
It also appears that you will need to sign the Kubernetes CLA (see the bot comment above: #2684 (comment)).
(force-pushed from 0a40710 to 40c8fd4)
@pacoxu maybe an ok-to-test is needed; it seems the CLA check is not re-run after force-pushes. And when I reply with
/ok-to-test
/retest
I signed it
Took an initial pass at the KEP. The motivation is clear to me, but I would recommend adding more specific use cases in terms of performance-sensitive workloads that strictly need resource allocation that takes L3 cache into consideration.
### Risks and Mitigations

+ Currently no risks was found.
+ Feature is enbled by a gate - a new kube feature with default false, potential risk effects could be limited.
NIT: typo enbled -> enabled
- Feature Gate
- Add `CPUManagerUncoreCacheAlign` to kubelet's feature-gates to enable(true)/disable(false) the feature.
- Also, more than one l3 cache should exist in a single socket/package.
It is probably implied, but maybe we should explicitly capture the scenario where CPUManagerUncoreCacheAlign is enabled and only one l3 cache is present; in that case we would obtain the current behaviour.
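To make that scenario concrete, here is a minimal, self-contained Go sketch of such a fallback guard; the type, field, and function names are illustrative assumptions, not the kubelet's actual CPU topology structures.

```go
package main

import "fmt"

// cpuTopology is a hypothetical, minimal view of the fields such a check would need;
// the kubelet's real topology structures differ.
type cpuTopology struct {
	NumSockets      int
	NumUncoreCaches int // distinct L3 (uncore) caches across the machine
}

// needsUncoreCacheAlignment sketches the guard: with the gate off, or with only
// one L3 per socket, the allocator keeps today's behaviour.
func needsUncoreCacheAlignment(gateEnabled bool, topo cpuTopology) bool {
	if !gateEnabled {
		return false
	}
	return topo.NumUncoreCaches > topo.NumSockets
}

func main() {
	intelLike := cpuTopology{NumSockets: 2, NumUncoreCaches: 2} // one L3 per socket
	amdLike := cpuTopology{NumSockets: 1, NumUncoreCaches: 4}   // several L3 groups per socket
	fmt.Println(needsUncoreCacheAlignment(true, intelLike))     // false -> current behaviour
	fmt.Println(needsUncoreCacheAlignment(true, amdLike))       // true  -> align to L3 groups
}
```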
- General Design
- Logic Elaboration
Try to allocate cpus sharing the same cache if demand is larger than one core. Add L3 cache affinity before tring core affinity best-fit.
NIT: typo tring -> trying
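For illustration, here is a minimal, self-contained Go sketch of the ordering the hunk above describes: when the request spans more than one core, first try to satisfy it from a single L3 (uncore-cache) group, then fall back to taking whatever is free (a stand-in for the existing core-affinity best-fit). The function and the free-CPU bookkeeping are hypothetical, not the kubelet's actual cpuAccumulator.

```go
package main

import "fmt"

// takeWithL3Affinity prefers a single L3 (uncore-cache) group that can hold the
// whole request, then falls back to filling from any free CPUs, group by group.
// freeCPUsByL3 maps an L3 group id to the CPUs still free in that group.
func takeWithL3Affinity(freeCPUsByL3 map[int][]int, demand int) []int {
	// Pass 1: L3 cache affinity - find one group with enough free CPUs.
	for _, cpus := range freeCPUsByL3 {
		if len(cpus) >= demand {
			return append([]int(nil), cpus[:demand]...)
		}
	}
	// Pass 2: fallback - take free CPUs wherever they are until demand is met.
	var picked []int
	for _, cpus := range freeCPUsByL3 {
		for _, cpu := range cpus {
			picked = append(picked, cpu)
			if len(picked) == demand {
				return picked
			}
		}
	}
	return nil // not enough free CPUs
}

func main() {
	free := map[int][]int{
		0: {0, 1},       // L3 group 0: mostly taken
		1: {4, 5, 6, 7}, // L3 group 1: fully free
	}
	fmt.Println(takeWithL3Affinity(free, 4)) // [4 5 6 7] - all from one L3 group
}
```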
![design_overview](design_overview.png "design_overview")

- feature-gates `CPUManagerUncoreCacheAlign`
Looks like some formatting issue in the way this line is rendered?
@swatisehgal, thanks, and the modifications are updated.
- Add `CPUManagerUncoreCacheAlign` to kubelet's feature-gates to enable(true)/disable(false) the feature.
- Also, more than one l3 cache should exist in a single socket/package.

- C1: Add `CPUManagerUncoreCacheAlign` to kubelet's feature-gates to enable(true)/disable(false) the feature.
So this feature (per my previous comment, see history) will need to depend on a feature gate. But you don't necessarily need a separate feature gate, you can depend on the one we added in #2626.
This CPUManager feature gate will most likely be alpha in the 1.23 cycle, which fits the needs of this KEP.
I think you can conditionally enable this optimization depending on a cpu manager policy option. This way you keep the conditional logic you already need to support the feature gate, without extra burden.
Fitting into the new CPUManager policy options is also a very nice and clean design.
Hi @fromanirh, since I've posted a patch kubernetes/kubernetes#102307, it may help you understand why I chose a kubelet running option rather than a separate cpu manager policy. IMHO, attaching this to a policy is also an option, but it will cause some redundant logic.
Thanks for sharing the implementation. I don't see yet where we need redundant logic, however. We could:
- get the enable/disable flag from the cpumanager options. Probably the option should be on by default
- propagate the flag down to the cpuAccumulator
- consume the flag in isUncoreCacheAlignEnabled

IMHO this is also a nicer and more integrated design.
Now, implementation wise, this flow seems compliant with the production-readiness review (see the details in https://kubernetes.slack.com/archives/CPNHUMN74/p1620312071045800) because new features should depend on a compatible and related feature gate; a new feature gate would be alpha level, and the cpuManagerOptions feature gate will still be alpha in 1.23.
So reshaping to use the cpuManagerOptions still seems a totally valid option, and I think it is cleaner from an overall design perspective.
Let's see what other reviewers (@klueska :) ) think, and please let me know if there are concerns or requirements that I missed.
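A minimal, self-contained Go sketch of the flow suggested above: read the flag from the CPUManager policy options, propagate it to the accumulator, and consume it where the allocation decision is made. The option key align-by-uncore-cache and the structures here are assumptions for illustration; the real policy-options plumbing from #2626 differs in detail.

```go
package main

import (
	"fmt"
	"strconv"
)

// parseUncoreAlignOption reads a hypothetical "align-by-uncore-cache" key from the
// cpu manager policy options map; it defaults to true when the key is absent,
// matching the suggestion that the option be on by default.
func parseUncoreAlignOption(policyOptions map[string]string) (bool, error) {
	raw, ok := policyOptions["align-by-uncore-cache"]
	if !ok {
		return true, nil
	}
	return strconv.ParseBool(raw)
}

// cpuAccumulator is only a stand-in showing where the flag would be propagated.
type cpuAccumulator struct {
	uncoreCacheAlign bool
}

// isUncoreCacheAlignEnabled is where the allocation code would consume the flag.
func (a *cpuAccumulator) isUncoreCacheAlignEnabled() bool {
	return a.uncoreCacheAlign
}

func main() {
	opts := map[string]string{"align-by-uncore-cache": "true"}
	enabled, err := parseUncoreAlignOption(opts)
	if err != nil {
		panic(err)
	}
	acc := &cpuAccumulator{uncoreCacheAlign: enabled}
	fmt.Println(acc.isUncoreCacheAlignEnabled()) // true
}
```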
- Also, more than one l3 cache should exist in a single socket/package.

- C1: Add `CPUManagerUncoreCacheAlign` to kubelet's feature-gates to enable(true)/disable(false) the feature.
- C2: More than one l3 cache should exist in a single socket/package(uncore-cache exists).
I'm not quite sure what "C1" and "C2" mean in this context.
Thanks for making cadvisor aware of the LLC cache first. Here are some naive questions from the top of my head, since I didn't find the answers in my first pass of the KEP:
Id int `json:"core_id"`
Threads []int `json:"thread_ids"`
Caches []Cache `json:"caches"`
+ UncoreCaches []Cache `json:"uncore_caches"`
Hmm, this looks like an unnecessary/confusing kludge. Would it be possible to present this similarly to how the linux kernel does, i.e. add the information about cpus in the Cache structure (like /sys/devices/system/cpu/cpu*/cache/index*/shared_cpu_list)?
Hi @marquiz, it is an interesting problem. Previously, I gave this some consideration here, and I wanted the design here to be decoupled. But it seems no other structure is a better fit to hold the uncore-cache information.
I think:
- For a cache, it shouldn't have knowledge of whether it is uncore; that's not a cache's character. The info in /sys/devices/system/cpu/cpu*/cache/index*/ cannot tell us whether the cache is uncore without information about the socket/core.
- But a core is aware of the cache and uncore-cache it uses, and for a core, it includes {id, threads, caches, uncore caches}.
Discussion is welcomed, thanks.
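To make the two shapes under discussion concrete, here is a minimal, self-contained Go sketch: a Core that carries its uncore caches (as in the hunk above) versus a cache record that lists the CPUs sharing it (along the lines of @marquiz's shared_cpu_list suggestion). Field names beyond those shown in the hunk are illustrative, not cadvisor's actual API.

```go
package main

import "fmt"

// Cache mirrors a cadvisor-style cache record (illustrative fields only).
type Cache struct {
	Size  uint64 `json:"size"`
	Type  string `json:"type"`
	Level int    `json:"level"`
}

// Core follows the diff above: the core carries its per-L3-group (uncore) caches.
type Core struct {
	Id           int     `json:"core_id"`
	Threads      []int   `json:"thread_ids"`
	Caches       []Cache `json:"caches"`
	UncoreCaches []Cache `json:"uncore_caches"`
}

// SharedCache is the alternative: the cache itself lists the cpus that share it,
// like /sys/devices/system/cpu/cpu*/cache/index*/shared_cpu_list.
type SharedCache struct {
	Cache
	SharedCpuList []int `json:"shared_cpu_list"`
}

func main() {
	l3 := Cache{Size: 32 * 1024 * 1024, Type: "Unified", Level: 3}
	core := Core{Id: 0, Threads: []int{0, 64}, UncoreCaches: []Cache{l3}}
	alt := SharedCache{Cache: l3, SharedCpuList: []int{0, 1, 2, 3, 64, 65, 66, 67}}
	fmt.Println(core.UncoreCaches[0].Level, alt.SharedCpuList)
}
```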
Hi @ranchothu, in my understanding you'd like to differentiate between node-level cache (typically on Intel chips) and uncore cache (typically on AMD chips). If so, why do we need to differentiate between them? They are both L3 cache, and it seems that they can both be aligned when assigning CPUs.
Here I agree that we need an additional layer of abstraction, since grouping CPUs per L3 cache is not a feature of only AMD's processors; for example, the Kunpeng920 (armv8) has 4 CPUs per L3 cache, and has 24 such blocks.
Hi @dchen1107, thanks for your advice, and sorry for missing the meeting.
I hope the explanation above answers your question, and I look forward to the future process. :smile:
### Graduation Criteria
#### Alpha

- Implement the new policy.
According to the implementation, strictly speaking it is not a new policy but an enhancement of the existing static policy, right?
Hi! just a friendly reminder that the deadline for 1.23 KEP planning is 9th of September 2021: https://docs.google.com/document/d/1U10J0WwgWXkdYrqWGGvO8iH2HKeerQAlygnqgDgWv4E/edit# . If we want this enhancement to be in 1.23, we need some actions. sig-node is planning the 1.23 enhancements in today's meeting: https://docs.google.com/document/d/1Ne57gvidMEWXR70OxxnRkYquAoMpt56o75oZtg-OeBg/edit - please consider proposing this change in this meeting or next week's meeting.
@ranchothu: The following test failed, say /retest to rerun all failed tests:
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs. This bot triages issues and PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs. This bot triages issues and PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs. This bot triages issues and PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community.
/close
@k8s-triage-robot: Closed this PR. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
@enzoyes hi! Are you still interested in pushing forward this work?
design for issue 2621