-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
KEP-3022: graduate Pod Topology Spread MinDomains to stable #4407
Conversation
sanposhiho
commented
Jan 15, 2024
- One-line PR description: graduate Pod Topology Spread MinDomains to stable
- Issue link: Min domains in PodTopologySpread #3022
- Other comments:
approver: "@johnbelamaric" | ||
stable: | ||
approver: "@wojtek-t" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@wojtek-t Hi, I just assigned it to you, but please give it to another person if needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#prr-shadow
Several comments/suggestions from prr-shadow:
- Release Signoff Checklist needs to be updated to reflect reality
- Update tests coverage in unit tests, links in integration.
- Update graduation criteria, currently only alpha are checked and filled, do we addressed all the problems for beta? What are the criteria for GA?
- In
Are there any missing metrics that would be useful to have to improve observability of this feature?
you're pointing to metrics to see which plugins affect to scheduler's decisions in Filter/Score phase kubernetes#110643 which seems like it was solved, can you update the answer based on the current state of code? - Update the implementation history
milestone: | ||
alpha: "v1.24" | ||
beta: "v1.25" | ||
stable: "v1.30" | ||
disable-supported: true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we still support disabling it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/approve
/lgtm
We did good progress with the new metric :D
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two small comments from the PRR.
|
||
#### GA (v1.30): | ||
|
||
- [x] No particular issue is reported to this feature for a certain length of time. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't comment on unchanged lines, so commenting here:
- "Are there any tests for feature enablement/disablement?"
Please fill in. In particular, this comment from the KEP template is relevant:
Additionally, for features that are introducing a new API field, unit tests that
are exercising the `switch` of feature gate itself (what happens if I disable a
feature gate after having objects written with the new field) are also critical.
You can take a look at one potential example of such test in:
https://github.com/kubernetes/kubernetes/pull/97058/files#diff-7826f7adbc1996a05ab52e3f5f02429e94b68ce6bce0dc534d1be636154fded3R246-R282
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, we missed such test and we only did the manual testing for switching the feature gate.
I created the PR to add the integration test in kubernetes/kubernetes#123115. However if we will remove the feature gate when reaching GA, the PR is no longer needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please update with the text like this then [agree it's too late now...]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@wojtek-t Fixed 🙏
Issue: https://github.com/kubernetes/kubernetes/issues/110643 | ||
It was lacking the way to see which filter plugin affected Pod's scheduling results in metrics | ||
(https://github.com/kubernetes/kubernetes/issues/110643), | ||
and we've got `plugin_evaluation_total`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not a lacking metric anymore. Let' move it to "How can someone using this feature know that it is working for their instance?" question - as it actually helps with that.
Please also better describe this metric.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
Rethinking about the comment from @sftim. Do we normally keep the feature gate for a few releases even after the feature reached GA? I checked other feature gates but some of them are remaining even though they're already GA, I'm not sure those are just forgotten to be removed, or intentionally kept for a few releases to keep support of the disablement. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One minor nit.
|
||
#### GA (v1.30): | ||
|
||
- [x] No particular issue is reported to this feature for a certain length of time. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please update with the text like this then [agree it's too late now...]
Yes, we do keep the feature gate for a couple of releases, so that someone that a manifest that has It is common that we forget to remove the gate. |
Got it, so we keep the feature gate not to break users' configuration but not support the disablement actually. |
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: alculquicondor, sanposhiho, wojtek-t The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |