Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KEP-3022: graduate Pod Topology Spread MinDomains to stable #4407

Merged
merged 6 commits into from
Feb 7, 2024

Conversation

sanposhiho
Copy link
Member

  • One-line PR description: graduate Pod Topology Spread MinDomains to stable
  • Other comments:

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jan 15, 2024
@k8s-ci-robot k8s-ci-robot added the kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory label Jan 15, 2024
@k8s-ci-robot k8s-ci-robot added sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Jan 15, 2024
@sanposhiho
Copy link
Member Author

/cc @alculquicondor @Huang-Wei

approver: "@johnbelamaric"
stable:
approver: "@wojtek-t"
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wojtek-t Hi, I just assigned it to you, but please give it to another person if needed.

Copy link
Contributor

@soltysh soltysh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#prr-shadow
Several comments/suggestions from prr-shadow:

  1. Release Signoff Checklist needs to be updated to reflect reality
  2. Update tests coverage in unit tests, links in integration.
  3. Update graduation criteria, currently only alpha are checked and filled, do we addressed all the problems for beta? What are the criteria for GA?
  4. In Are there any missing metrics that would be useful to have to improve observability of this feature? you're pointing to metrics to see which plugins affect to scheduler's decisions in Filter/Score phase kubernetes#110643 which seems like it was solved, can you update the answer based on the current state of code?
  5. Update the implementation history

milestone:
alpha: "v1.24"
beta: "v1.25"
stable: "v1.30"
disable-supported: true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we still support disabling it?

@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Jan 28, 2024
@sanposhiho
Copy link
Member Author

@soltysh @sftim Updated based on your suggestions, thanks.

Copy link
Member

@alculquicondor alculquicondor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/approve
/lgtm

We did good progress with the new metric :D

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 1, 2024
Copy link
Member

@wojtek-t wojtek-t left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two small comments from the PRR.


#### GA (v1.30):

- [x] No particular issue is reported to this feature for a certain length of time.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't comment on unchanged lines, so commenting here:

  1. "Are there any tests for feature enablement/disablement?"

Please fill in. In particular, this comment from the KEP template is relevant:

Additionally, for features that are introducing a new API field, unit tests that
are exercising the `switch` of feature gate itself (what happens if I disable a
feature gate after having objects written with the new field) are also critical.
You can take a look at one potential example of such test in:
https://github.com/kubernetes/kubernetes/pull/97058/files#diff-7826f7adbc1996a05ab52e3f5f02429e94b68ce6bce0dc534d1be636154fded3R246-R282

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, we missed such test and we only did the manual testing for switching the feature gate.
I created the PR to add the integration test in kubernetes/kubernetes#123115. However if we will remove the feature gate when reaching GA, the PR is no longer needed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please update with the text like this then [agree it's too late now...]

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wojtek-t Fixed 🙏

Issue: https://github.com/kubernetes/kubernetes/issues/110643
It was lacking the way to see which filter plugin affected Pod's scheduling results in metrics
(https://github.com/kubernetes/kubernetes/issues/110643),
and we've got `plugin_evaluation_total`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not a lacking metric anymore. Let' move it to "How can someone using this feature know that it is working for their instance?" question - as it actually helps with that.
Please also better describe this metric.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@sanposhiho
Copy link
Member Author

sanposhiho commented Feb 4, 2024

@alculquicondor

Rethinking about the comment from @sftim.
#4407 (comment)

Do we normally keep the feature gate for a few releases even after the feature reached GA? I checked other feature gates but some of them are remaining even though they're already GA, I'm not sure those are just forgotten to be removed, or intentionally kept for a few releases to keep support of the disablement.

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 4, 2024
Copy link
Member

@wojtek-t wojtek-t left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One minor nit.


#### GA (v1.30):

- [x] No particular issue is reported to this feature for a certain length of time.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please update with the text like this then [agree it's too late now...]

@alculquicondor
Copy link
Member

Do we normally keep the feature gate for a few releases even after the feature reached GA? I checked other feature gates but some of them are remaining even though they're already GA, I'm not sure those are just forgotten to be removed, or intentionally kept for a few releases to keep support of the disablement.

Yes, we do keep the feature gate for a couple of releases, so that someone that a manifest that has MyFeature=true continues to run. But we ignore the feature gate in code.

It is common that we forget to remove the gate.

@pnbrown pnbrown mentioned this pull request Feb 5, 2024
12 tasks
@sanposhiho
Copy link
Member Author

Got it, so we keep the feature gate not to break users' configuration but not support the disablement actually.

@wojtek-t
Copy link
Member

wojtek-t commented Feb 7, 2024

/lgtm
/approve PRR

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 7, 2024
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: alculquicondor, sanposhiho, wojtek-t

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 7, 2024
@k8s-ci-robot k8s-ci-robot merged commit fd4dc6c into kubernetes:master Feb 7, 2024
4 checks passed
@k8s-ci-robot k8s-ci-robot added this to the v1.30 milestone Feb 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory lgtm "Looks good to me", indicates that a PR is ready to be merged. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

6 participants