Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for scaling up with ZeroToMaxNodesScaling option #5826

Merged
merged 6 commits into from
Jun 30, 2023

Conversation

hbostan
Copy link
Contributor

@hbostan hbostan commented May 31, 2023

What type of PR is this?

/kind feature

What this PR does / why we need it:

This PR adds a node group AtomicScaleUp option, that allows for all-or-nothing scale up of the node group.

Which issue(s) this PR fixes:

N/A

Special notes for your reviewer:

Does this PR introduce a user-facing change?

Add AtomicScaleUp option that allows all-or-nothing scale up of node groups.
In atomic scale-ups, the node group should be scaled up to max size (without partial scale-ups).

@k8s-ci-robot k8s-ci-robot added kind/feature Categorizes issue or PR as related to a new feature. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels May 31, 2023
@k8s-ci-robot k8s-ci-robot requested a review from feiskyer May 31, 2023 15:09
@k8s-ci-robot k8s-ci-robot requested a review from x13n May 31, 2023 15:09
@hbostan
Copy link
Contributor Author

hbostan commented May 31, 2023

CC @x13n @kawych

@Bryce-Soghigian
Copy link
Member

So the purpose of this feature is we just want to scale up to the max when the flag is set to true? How should scale down be defined for this feature? What is the condition for atomic scale up? Just for a normal scale operation but we scale the entire nodepool up to max count?

@hbostan
Copy link
Contributor Author

hbostan commented Jun 1, 2023

So the purpose of this feature is we just want to scale up to the max when the flag is set to true?

Correct, when the flag is set even if only a single node is sufficient for the pod we scale up to the max.

How should scale down be defined for this feature?

Please take a look at #5695

What is the condition for atomic scale up? Just for a normal scale operation but we scale the entire nodepool up to max count?

I don't quite understand the question, it is like a normal scale operation only difference is if the node group has the atomic option we scale up the whole group.

@hbostan hbostan force-pushed the master branch 2 times, most recently from e98c397 to 639253e Compare June 22, 2023 10:45
@MaciekPytel
Copy link
Contributor

Can we rename this mechanism? My expectation for 'atomic scaling' would be along the lines of handling a set of pods (perhaps a job?) in a single scale-up operation on a single NodeGroup or something along those lines.

This is very far from what I'd expect and I can easily imagine someone enabling this feature based on their intuition around the name of the flag and getting very surprised with a large bill after their nodegroup scaled to max. Maybe something like MaxOrNothingScaling?

@BigDarkClown
Copy link
Contributor

I like it, the solution is much more elegant now. Great work!
/lgtm

@k8s-ci-robot k8s-ci-robot added lgtm "Looks good to me", indicates that a PR is ready to be merged. and removed lgtm "Looks good to me", indicates that a PR is ready to be merged. labels Jun 29, 2023
@hbostan
Copy link
Contributor Author

hbostan commented Jun 29, 2023

Can we rename this mechanism?

Renamed the mechanism to ZeroOrMaxNodeScaling. I think this is a bit more descriptive and clear than 'atomic scaling' about what the mechanism does.

@BigDarkClown
Copy link
Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 29, 2023
@MaciekPytel
Copy link
Contributor

/approve
/hold
@hbostan Can you update release note to match the new name of the feature? Once you do please feel free to unhold

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 29, 2023
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: hbostan, MaciekPytel

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed lgtm "Looks good to me", indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jun 30, 2023
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Jun 30, 2023
@hbostan hbostan changed the title Add support atomic scale up for node groups Add support for scaling up with ZeroToMaxNodesScaling option Jun 30, 2023
@BigDarkClown
Copy link
Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot added needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. lgtm "Looks good to me", indicates that a PR is ready to be merged. and removed lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Jun 30, 2023
@BigDarkClown
Copy link
Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 30, 2023
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 30, 2023
hbostan added 6 commits June 30, 2023 11:17
* Merged multiple tests into one single table driven test.
* Fixed some typos.
…strator

* Started handling scale up options for ZeroToMaxNodeScaling with the existing estimator
* Skip setting similar node groups for the node groups that use
  ZeroToMaxNodeScaling
* Renamed the autoscaling option from "AtomicScaleUp" to "AtomicScaling"
* Merged multiple tests into one single table driven test.
* Fixed some typos.
* Renamed the "AtomicScaling" autoscaling option to
  "ZeroOrMaxNodeScaling" to be more clear about the behavior.
@BigDarkClown
Copy link
Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 30, 2023
@hbostan
Copy link
Contributor Author

hbostan commented Jun 30, 2023

/unhold

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 30, 2023
@k8s-ci-robot k8s-ci-robot merged commit e6397c6 into kubernetes:master Jun 30, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/cluster-autoscaler cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants