Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support maxUnavailable in rollout #65

Merged
merged 9 commits into from
Apr 4, 2024

Conversation

kerthcet
Copy link
Contributor

What type of PR is this?

/kind feature

What this PR does / why we need it

Which issue(s) this PR fixes

part of #3

Special notes for your reviewer

Does this PR introduce a user-facing change?

Support maxUnavailable in rollout

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. kind/feature Categorizes issue or PR as related to a new feature. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Mar 19, 2024
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: kerthcet

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot requested review from ahg-g and liurupeng March 19, 2024 21:43
@k8s-ci-robot k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Mar 19, 2024
@kerthcet
Copy link
Contributor Author

/hold

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 19, 2024
@kerthcet kerthcet force-pushed the feat/support-roll-2 branch 2 times, most recently from 31dd616 to 15ed653 Compare March 25, 2024 22:23
@k8s-ci-robot k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Mar 25, 2024
@kerthcet kerthcet changed the title [WIP] Support maxUnavailable in rollout Support maxUnavailable in rollout Mar 25, 2024
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 25, 2024
@kerthcet kerthcet force-pushed the feat/support-roll-2 branch 2 times, most recently from 317dc4b to f82172a Compare March 25, 2024 22:35
@kerthcet
Copy link
Contributor Author

Main changes:

  • The general mechanism is we'll have a templateRevisionHash label for leaderStatefulst, leaderPod, workerStatefulset, workerPod, it will be used to verify that this is the new revision, so when leaderTemplate or workerTemplate is updated, we'll trigger a rolling update of the leader statefulset, which means a group of pods(leader+worker) will be updated the same time.
  • The rolling update number of replicas is defined by maxUnavailable, which means when we set maxUnavailable to 2, and we have a lws with 5 replicas, the rolling update index will looks like: 4,3 -> 2,1, -> 0.

Other updates:

  • the maxUnavailable is default to 1 because 0 is not allowed in Statefulset
  • the maxSurge(implementation not included in this PR) is default to 0, then the default behavior will rolling update a replica each time, I think this is more cautious.

Another concern is about the leaderWorkerTemplate.size, I have an implementation locally, it behaves as when we increate or decrease the size, it will not lead to the statefulset rollingupdate, only scalilng up/down the worker pods. But I may need your inputs for this, and this is now conflict with several things:

  • we have a size annotation in the leaderPod, which will lead to the statefulset rolling update each time we change the size
  • the restart policy RecreateGroupOnPodRestart is somehow conflict with decreasing the size because when a pod is deleted, the leaderPod will also be deleted.

so I remove the rolling update for size update right now, and make it immutable as original.

Please take a look and sorry for the slow response because I'm right at kubecon last week. cc @ahg-g @liurupeng

@kerthcet
Copy link
Contributor Author

/hold cancel
we may need e2e test as well. I'll add the e2e test framework later.

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 25, 2024
@liurupeng
Copy link
Collaborator

@kerthcet we have someone working on the e2e test already, that work is in progress, thanks!

@kerthcet kerthcet force-pushed the feat/support-roll-2 branch from a7e6bfc to a7db8d3 Compare March 26, 2024 06:25
@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Mar 26, 2024

return nil
}

// updates the condition of the leaderworkerset to either Progressing or Available.
func (r *LeaderWorkerSetReconciler) updateConditions(ctx context.Context, lws *leaderworkerset.LeaderWorkerSet) (bool, error) {
Copy link
Collaborator

@liurupeng liurupeng Mar 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

qq, will you add the lws.status.updatedReplica in the next PR?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For functionality, seems not that necessary, but for visibility, seems useful, maybe I'll add it later, but on my original design, I didn't have the plan.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TL;DR: I'll add the field.

templateHash := utils.LeaderWorkerTemplateHash(lws)
stop:
for i := replicas - 1; i >= 0; i-- {
for j := len(stsList.Items) - 1; j >= 0; j-- {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

qq, why we traverse the stsList from the last to the first one as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because at first I though the items will start from the last index to 0, which will accelerate the calculation, but seems not that case, but I didn't change the implementation for lazy reason.
Anyway I think it doesn't make any difference? 😅

@liurupeng
Copy link
Collaborator

liurupeng commented Mar 28, 2024

Thanks @kerthcet , this is a great implementation! Agree we should allow updating size in the next change, (this change is already large enough). Regarding updating the size will trigger the StatefulSet upgrade, could you elaborate more on this? When upgrade happens, both the template hash label and size annotation will change on the leader statefulset at the same time right? Will the leader statefulSet start a rolling update again and what is the behavior you saw?

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 1, 2024
Copy link
Contributor

@ahg-g ahg-g left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is really good, thanks Kante!

@kerthcet kerthcet force-pushed the feat/support-roll-2 branch from e4473ca to f25bd10 Compare April 2, 2024 07:54
@kerthcet
Copy link
Contributor Author

kerthcet commented Apr 2, 2024

/test pull-lws-test-integration-main

1 similar comment
@kerthcet
Copy link
Contributor Author

kerthcet commented Apr 2, 2024

/test pull-lws-test-integration-main

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 2, 2024
@ahg-g
Copy link
Contributor

ahg-g commented Apr 3, 2024

so I remove the rolling update for size update right now, and make it immutable as original.

I think an immutable size is fine and a reasonable tradeoff to simplify the rolling update logic

@kerthcet kerthcet force-pushed the feat/support-roll-2 branch from c6408ba to b906059 Compare April 3, 2024 03:51
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 3, 2024
@kerthcet kerthcet force-pushed the feat/support-roll-2 branch from 311c0f4 to 7ff79a2 Compare April 3, 2024 08:49
@kerthcet
Copy link
Contributor Author

kerthcet commented Apr 3, 2024

Last 2 commit:

  • Add one e2e test
  • Solve the performance issue when calculating the partition field

Copy link
Contributor

@ahg-g ahg-g left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple of nits, otherwise this is good from my end!

testing.UpdateWorkerTemplate(ctx, k8sClient, lws)

// Wait for leaderWorkerSet ready again.
testing.ExpectLeaderWorkerSetAvailable(ctx, k8sClient, lws, "All replicas are ready")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how do we ensure that this is checking the availability status after the update? isn't it possible that we are still checking the previous status before the lws got a chance to pick the update and change the availability status to false?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move this down then we'll check the hash first to make sure we're validating the new revision.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure this will make a difference, it would be nice if we can check that the template hash changed, and then check that the lws is available.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we already do as you said, so the process is:

  • update the lws
  • ExpectValidLeaderStatefulSet will check that the template hash is already the updated one, see
    hash := utils.LeaderWorkerTemplateHash(&lws)
    if sts.Labels[leaderworkerset.TemplateRevisionHashKey] != hash {
    return fmt.Errorf("mismatch template revision hash for leader statefulset, got: %s, want: %s", sts.Spec.Template.Labels[leaderworkerset.TemplateRevisionHashKey], hash)
    }
  • ExpectLeaderWorkerSetAvailable validate the lws is available.

Signed-off-by: kerthcet <[email protected]>
@kerthcet kerthcet force-pushed the feat/support-roll-2 branch from 75bc593 to 2b9da9c Compare April 4, 2024 04:46
@kerthcet
Copy link
Contributor Author

kerthcet commented Apr 4, 2024

All set @ahg-g

@liurupeng
Copy link
Collaborator

lgtm as well, I will wait for @ahg-g to approve

@ahg-g
Copy link
Contributor

ahg-g commented Apr 4, 2024

/label tide/merge-method-squash
/lgtm
/hold cancel

Great work!

@k8s-ci-robot k8s-ci-robot added tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. and removed do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. labels Apr 4, 2024
@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 4, 2024
@k8s-ci-robot k8s-ci-robot merged commit 59f5a76 into kubernetes-sigs:main Apr 4, 2024
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants