
Clarify behavior of parallel pod management policy of stateful sets #47085

Open · mittal-ishaan opened this issue Jul 4, 2024 · 7 comments

Labels: kind/bug, lifecycle/stale, needs-triage, sig/apps, sig/architecture, sig/scheduling

@mittal-ishaan

Problem:
I was facing the issue described in kubernetes/kubernetes#67250.

The workaround discussed by community users is to set the podManagementPolicy to Parallel, as suggested here.

I tried this and it works as expected: when I update the pod template to a good configuration, the StatefulSet terminates all pods in parallel and does not wait for each pod to be Running and Ready, or completely terminated, before launching or terminating another pod.
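
For reference, a minimal sketch of a StatefulSet spec with this setting (the name, labels, and nginx image are illustrative placeholders, not from my actual workload):

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web                       # placeholder name
spec:
  serviceName: web
  replicas: 3
  podManagementPolicy: Parallel   # default is OrderedReady
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: nginx
          image: nginx:1.25       # placeholder image
```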

All was good until I read the documentation for podManagementPolicy further and saw one more line, stated here:

> This option only affects the behaviour for scaling operations. Updates are not affected.

Setting it to Parallel worked for me even for updates: when I change the configuration, the rollout proceeds in parallel, contradicting what the above line in the docs says.

I went through the code and found:

https://github.com/kubernetes/kubernetes/blob/88313a445174e21ed326f40802429b854e5be9ba/pkg/controller/statefulset/stateful_set_control.go#L436-L440

When we set podManagementPolicy to Parallel, monotonic is set to false, so we never enter this if block; the reconciliation then falls through to the code that updates the pods:

https://github.com/kubernetes/kubernetes/blob/88313a445174e21ed326f40802429b854e5be9ba/pkg/controller/statefulset/stateful_set_control.go#L459
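
To make the control flow concrete, here is a toy Go model of that guard (not the actual controller code; the `pod` type, the `reconcile` function, and the messages are invented for illustration):

```go
package main

import "fmt"

// pod is a toy stand-in for a StatefulSet replica.
type pod struct {
	name            string
	runningAndReady bool
}

// reconcile mimics the early return guarded by `monotonic` in
// stateful_set_control.go: with OrderedReady (monotonic == true) the loop
// stops at the first pod that is not Running and Ready, so control never
// reaches the update logic further down the function.
func reconcile(pods []pod, monotonic bool) string {
	for _, p := range pods {
		if !p.runningAndReady && monotonic {
			return fmt.Sprintf("waiting for %s to be Running and Ready; update blocked", p.name)
		}
	}
	return "fell through to the update phase; old-revision pods get deleted"
}

func main() {
	pods := []pod{{"web-0", true}, {"web-1", false}} // web-1 is crash-looping

	fmt.Println("OrderedReady:", reconcile(pods, true))  // rollout stays blocked
	fmt.Println("Parallel:   ", reconcile(pods, false))  // rollout proceeds
}
```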

Proposed Solution:
This doc line was added for Kubernetes 1.11, and I suppose the controller code has changed since then. I have verified that updates are indeed affected by the Parallel pod management policy. We should update the docs to remove the line stating that updates are not affected.
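
Concretely, the proposed docs change would just delete that one sentence (sketched as a diff; the surrounding Markdown source of the page is not shown):

```diff
-This option only affects the behaviour for scaling operations. Updates are not affected.
```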

Page to Update:
https://kubernetes.io/docs/concepts/workloads/controllers/statefulset

Kubernetes Version: v1.30.0

k8s-ci-robot added the needs-triage label Jul 4, 2024
@k8s-ci-robot
Contributor

This issue is currently awaiting triage.

SIG Docs takes a lead on issue triage for this website, but any Kubernetes member can accept issues by applying the triage/accepted label.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@Ritikaa96
Contributor

Ritikaa96 commented Jul 4, 2024

/sig apps
/sig architecture
/sig scheduling
/kind bug

k8s-ci-robot added the sig/apps, sig/architecture, sig/scheduling, and kind/bug labels Jul 4, 2024
github-project-automation bot moved this to Needs Triage in SIG Scheduling Jul 4, 2024
github-project-automation bot moved this to Needs Triage in SIG Apps Jul 4, 2024
@mittal-ishaan
Author

Hey,
Wanted to know if there is any update on this.

@ayushpatil2122
Contributor

/assign

@tengqm
Contributor

tengqm commented Oct 11, 2024

/sig apps

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@EronWright

EronWright commented Feb 7, 2025

I would appreciate some clarifying remarks about how the parallel policy relates to revision changes. I would guess that is what the term 'update' means here. As discussed in kubernetes/kubernetes#67250, the parallel policy seems to unblock a stuck rollout, meaning that it does affect updates. Meanwhile, according to kubernetes/kubernetes#96218, it should not affect updates. Then there's the MaxUnavailable flag to consider, since it is said to have an additional effect (though I haven't observed any relation to this issue).

I think the documentation should be changed to say more (not less) about the behavior of updates (revision changes).

EronWright added a commit to pulumi/pulumi-kubernetes-operator that referenced this issue Feb 10, 2025
### Proposed changes

This PR seeks to address this issue ([k8s: "Forced rollback"](https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/#forced-rollback)) that occurs when the workspace pod is in a crashloop:

> When using [Rolling Updates](https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/#rolling-updates) with the default [Pod Management Policy](https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/#pod-management-policies) (OrderedReady), it's possible to get into a broken state that requires manual intervention to repair.

The `parallel` policy seems to enable the statefulset controller to forcibly remove a pod when a new revision is available. The controller seems to obey the termination grace period, which is important, and I can't think of any other negatives. But there's a concern in the k8s community about this approach: kubernetes/website#47085

Note that a workspace consists of one replica and is rather like a singleton, with good behavior w.r.t. Pulumi state locking and compatibility with persistent volumes.

### Related issues (optional)


Closes #801