
Cordon Drifted nodes before processing evictions #623

Open
sidewinder12s opened this issue Feb 28, 2023 · 15 comments
Labels
  • deprovisioning: Issues related to node deprovisioning
  • kind/feature: Categorizes issue or PR as related to a new feature.
  • lifecycle/frozen: Indicates that an issue or PR should not be auto-closed due to staleness.
  • v1.x: Issues prioritized for post-1.0

Comments

@sidewinder12s

Tell us about your request

We currently use a tool that rolls all nodes in an ASG when the launch template configuration changes, usually in response to AMI upgrades.

One of the features of this tool is that, when it detects this change, it cordons all nodes in the ASG before it starts replacing them. This ensures that workloads evicted from one node do not reschedule onto another node that is about to be killed as well.

Can Karpenter implement this for its Drift calculations?
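
For illustration, here is a minimal client-go sketch of the behavior being requested: mark every node in the batch unschedulable (the equivalent of `kubectl cordon`) before any eviction starts. The node names, and the idea that drift detection has already produced that list, are assumptions for the example; this is not Karpenter's actual implementation.

```go
// A minimal sketch (not Karpenter's implementation) of the requested behavior:
// cordon every node in a batch before any eviction starts, so evicted pods
// cannot land on a node that is itself about to be replaced.
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/types"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

// cordonAll marks each named node unschedulable, the same effect as
// `kubectl cordon <node>`.
func cordonAll(ctx context.Context, client kubernetes.Interface, nodeNames []string) error {
	patch := []byte(`{"spec":{"unschedulable":true}}`)
	for _, name := range nodeNames {
		if _, err := client.CoreV1().Nodes().Patch(ctx, name, types.MergePatchType, patch, metav1.PatchOptions{}); err != nil {
			return fmt.Errorf("cordoning %s: %w", name, err)
		}
	}
	return nil
}

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)

	// Hypothetical list of nodes that drift detection has flagged for replacement.
	drifted := []string{"node-a", "node-b", "node-c"}

	if err := cordonAll(context.TODO(), client, drifted); err != nil {
		panic(err)
	}
	// Only after the whole batch is cordoned would draining/eviction begin.
}
```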

Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?

We have quite a few services that do not handle restarts well (especially rapid restarts), and we would like to minimize how many times they get restarted.

Are you currently working around this issue?

No; there is no feature like this in Karpenter.

Additional Context

No response

Attachments

No response

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment
@sidewinder12s sidewinder12s added the kind/feature Categorizes issue or PR as related to a new feature. label Feb 28, 2023
@sidewinder12s sidewinder12s changed the title from "Cordon Drifted nodes before shutdown" to "Cordon Drifted nodes before processing evictions" Feb 28, 2023
@njtran
Contributor

njtran commented Mar 2, 2023

Hey @sidewinder12s, this is similar to #622. Would this help in the expiration case as well?

@sidewinder12s
Author

I think so. I'd generally consider this ask to be: if Karpenter knows it is removing a batch of nodes, it should cordon the batch before taking action, to ensure we don't reschedule workloads multiple times while it is performing the change.

@sidewinder12s
Author

Isn't this already the case?

https://github.com/aws/karpenter/blob/39b48fecef170ee725a12031a20cf72da09de860/designs/node-upgrades.md?plain=1#L26-L28

My read of that section is that it is not clear whether Karpenter will cordon all drifted nodes before processing, or only cordon each node as part of a cordon-and-drain action.

@njtran
Contributor

njtran commented Mar 2, 2023

That makes sense. This was one of the first considerations we had in the deprovisioning logic a while ago. The idea was not to cordon the nodes, so the capacity could still be utilized in case the nodes couldn't be deprovisioned for an unknown length of time (e.g. do-not-evict pods, blocking PDBs, inability to create a replacement node). If a node could not be deprovisioned, keeping around capacity that couldn't be scheduled to could incur some unwanted cost.

All in all, I'm open to making this configurable. Can you share a bit about your requirements on expiry and drift for nodes? How important do you consider deprovisioning to be for a node that is expired or drifted?

@sidewinder12s
Author

The one behavior we'd had issues with in a custom batch autoscaler was that our preferred use of the do-not-evict annotation was to protect singleton pods that were in the middle of processing work; the pod would eventually finish on its own. If we do not cordon the node, it is likely the node will never empty out so that it can be replaced, because new work will keep coming in that can use it.

One of the worst uses was applying it to workloads that cannot handle disruption at all, because it means we must take manual action to perform cluster maintenance.

I think in general we're OK letting things block actions, but we'd want something like MachineDisruptionGate to eventually force the controller to take action and bring the cluster back into conformance with its config. We run many large clusters with many different users, and as we scale we're finding it difficult to coordinate action if we give users too many toggles to block maintenance; it basically forces us toward something like RDS maintenance windows, where we tell everyone we'll do work during these hours if their workload cannot handle the normal action of the system. Setting a TTL on nodes as policy is another action we're taking to stop letting the cluster keep nodes forever (so, through policy and action, making it explicit to customers that they will need to deal with disruption).
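
For context, a rough sketch of the do-not-evict pattern described above: the annotation protects a singleton batch pod while it runs, but without a cordon the node may never drain because new pods keep landing on it. The pod name, namespace, and image are placeholders; the annotation key is Karpenter's pre-v1 annotation (renamed `karpenter.sh/do-not-disrupt` in later versions).

```go
// Sketch of the pattern described above: a singleton batch pod carries the
// do-not-evict annotation so it is not evicted mid-run. Without cordoning the
// node, new pods can still schedule onto it, so the node may never drain.
// The pod name, namespace, and image are placeholders.
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func singletonBatchPod() *corev1.Pod {
	return &corev1.Pod{
		ObjectMeta: metav1.ObjectMeta{
			Name:      "batch-worker",
			Namespace: "jobs",
			Annotations: map[string]string{
				// Tells Karpenter not to voluntarily evict this pod while it runs
				// (renamed karpenter.sh/do-not-disrupt in Karpenter v1).
				"karpenter.sh/do-not-evict": "true",
			},
		},
		Spec: corev1.PodSpec{
			RestartPolicy: corev1.RestartPolicyNever,
			Containers: []corev1.Container{{
				Name:  "worker",
				Image: "example.com/batch-worker:latest",
			}},
		},
	}
}

func main() {
	pod := singletonBatchPod()
	fmt.Printf("%s/%s annotations: %v\n", pod.Namespace, pod.Name, pod.Annotations)
}
```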

@ellistarn ellistarn added the v1 Issues requiring resolution by the v1 milestone label Mar 25, 2023
@njtran njtran added v1.x Issues prioritized for post-1.0 and removed v1 Issues requiring resolution by the v1 milestone labels Sep 6, 2023
@njtran njtran added the deprovisioning Issues related to node deprovisioning label Sep 20, 2023
@njtran njtran transferred this issue from aws/karpenter-provider-aws Oct 20, 2023
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 30, 2024
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Feb 29, 2024
@jukie

jukie commented Mar 3, 2024

/remove-lifecycle rotten

@k8s-ci-robot k8s-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Mar 3, 2024
@jmdeal
Member

jmdeal commented Mar 22, 2024

/assign

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 20, 2024
@jmdeal
Member

jmdeal commented Jun 21, 2024

#1314 will address this issue, albeit through a slightly different mechanism. Rather than cordoning the nodes with a NoSchedule taint, Karpenter will apply a PreferNoSchedule taint to voluntary disruption candidates, and it will apply an in-memory NoSchedule taint to drifted nodes. We don't apply this in-memory taint to underutilized nodes, since additional pod scheduling is the signal we use to cancel a consolidation decision. The in-memory taint combined with the PreferNoSchedule taint ensures that Karpenter will pre-spin new capacity for drifted nodes, rather than evicting and rescheduling pods onto other drifted nodes.
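
For reference, a rough sketch of what the PreferNoSchedule half of that mechanism could look like when written against the API server. The taint key and value here are illustrative rather than Karpenter's actual key, and the in-memory NoSchedule taint described above lives only inside Karpenter's scheduling simulation, not on the node object.

```go
// Rough sketch of tainting a disruption candidate with PreferNoSchedule: the
// scheduler prefers other nodes but can still fall back to this one, so the
// capacity is not stranded if the disruption stalls. The taint key/value are
// illustrative, not Karpenter's actual key.
package main

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

func taintCandidate(ctx context.Context, client kubernetes.Interface, nodeName string) error {
	node, err := client.CoreV1().Nodes().Get(ctx, nodeName, metav1.GetOptions{})
	if err != nil {
		return err
	}
	// A real controller would check for an existing taint before appending.
	node.Spec.Taints = append(node.Spec.Taints, corev1.Taint{
		Key:    "example.com/disruption-candidate", // illustrative key
		Value:  "drifted",
		Effect: corev1.TaintEffectPreferNoSchedule,
	})
	_, err = client.CoreV1().Nodes().Update(ctx, node, metav1.UpdateOptions{})
	return err
}

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)
	if err := taintCandidate(context.TODO(), client, "node-a"); err != nil {
		panic(err)
	}
}
```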

@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jul 21, 2024
@jmdeal
Member

jmdeal commented Jul 22, 2024

/remove-lifecycle stale
/lifecycle frozen

Going to freeze this issue. #1314 should address it, but there is a race condition that needs to be fixed first. Hopefully I'll be able to prioritize this in the next few weeks.

@k8s-ci-robot k8s-ci-robot added lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. and removed lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. labels Jul 22, 2024
@andyspiers

#1314 has now been closed without merging, so where does that leave this issue?
