
Mega Issue: Node Disruption Lifecycle Taints #624

Open
1 of 5 tasks
njtran opened this issue Oct 20, 2023 · 9 comments
Labels
deprovisioning (Issues related to node deprovisioning) · kind/feature (Categorizes issue or PR as related to a new feature) · v1 (Issues requiring resolution by the v1 milestone)

Comments

njtran (Contributor) commented Oct 20, 2023

Description

What problem are you trying to solve?
Karpenter has historically driven node disruption through annotations and processes maintained in memory.

Karpenter should instead drive disruption through its own taint mechanism(s) as it discovers and executes disruption actions.

This issue proposes that each node owned by Karpenter will be in one of four states (see the sketch after this list):

  1. Not Disrupting (No Taints) - Karpenter doesn't want to disrupt this node, and neither does the user.
  2. Candidate (PreferNoSchedule Taint) - Karpenter identifies the node as a possible target for any of its programmatic disruption mechanisms: expiration, drift, or consolidation. A node that's chosen as a candidate can always be removed from candidacy.
  3. Disrupting (NoSchedule Taint) - Karpenter has validated and executed the disruption action for the node, and has begun the standard disruption flow.
    Karpenter can fail to disrupt a node. If it does, the node goes back to Not Disrupting, where it may be picked up as a Candidate again later.
  4. Terminating (NoExecute Taint) - Karpenter has deleted the node, triggering the finalization logic, where the last of the pods (e.g. DaemonSets) need to be evicted before terminating the underlying instance and then removing the node.
    Once a node has begun terminating, there's no turning back. Karpenter will eventually terminate it.
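A minimal sketch of how these states could map onto Kubernetes taints. The key `karpenter.sh/disruption` and the values below are assumptions for illustration only; the real implementation may pick different keys, values, or structure:

```go
package disruption

import corev1 "k8s.io/api/core/v1"

// Hypothetical taints, one per proposed state (keys/values are assumptions).
var (
	// State 2: Candidate. A soft taint: the scheduler prefers other nodes
	// for new pods but can still use this one if needed.
	CandidateTaint = corev1.Taint{
		Key:    "karpenter.sh/disruption",
		Value:  "candidate",
		Effect: corev1.TaintEffectPreferNoSchedule,
	}

	// State 3: Disrupting. A hard taint: no new pods are scheduled while
	// Karpenter validates and executes the disruption action.
	DisruptingTaint = corev1.Taint{
		Key:    "karpenter.sh/disruption",
		Value:  "disrupting",
		Effect: corev1.TaintEffectNoSchedule,
	}

	// State 4: Terminating. Evicts remaining pods that lack a matching
	// toleration while the node is finalized and the instance deleted.
	TerminatingTaint = corev1.Taint{
		Key:    "karpenter.sh/disruption",
		Value:  "terminating",
		Effect: corev1.TaintEffectNoExecute,
	}
)

// State 1 (Not Disrupting) is simply the absence of these taints.
```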

Related Issues:

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment
Legion2 commented Dec 5, 2023

I really like the idea of this issue. This would fix the spread-out behavior of the default scheduler when new pods are continuously added but there is plenty of idle capacity. With the described behavior, Karpenter would taint some of the nodes with PreferNoSchedule, causing the scheduler to bin-pack the new pods onto the remaining nodes instead of distributing them across all underutilized nodes.
I hope there will be policies or configuration in place that allow Karpenter to identify nodes as disruption candidates even when they are still running some small jobs which cannot be evicted.
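For context on why a PreferNoSchedule candidate taint steers pods away without hard-blocking them: the scheduler only scores a node lower for pods that do not tolerate the soft taint, so a workload that is fine landing on a candidate node could carry a matching toleration and be scheduled there normally. A sketch, reusing the hypothetical taint key from the earlier example:

```go
package disruption

import corev1 "k8s.io/api/core/v1"

// Hypothetical: a toleration that lets a workload (e.g. a short batch job)
// schedule onto candidate nodes without the scheduler penalizing those
// nodes in scoring. The key/value must match whatever taint Karpenter
// actually applies.
var tolerateCandidate = corev1.Toleration{
	Key:      "karpenter.sh/disruption",
	Operator: corev1.TolerationOpEqual,
	Value:    "candidate",
	Effect:   corev1.TaintEffectPreferNoSchedule,
}
```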

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-ci-robot added the lifecycle/stale label on Mar 4, 2024
njtran removed the lifecycle/stale label on Mar 12, 2024
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-ci-robot added the lifecycle/stale label on Jun 10, 2024
jmdeal (Member) commented Jun 10, 2024

/remove-lifecycle stale

k8s-ci-robot removed the lifecycle/stale label on Jun 10, 2024
Nuru commented Jul 1, 2024

Please be sure to handle the use case where a Pod running on a Node adds a "do-not-evict" annotation while it is running. Of course there will be an unavoidable race condition, but it is important to realize that just because the Node is tainted, it does not mean that annotated Pods will not appear on the Node.

It would be good for my use case if there were a way for a Pod to get notified that Karpenter is considering consolidating the node (NoSchedule Taint added) so it can immediately decide to either quit or annotate itself, which would give the Pod a head start in the race and avoid most if not all real-world mishaps.

One way to do this would be via another annotation, such as ok-to-disrupt or prefer-to-disrupt or something, that tells Karpenter to send the pod some signal other than SIGTERM that the Pod can respond to (and by default would ignore) when Karpenter considers the Node a likely consolidation target. This would have to happen after the Node is tainted, so that when the Pod quits and is immediately replaced with a new Pod by the Deployment, the new Pod does not get scheduled onto the same Node. We would also want a configurable delay between the taint and notification in step 3 and the actual termination in step 4, so we can be sure to give the Pod enough time to respond and block termination.
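One way a workload could get that head start today, without a new signal mechanism, is a small watcher (a sidecar or an in-process goroutine) that checks its own Node for the disruption taint and then triggers the application's graceful-shutdown or self-annotation logic. A rough sketch only, assuming in-cluster credentials, a NODE_NAME env var injected via the downward API, RBAC to get Nodes, and the hypothetical karpenter.sh/disruption taint key used in the earlier examples:

```go
package main

import (
	"context"
	"fmt"
	"os"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)

	// NODE_NAME is assumed to be injected via the downward API
	// (fieldRef: spec.nodeName) in the pod spec.
	nodeName := os.Getenv("NODE_NAME")

	for {
		node, err := client.CoreV1().Nodes().Get(context.TODO(), nodeName, metav1.GetOptions{})
		if err == nil {
			for _, t := range node.Spec.Taints {
				// Hypothetical taint key/value; adjust to whatever Karpenter ships.
				if t.Key == "karpenter.sh/disruption" && t.Value == "disrupting" {
					fmt.Println("node is being disrupted; starting graceful shutdown")
					// Here the pod could exit, annotate itself, or flip a readiness gate.
					os.Exit(0)
				}
			}
		}
		time.Sleep(10 * time.Second)
	}
}
```

A production version would more likely use a watch or informer rather than polling, and the key/value checked must track whatever taint Karpenter actually applies.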

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-ci-robot added the lifecycle/rotten label and removed the lifecycle/stale label on Oct 29, 2024
Nuru commented Oct 30, 2024

/remove-lifecycle rotten

k8s-ci-robot removed the lifecycle/rotten label on Oct 30, 2024
@riyas-rawther

/remove-lifecycle rotten
