
add KEP for pod ready++ #1991

Merged — 1 commit merged into kubernetes:master on Apr 11, 2018
Conversation

@freehan (Contributor) commented Mar 30, 2018

KEP for pod ready++

@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Mar 30, 2018
@k8s-github-robot k8s-github-robot added sig/architecture Categorizes an issue or PR as relevant to SIG Architecture. sig/network Categorizes an issue or PR as relevant to SIG Network. labels Mar 30, 2018
@thockin thockin self-assigned this Mar 30, 2018
@jbeda (Contributor) commented Apr 1, 2018

I encourage you to get this PR submitted once @thockin and/or @dchen1107 approve of the scope. Err on the side of smaller targeted PRs that take on a single topic. Once things have settled create a PR to move it from "provisional" to "implementable". That provides a "last call" to make sure that everyone is on the same page. The goal is to avoid 100+ comment PRs that tackle all topics at once.


### Implementation Details/Notes/Constraints

This proposal mostly involves kubelet changes:
Member:

I would detail that the ready condition must continue to be the FINAL signal for controllers to proceed (for compat) which complicates the design overall.

MAYBE also add a short blurb about all-containers-ready (e.g. for endpoints) vs pod-ready? Maybe that belongs in the doc (but sharing docs with community sucks)


## Proposal

[K8s Proposal: Pod Ready++](https://docs.google.com/document/d/1VFZbc_IqPf_Msd-jul7LKTmGjvQ5qRldYOFV0lGqxf8/edit#)
Member:

Is it worth converting that to markdown to be pasted inline here? What is the precedent @calebamiles ?

Member:

I'd vote for adding it here at least once the outstanding comments on the doc have been resolved - I think there's value in consistent use of git source control for tracking changes.

Member:

+1 for markdown here in git


## Table of Contents

A table of contents is helpful for quickly jumping to sections of a KEP and for highlighting any additional information provided beyond the standard KEP template.
Member:

Probably can drop the "Why we need a table of contents" text here :)




### Goals

- Allow extra signals for pod readiness.
Member:

Perhaps there are a set of signals that we should implement as part of completing this KEP and as part of proving the API?

@freehan (Contributor Author) replied Apr 5, 2018:

I added a user adoption item to the graduation criteria. I would not tie this proposal to a specific feature for now.


## Summary

This proposal aims to add extensibility to pod readiness. Besides container readiness, external feedback can be injected into PodStatus and influence pod readiness. Thus, achieving pod “ready++”.
Member:

I think a better summary is to recognize that what will really happen is to split the concept of "pod ready" into two pieces, for two different sets of consumers. The current text focuses one one set of consumers, the "workload controllers" that want "pod ready ++". The other set of consumers is the load balancers, network policy implementors, and so on that contribute to the "pod ready ++" based on "plain old pod ready". Remember, a load balancer (for example) has to react to a pod before all the load balancers and whatever have reacted to the pod. Following is a suggested rewording.

This proposal splits the concept of pod readiness into two concepts, for two distinct sets of consumers. One set of consumers is the workload controllers and other clients that simply want to know when a pod is fully ready for use. They care about a concept of readiness that we may call "ready++" and is stricter than the concept that is supported today; ready++ also indicates that all the relevant load balancers, traffic filters, and whatever have been adjusted to account for the pod. The other set of consumers is those very load balancers, traffic filters, and whatever. They need to continue to react to today's looser concept of readiness, while contributing to the stricter concept. For example, a load balancer needs to put a new pod into its backend set without waiting for ready++, while also contributing its own bit into ready++.

@MikeSpreitzer (Member):

This proposal should explicitly explain how load balancers, NetworkPolicy implementations, and so on will continue to sense "plain old pod ready".

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Apr 5, 2018
@freehan (Contributor Author) commented Apr 5, 2018

Ported the proposal from google doc into the PR. PTAL

Workloads that take pod readiness as a critical signal for their decision making will automatically comply with this proposal without any change. The majority, if not all, of workloads satisfy this condition.

##### Kubelet
- Use PATCH instead of PUT to update PodStatus fields that are dictated by kubelet.


What is the advantage of using PATCH instead of PUT here?

@freehan (Contributor Author):

Avoid conflicts. Similar to node status.
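As an illustrative aside (editor's sketch, not part of the KEP): a merge patch carries only the fields kubelet owns, so a concurrent writer of some other status field is not clobbered, which is roughly why PATCH avoids the conflicts a whole-object PUT hits:

```python
# Minimal JSON merge-patch (RFC 7386) applier, for illustration only.
def merge(obj, patch):
    for key, value in patch.items():
        if isinstance(value, dict) and isinstance(obj.get(key), dict):
            merge(obj[key], value)   # recurse into nested objects
        elif value is None:
            obj.pop(key, None)       # null deletes a field
        else:
            obj[key] = value
    return obj

# Hypothetical current status, with "message" written by another client.
current = {"status": {"phase": "Running", "message": "set-by-other-client"}}
# Kubelet's patch touches only the field it owns.
kubelet_patch = {"status": {"phase": "Running"}}

merged = merge(current, kubelet_patch)
# The other client's "message" survives; a whole-object PUT would have
# replaced it (or failed on a resourceVersion conflict).
```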

```yaml
- lastProbeTime: null
  lastTransitionTime: 2018-01-01T00:00:00Z
  status: "False"
  type: www.example.com/feature-1
```


Is it documented anywhere how to create custom Conditions?

@freehan (Contributor Author):

Patch is the way to go.


Pod readiness indicates whether the pod is ready to serve traffic. It is dictated by kubelet based on the user-specified readiness probe, and it determines whether the pod's address shows up in the address list of the related Endpoints object. K8s primitives that manage pods, such as Deployment, take only pod status into account for decision making, such as advancement during a rolling update.

For example, during a Deployment rolling update, a new pod becomes ready. Meanwhile, the service, network policy and load balancer may not yet be ready for the new pod for whatever reason (e.g. slowness in API machinery, the endpoints controller, kube-proxy, iptables or infrastructure programming). This may cause service disruption or loss of backend capacity. In the extreme case, if the rolling update completes before any new replacement pod actually starts serving traffic, this will cause a service outage.


How does this cause loss of backend capacity?

@freehan (Contributor Author):

Pod ready does not mean the pod is serving traffic; hence the loss of backend capacity.


@freehan I mean, how is this part better than without this feature? Overall I agree; I just want to understand the reasoning here.

Another pod condition `ContainerReady` will be introduced to capture the old pod `Ready` condition.
Member:

ContainersReady (plural) I think

@freehan (Contributor Author):

But there may be only one container in the pod :P

A custom pod condition can be injected through a PATCH action using a KubeClient. Please note that `kubectl patch` does not support patching object status; use client-go or another KubeClient implementation instead.
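As a hedged sketch (editor's illustration; the namespace, pod name, and condition type here are hypothetical), the status-subresource PATCH such a client would send might look like:

```python
import json

namespace, pod = "default", "my-pod"
# Pod status is updated via the status subresource endpoint.
url = f"/api/v1/namespaces/{namespace}/pods/{pod}/status"

# Strategic merge patch body; pod conditions merge on their "type" key,
# so only the named condition is added or replaced.
patch = {
    "status": {
        "conditions": [{
            "type": "www.example.com/feature-1",
            "status": "True",
            "lastTransitionTime": "2018-01-01T00:00:00Z",
        }]
    }
}
body = json.dumps(patch)
# A client such as client-go would send this with
# Content-Type: application/strategic-merge-patch+json.
```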

Naming Convention:
The type of custom pod condition must comply with k8s label key format. For example, “www.example.com/feature-1”.
Member:

Will we allow "naked" names or do we require the prefix?
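For illustration (editor's sketch; the exact validation rules are an assumption based on the general k8s label-key format, not this KEP): a prefixed key has an optional DNS-subdomain prefix before the slash and a short alphanumeric name after it, while a "naked" name is just the part after the slash.

```python
import re

# Rough approximation of the k8s label-key shape, for illustration only.
NAME = r"[A-Za-z0-9]([A-Za-z0-9._-]{0,61}[A-Za-z0-9])?"   # <= 63 chars
PREFIX = r"[a-z0-9]([a-z0-9.-]*[a-z0-9])?"                # DNS-subdomain-ish
KEY = re.compile(rf"^({PREFIX}/)?{NAME}$")

KEY.match("www.example.com/feature-1")  # matches (prefixed)
KEY.match("feature-1")                  # matches ("naked")
```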

@thockin (Member) commented Apr 11, 2018

May be worth documenting why we don't want to allow updates.

@thockin (Member) commented Apr 11, 2018

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 11, 2018
@k8s-ci-robot (Contributor):

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: thockin

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 11, 2018
@k8s-ci-robot k8s-ci-robot merged commit 28549b7 into kubernetes:master Apr 11, 2018

Another pod condition `ContainerReady` will be introduced to capture the old pod `Ready` condition.
```
ContainerReady is true == containers are ready
```
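As a hedged sketch of the semantics under discussion (editor's illustration; the condition names are taken from this thread and are not normative), the pod `Ready` condition would become the conjunction of `ContainersReady` and every extra readiness condition the pod requires:

```python
# Illustrative only: Ready = ContainersReady AND all required extra conditions.
def pod_ready(conditions, required_types):
    by_type = {c["type"]: c["status"] for c in conditions}
    if by_type.get("ContainersReady") != "True":
        return False
    return all(by_type.get(t) == "True" for t in required_types)

conditions = [
    {"type": "ContainersReady", "status": "True"},
    {"type": "www.example.com/feature-1", "status": "False"},
]
# Not Ready until the external controller flips feature-1 to "True".
pod_ready(conditions, ["www.example.com/feature-1"])
```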


Will this replace the old Ready condition, or be in addition to it? It has to be in addition; otherwise this is a breaking change.

calebamiles pushed a commit to calebamiles/community that referenced this pull request Sep 5, 2018