-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add building without in tree providers #1180
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,263 @@ | ||
--- | ||
title: Building Kubernetes Without In-Tree Cloud Providers | ||
authors: | ||
- "@BenTheElder" | ||
owning-sig: sig-cloud-provider | ||
participating-sigs: | ||
- sig-release | ||
- sig-testing | ||
reviewers: | ||
- "@spiffxp" | ||
- "@cheftako" | ||
- "@andrewsykim" | ||
- "@stephenaugustus" | ||
approvers: | ||
- "@cheftako" | ||
- "@andrewsykim" | ||
- "@spiffxp" | ||
- "@stephenaugustus" | ||
editor: TBD | ||
creation-date: 2019-07-29 | ||
last-updated: 2019-07-29 | ||
status: implementable | ||
see-also: | ||
- "/keps/sig-cloud-provider/20190125-removing-in-tree-providers.md" | ||
- "/keps/sig-cloud-provider/20180530-cloud-controller-manager.md" | ||
--- | ||
|
||
# Building Kubernetes Without In-Tree Cloud Providers | ||
|
||
## Table of Contents | ||
|
||
<!-- toc --> | ||
- [Release Signoff Checklist](#release-signoff-checklist) | ||
- [Summary](#summary) | ||
- [Motivation](#motivation) | ||
- [Goals](#goals) | ||
- [Non-Goals](#non-goals) | ||
- [Proposal](#proposal) | ||
- [User Stories [optional]](#user-stories-optional) | ||
- [Story 1](#story-1) | ||
- [Story 2](#story-2) | ||
- [Story 3](#story-3) | ||
- [Implementation Details/Notes/Constraints [optional]](#implementation-detailsnotesconstraints-optional) | ||
- [Risks and Mitigations](#risks-and-mitigations) | ||
- [Design Details](#design-details) | ||
- [Test Plan](#test-plan) | ||
- [Graduation Criteria](#graduation-criteria) | ||
- [Alpha -> Beta Graduation](#alpha---beta-graduation) | ||
- [Beta -> GA Graduation](#beta---ga-graduation) | ||
- [Upgrade / Downgrade Strategy](#upgrade--downgrade-strategy) | ||
- [Version Skew Strategy](#version-skew-strategy) | ||
- [Implementation History](#implementation-history) | ||
- [Drawbacks [optional]](#drawbacks-optional) | ||
- [Alternatives [optional]](#alternatives-optional) | ||
- [Infrastructure Needed [optional]](#infrastructure-needed-optional) | ||
<!-- /toc --> | ||
|
||
## Release Signoff Checklist | ||
|
||
- [x] kubernetes/enhancements issue in release milestone, which links to KEP (this should be a link to the KEP location in kubernetes/enhancements, not the initial KEP PR) | ||
- [x] KEP approvers have set the KEP status to `implementable` | ||
- [ ] Design details are appropriately documented | ||
- [x] Test plan is in place, giving consideration to SIG Architecture and SIG Testing input | ||
- [x] Graduation criteria is in place | ||
- [ ] "Implementation History" section is up-to-date for milestone | ||
- [x] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io] | ||
- [x] Supporting documentation e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes | ||
|
||
## Summary | ||
|
||
This proposal outlines a plan to enable building Kubernetes without the in-tree | ||
cloud providers in preparation for [removing them entirely](keps/sig-cloud-provider/20190125-removing-in-tree-providers.md). | ||
|
||
## Motivation | ||
|
||
The in tree cloud-provider implementations are being [removed](keps/sig-cloud-provider/20190125-removing-in-tree-providers.md) in the future, this involves a large amount | ||
of code that is used in many places in tree. In order to prepare for this eventuality | ||
it would be helpful to see what that removal entails exactly and verify that Kubernetes | ||
will continue to function correctly. Doing so is a bit tricky without ensuring | ||
that the in-tree provider code is not being used via some unexpected side-channel | ||
(such as `init()` methods). Building binaries without the in-tree cloud provider | ||
packages would allow us to verify this and additionally provide experimentally | ||
smaller / cheaper binaries for parties interested in out of tree provider or | ||
no provider based clusters. | ||
BenTheElder marked this conversation as resolved.
Show resolved
Hide resolved
BenTheElder marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
### Goals | ||
|
||
- Enable building Kubernetes without in-tree cloud providers and without forking | ||
- Enable testing out of tree providers with a simulation of the future removal of the in-tree code. | ||
- Enable experimentation with cloud-provider-less clusters | ||
|
||
### Non-Goals | ||
BenTheElder marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
- Building the out of tree providers | ||
- Changing the official Kubernetes release builds | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Are there any plans to support an "unofficial" release using this build tag? I'm assuming not in the immediate future but maybe later? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We'll likely need one as part of any CI using it, kube-up, kops etc. tend to rely on a release tarball being somewhere in cloud storage, we can handle this the same as the other CI based builds which are definitely not official releases but are regularly pushed to well known location(s). We have a number of these already including bazel based quick release, make based quick release, and the cross compile build, I think it would be pretty reasonable to add a "without in tree providers" CI build. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. So I would say yes, as a follow up to getting the build tag working we'll first want to figure out a nice way to invoke building something like a release with it, and then we'll want CI for it. I think this is sorta covered by the test plan but maybe could be more explicit 🤔 |
||
- Building the e2e tests | ||
- Decoupling cloud providers is a larger problem there and not necessary to test out-of-tree providers or build smaller binaries | ||
- Mechanisms for migrating to out of tree providers | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Should it be called out explicitly (besides mechanisms to migrate to out-of-tree providers) that maintaining API compatibility with existing in-tree providers (e.g. for in-tree volumes) is a non-goal in the context of this KEP and associated PRs as well? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Maybe? Inherently there can't / won't be the same cloud provider and what cloud providers do exist will depend on the CCM deployed when using this. This is not new to this KEP. For storage / in-tree volumes etc. anything broken in this build mode will eventually be broken when the in-tree cloud providers are removed and must be fixed regardless of this KEP. The intention behind this KEP is to help surface what those things are and verify that we've fixed them as we do that. Fixing them however is out of scope. That said, to implement it we do have to do a bare minimum of decoupling just to get a build that compiles, and that has surfaced some pending breakages already that are called out in the KEP (E.G. GCE node IPAM which we need to finish deprecating and remove). More specific to this point I have met with @davidz627 about CSI migration testing & this KEP and we do intend to make sure that we can test CSI migration specifically, possibly not in the first implementation PR though and possibly (?) orthogonally to the KEP. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thanks for clarifying that potential fixes for issues that arise at run-time (rather than at build-time) is out of scope. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Should also add -- I'm also interested in helping fix those runtime issues myself, but I don't think that belongs in the KEP 😅 |
||
- CSI Migration for in-tree Volumes is already underway in SIG Storage | ||
- External Credential Providers is being written / solved in another KEP ([#541](https://github.com/kubernetes/enhancements/issues/541)) | ||
- CCMs and overall scope for moving out of tree is in [removing-in-tree-providers](keps/sig-cloud-provider/20190125-removing-in-tree-providers.md) | ||
|
||
## Proposal | ||
|
||
We will add a [build constraints](https://golang.org/pkg/go/build/#hdr-Build_Constraints) | ||
to the cloud provider code for a pseudo "build tag" specifying not to include | ||
any in-tree provider code. This will allow compiling the binaries as normal today | ||
and simulating the removal of this code by specifying the tag at build time and | ||
triggering the build constraints on the files in these packages. | ||
|
||
Some small adjustments may be necessary to the code base to ensure that the | ||
other packages can build without depending on these packages. | ||
|
||
A prototype is available in [kubernetes/kubernetes#80353](https://github.com/kubernetes/kubernetes/pull/80353). | ||
|
||
To ensure that this continues to function we will add CI building in this mode, | ||
BenTheElder marked this conversation as resolved.
Show resolved
Hide resolved
|
||
and CI running end to end tests against it (see the test plan). | ||
|
||
### User Stories [optional] | ||
|
||
#### Story 1 | ||
|
||
As an out of tree cloud provider implementer, I want to develop and test against | ||
Kubernetes without the in tree providers. | ||
|
||
Kubernetes out of tree cloud provider developers will be able to build Kubernetes | ||
BenTheElder marked this conversation as resolved.
Show resolved
Hide resolved
|
||
in this mode and build & test their cloud-controller-manager implementations and | ||
associated tooling against this build in preparation for the actual hard removal | ||
of the in-tree providers. | ||
|
||
#### Story 2 | ||
|
||
As a developer working to replace an in-tree provider with an out-of-tree provider, | ||
I am attempting to validate that I work with KAS/KCM/Kubelet which do not have | ||
(my) in-tree cloud-provider compiled in and have successfully migrated all the | ||
functionality I need to CCM/CSI/... Using this build ensures the relevant | ||
functionality is not in KAS/KCM/Kubelet. It also allows me to work with | ||
smaller binaries. | ||
|
||
#### Story 3 | ||
|
||
As a [kind](https://github.com/kubernetes-sigs/kind) developer / user I want to | ||
use Kubernetes binaries without cloud providers for local clusters. | ||
|
||
Developers and users will be able to build local clusters leveraging this mode | ||
to not pay for cloud providers they are unable to use. | ||
|
||
BenTheElder marked this conversation as resolved.
Show resolved
Hide resolved
|
||
### Implementation Details/Notes/Constraints [optional] | ||
BenTheElder marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
This is implemented using a synthetic `nolegacyproviders` tag in go build | ||
constraints on the relevant sources. If `GOFLAGS=-tags=nolegacyproviders` is set | ||
then the legacy cloud provider pacakges will be excluded from the build. | ||
|
||
In order to make this work the following additional changes are made: | ||
|
||
- Packages that we fully exclude (the legacy provider packages) _must_ contain | ||
a `doc.go` file or any other file that does NOT contain any code or build | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Worth specifying standard verbiage for the doc.go, at least if its in K/K? Eg. a comment to the effect that this package is currently only used by cloud provider specific code. It will be removed from K/K as part of the cloud provider extraction effort. Do we have such packages which are not part of K/K? If so do we have mechanism for putting a doc.go in them? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
SGTM
That is a really good question, currently we do not if we're only considering things that import the in-tree legacy provider packages. Ideally we should not import packages that import these in the future... There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Just a note that I don't see any eg: https://github.com/kubernetes/kubernetes/pull/80353/files#diff-fe522781fb7f1bf08a17a523c264ffb6 no verbiage at the moment There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yeah, I'm not sure what verbiage we want, but it also seems very into the weeds on details that might best be reviewed in the implementation. Currently the implementation PR merely adds stub doc.go files containing only the license boilerplate and the Many of our packages contain them with anything from this sort of stub to actual godoc comments. |
||
constraints. Go will not allow "building" a package without any files passing | ||
the constraints, however it will happily build a package with no actual methods | ||
/ variables / ... | ||
|
||
- A few locations in the code do not properly use the cloud provider interface | ||
(instead, importing the cloud provider packages directly), | ||
some of these must be updated with both a "with provider" version and a | ||
"without provider" version broken out of the existing code. In particular this | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Does this cover the moving of the ProbeVolumePlugin calls to cmd/kube-controller-manager/app/plugins_legacyproviders.go? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. yes, the next line mentions in-tree volumes |
||
includes the in-tree volumes until CSI migration is standard, and the GCE IPAM | ||
logic in kube-controller-manager. | ||
BenTheElder marked this conversation as resolved.
Show resolved
Hide resolved
|
||
- Note that the nodeIpamController GCE IPAM logic is slated for removal (see [the cloud controller manager KEP](https://github.com/kubernetes/enhancements/blob/master/keps/sig-cloud-provider/20180530-cloud-controller-manager.md)) | ||
|
||
In particular this adds tags / constraints to: | ||
- `staging/src/k8s.io/legacy-cloud-providers/*` (constraints on all the providers) | ||
- `pkg/cloudprovider` (constraints on importing the providers) | ||
- `pkg/volumes/*`, `cmd/kubelet` (versions with and without the imported providers for in-tree volumes) | ||
- `pkg/controller/nodeipam`, `cmd/kube-apiserver`, `cmd/kube-controller-manager` (with and without GCE IPAM) | ||
|
||
`test/*` is punted to a future follow up, and credential providers are punted | ||
to [the external credential provider KEP](https://github.com/kubernetes/enhancements/pull/1137). | ||
|
||
### Risks and Mitigations | ||
|
||
This is only developer facing, however we will need to ensure that these tags | ||
stay up to date if we want this build mode to continue to work (the normal | ||
build mode should work by default without any additional maintenance). | ||
|
||
To ensure this continues to work we can mitigate by: | ||
|
||
- verify in CI that the cloud provider packages have boilerplate including the | ||
build constraints | ||
- build in this mode in CI to ensure that the build succeeds | ||
|
||
## Design Details | ||
|
||
### Test Plan | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm fine with the test plan for now. Would like to see us refine this with links to the relevant jobs/dashboards as we get them setup There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Yeah, those will be pending the initial implementation, it doesn't make much sense to write CI jobs / dashboards for a build mode that doesn't exist yet. I also expect in the future to be supporting various groups implementing their own CI making use of this build mode (EG cloud-provider-foo repo CI), but I'm not sure that belongs here exactly other than the user stories. It remains to be seen which foos take this up. |
||
|
||
We will add CI to ensure that we can build with this mode enabled. | ||
|
||
Additionally, we can add CI to ensure that clusters can actually be started in | ||
this mode. | ||
|
||
Initially, [kind](https://github.com/kubernetes-sigs/kind) can be used to ensure | ||
that Kubernetes works without the providers, in the future we can extend this | ||
CI to out-of-tree providers combined with this build mode as their CI is spun up. | ||
BenTheElder marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
### Graduation Criteria | ||
|
||
##### Alpha -> Beta Graduation | ||
|
||
Likely unnecessary, as we will eventually remove the in-tree provider code entirely for [removing-in-tree-providers](keps/sig-cloud-provider/20190125-removing-in-tree-providers.md). | ||
BenTheElder marked this conversation as resolved.
Show resolved
Hide resolved
|
||
This is also not a user facing change. | ||
|
||
##### Beta -> GA Graduation | ||
|
||
Likely unnecessary, as we will eventually remove the in-tree provider code entirely for [removing-in-tree-providers](keps/sig-cloud-provider/20190125-removing-in-tree-providers.md). | ||
BenTheElder marked this conversation as resolved.
Show resolved
Hide resolved
|
||
This is also not a user facing change. | ||
|
||
Final graduation can be considered to be when the cloud provider code is actually | ||
removed from the Kubernetes source tree, at which point this work will be complete. | ||
|
||
### Upgrade / Downgrade Strategy | ||
|
||
N/A ? | ||
|
||
### Version Skew Strategy | ||
|
||
N/A ? | ||
|
||
## Implementation History | ||
|
||
- original prototype [kubernetes/kubernetes#80353](https://github.com/kubernetes/kubernetes/pull/80353) | ||
- original KEP PR [kubernetes/enhancements#1180](https://github.com/kubernetes/enhancements/pull/1180) | ||
|
||
## Drawbacks [optional] | ||
BenTheElder marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
This does require maintaining these tags / constraints for the providerless build, | ||
however in the default mode without our pseudo-tag the code will build as today | ||
and require zero additional maintenance to function. As in-tree providers are | ||
relatively stable and expected not to gain new features, this should require | ||
minimal effort and can be automated to a limited extent. | ||
|
||
## Alternatives [optional] | ||
|
||
We could simply wait for the in-tree providers to be removed entirely, however | ||
this may not provide sufficient tools to adequately prepare. | ||
BenTheElder marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
There is also a risk that cloud providers would each need to duplicate this | ||
work to test cloud-provider free Kubernetes for their out of tree provider. | ||
|
||
We could attempt to create a branch/PR with those changes in them. | ||
However the in-tree providers are not guaranteed to exit at the same time. | ||
So the branch/PR might have to be kept for a long period of time. | ||
In addition to being expensive the maintain such a PR/branch, it would obfuscate | ||
the effort. So developers would end up changing CP related code and have little | ||
/ no visibility that their changes were CP related. | ||
|
||
## Infrastructure Needed [optional] | ||
|
||
None? | ||
|
||
[kubernetes.io]: https://kubernetes.io/ | ||
[kubernetes/enhancements]: https://github.com/kubernetes/enhancements/issues | ||
[kubernetes/kubernetes]: https://github.com/kubernetes/kubernetes | ||
[kubernetes/website]: https://github.com/kubernetes/website | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like we should have at least one approver from cloud-provider and one approver from each of testing and release. We should also make sure they are not from the same company.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can be an approver on behalf of sig-testing if it'll help
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
happy to add / remove as necessary, added the current reviewers to approvers and @spiffxp to both which now cover:
@andrewsykim - (Cloud Provider, VMware)
@cheftako - (Cloud Provider, Google)
@spiffxp - (Testing, Google)
@justaugustus - (Release, VMWare)