This repository has been archived by the owner on Nov 1, 2022. It is now read-only.

Validate manifests with API server dry run #2282

Closed
stefanprodan opened this issue Jul 22, 2019 · 14 comments

Comments

@stefanprodan
Member

stefanprodan commented Jul 22, 2019

Starting with Kubernetes 1.13 the API dry run is enabled by default. Flux could run kubectl apply --server-dry-run before trying to apply the manifest. We could log the validation errors in such a way that's easy to detect with a log parser like Fluentd/CloudWatch/Stackdriver/etc (#1340). We could also expose a Prometheus metric with the validation errors count (#2199).

To avoid "no matches for kind" errors on custom resources, the validation and apply should be done in stages:

  • extract all CRDs from the manifest
  • run server-dry-run on the CRDs
  • if the validation succeeds apply the CRDs
  • run server-dry-run on all manifests
  • if the validation succeeds apply all manifests
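The staged flow above can be sketched as a small shell helper. This is a hypothetical sketch, not Flux's actual implementation; it assumes the manifests have already been split into crds.yaml and rest.yaml, and it uses the pre-1.18 --server-dry-run flag (later renamed --dry-run=server):

```shell
# Hypothetical sketch of the staged validate/apply flow; not Flux code.
# Assumes crds.yaml and rest.yaml already exist in the working directory.
staged_apply() {
  kc="${KUBECTL:-kubectl}" # override point, e.g. for testing

  # Stage 1: dry-run the CRDs, and apply them only if validation passes.
  "$kc" apply --server-dry-run -f crds.yaml || return 1
  "$kc" apply -f crds.yaml || return 1

  # Stage 2: with the CRDs in place, the custom resources can be
  # recognised; dry-run the remaining manifests, then apply them.
  "$kc" apply --server-dry-run -f rest.yaml || return 1
  "$kc" apply -f rest.yaml
}
```

Any dry-run failure aborts the function before anything further is applied.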
@stefanprodan stefanprodan added enhancement blocked-needs-validation Issue is waiting to be validated before we can proceed labels Jul 22, 2019
@stefanprodan stefanprodan changed the title Validate manifest with API server dry run Validate manifests with API server dry run Jul 22, 2019
@stefanprodan
Member Author

@squaremo @hiddeco should we proceed with applying the manifests if the dry run fails?

@hiddeco
Member

hiddeco commented Jul 22, 2019

I am familiar with the --server-dry-run flag but not with the logic behind it. Is the server-side validation output guaranteed to be the same as (failing to) apply it?

@stefanprodan
Member Author

I think it behaves the same as apply:

Every stage runs as normal, except for the final storage stage. Admission controllers are run to check that the request is valid, mutating controllers mutate the request, merge is performed on PATCH, fields are defaulted, and schema validation occurs. The changes are not persisted to the underlying storage, but the final object which would have been persisted is still returned to the user, along with the normal status code. If the request would trigger an admission controller which would have side effects, the request will be failed rather than risk an unwanted side effect.

See the Kubernetes API documentation on dry-run.

@stefanprodan
Member Author

I think the server dry run should be opt-in via a Flux command flag. Not every validation controller supports it, e.g. open-policy-agent/gatekeeper#128

@hiddeco
Member

hiddeco commented Jul 22, 2019

If the request would trigger an admission controller which would have side effects, the request will be failed rather than risk an unwanted side effect.

This is a big ➕ compared to what we have now, and given that the dry run behaves the same as a real apply, it would not make sense to still try to apply resources that would fail.

The question remains whether we want to apply a partial set (by filtering out what makes it fail) or skip the whole apply. I am inclined to choose the latter, as we strive to maintain a valid state.

@stefanprodan
Member Author

I vote for skipping the apply altogether if the validation fails.

@stefanprodan
Member Author

stefanprodan commented Jul 22, 2019

Looks like we need a two-stage validation/apply procedure, since the custom resources will fail if the CRDs are not applied.

CRD + CR:

apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: tests.k8s.io
  annotations:
    helm.sh/resource-policy: keep
spec:
  group: k8s.io
  version: v1
  versions:
    - name: v1
      served: true
      storage: true
  names:
    plural: tests
    singular: test
    kind: Test
    categories:
      - all
  scope: Namespaced
---
apiVersion: k8s.io/v1
kind: Test
metadata:
  name: test
  namespace: test
spec:
  some: value

Dry run result:

kubectl apply --server-dry-run -f ./test.yaml
customresourcedefinition.apiextensions.k8s.io/tests.k8s.io created (server dry run)
error: unable to recognize "test.yaml": no matches for kind "Test" in version "k8s.io/v1"
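One way to get the two stages is to split the multi-document file up front. An illustrative shell/awk helper (my sketch, not Flux's actual implementation) that writes CRD documents to crds.yaml and everything else to rest.yaml:

```shell
# Illustrative helper: split a multi-document YAML file so CRDs can be
# dry-run and applied before the custom resources that depend on them.
split_crds() {
  rm -f crds.yaml rest.yaml
  awk '
    function flush() {
      if (doc == "") return
      # Route each document by whether it declares a CRD kind.
      out = (doc ~ /kind: CustomResourceDefinition/) ? "crds.yaml" : "rest.yaml"
      printf "---\n%s", doc >> out
      doc = ""
    }
    /^---[[:space:]]*$/ { flush(); next }  # document separator
    { doc = doc $0 "\n" }                  # accumulate current document
    END { flush() }
  ' "$1"
}
```

Note this is a crude textual match; a real implementation would parse the YAML and inspect the kind field properly.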

@kaspernissen
Contributor

We've encountered the problem of deployments failing because of validation errors a couple of times, and not having a good way to communicate back to the right person has been a bit problematic.

Logging this in an easily detectable way would be a great first step. Would it be possible to also consider a webhook option?

We have a service, release-manager, which is responsible for moving files around in git and for reporting progress back to developers. If we could configure Flux to trigger a webhook in our release-manager and have it communicate the problem directly to our developers via e.g. Slack, instead of having them inspect our log management tool, that would be pretty cool.

@hiddeco hiddeco removed the blocked-needs-validation Issue is waiting to be validated before we can proceed label Oct 17, 2019
@tobias-jenkner

In order to validate the content of a commit before pushing it to our GitOps master branch (e.g. in a pull request), I would find it very helpful to be able to call fluxd in a dry-run-only mode. Would that be possible as well?

@marshallford
Contributor

@tobias-jenkner, I'm in the same boat. I'd like to pass the dry run output (from a CLI command?) to kubeval in a CI pipeline.

@chaliy

chaliy commented Aug 24, 2020

What we thought could be a good idea is to have flux --dry-run, or even to plug in more validations for branches other than the source branch. That way it could be integrated into a git-flow process, like:

  • create branch feature/xxxx1
  • commit changes to the branch and have Flux validate them
  • ideally even integrate with GitHub status checks
  • then the team reviews the changes
  • then merge to the source branch, which gets synced to the target cluster

So basically we are also in the same boat.

@stefanprodan
Member Author

The API server dry-run was implemented in the GitOps toolkit and can be enabled with validation: server; see https://toolkit.fluxcd.io/components/kustomize/kustomization/
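For reference, an illustrative v1beta1 Kustomization using this setting (resource names and paths are placeholders):

```yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1beta1
kind: Kustomization
metadata:
  name: webapp
  namespace: flux-system
spec:
  interval: 5m
  path: ./deploy
  prune: true
  sourceRef:
    kind: GitRepository
    name: webapp
  validation: server  # validate with a server-side dry-run before applying
```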

@jamiezieziula

I don't see any way to enable this in the linked documentation; has this functionality changed?

@kingdonb
Member

@jamiedick Yes, it has changed since Flux 0.18 – the validation is a required part of Server-Side Apply, so it is enabled by default now and cannot be disabled anymore.

So the validation: server setting is implied, and although the field still exists in the spec, it is a vestigial param now and changing it to validation: client or validation: none does not have any effect.

The field was left in place to smooth upgrading; if you find it in the API docs it should say something to this effect:

$ kubectl explain kustomization.spec.validation
KIND:     Kustomization
VERSION:  kustomize.toolkit.fluxcd.io/v1beta2

FIELD:    validation <string>

DESCRIPTION:
     Deprecated: Not used in v1beta2.

8 participants