Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KEP-3937: Declarative Validation #3938

Closed
wants to merge 11 commits into from

Conversation

jpbetz
Copy link
Contributor

@jpbetz jpbetz commented Apr 3, 2023

  • One-line PR description: Declaratively validate Kubernetes built-in types and publish the validation rules in OpenAPI.
  • Other comments: I realize this is ambitious, but recent work in CEL and OpenAPI in recent releases has made this significantly more tractable. Please see the working prototype.

@k8s-ci-robot
Copy link
Contributor

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Apr 3, 2023
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jpbetz

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. labels Apr 3, 2023
@k8s-ci-robot k8s-ci-robot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Apr 3, 2023
@jpbetz jpbetz mentioned this pull request Apr 3, 2023
4 tasks
@jpbetz jpbetz force-pushed the declarative-validation branch from 99fa271 to 7b5f810 Compare April 4, 2023 02:32
@k8s-ci-robot k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Apr 4, 2023
@jpbetz jpbetz force-pushed the declarative-validation branch 6 times, most recently from b1553c6 to 4180c5f Compare April 4, 2023 05:15
@jpbetz jpbetz force-pushed the declarative-validation branch from 4180c5f to 160d26a Compare April 4, 2023 05:23
Copy link
Member

@apelisse apelisse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the doc, excited to see that happen!

I have a few questions that are mostly all rooting from the same place:

  • Where is this going to happen in the apiserver, before decoding, after decoding?
  • Will this use the openapi as the source of truth? Or will the code be generated like we do for defaults?
  • Will you decode into unstructured and then do the validation and then decode into built-in type?
  • Or will you run against decoded values? (i.e. null string becomes empty string for non-nil strings).
    There may be a few more questions depending on some answers :-)

[prototype](https://github.com/jpbetz/kubernetes/blob/cel-for-native/staging/src/k8s.io/sample-apiserver/pkg/apis/wardle/validation/README.md)
to get hands on experience with this proposed enhancement.

### Goals
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would have expected in the goals section to know if you're going to publish in v2 and/or v3. I'd love to know how you come-up with that decision.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could use some openapi expertise on this one, should we support both? How long do we expect to keep v2 around?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, v2 is deprecated, I wouldn't care so much about it. We don't plan on using it for any of the new usage. explain uses v3, kube-verify uses v3, everything should use v3.

| string format | `+format={format name}` | `format` |
| size limits | `+min`, `+max` | `minLength`, `minProperties`, `minItems`, `maxLength`, `maxProperties`, `maxItems` |
| numeric limits | `+min`, `+max`, `+min:exclusive`, `+max:exclusive` | `minimum`, `maximum`, `exclusiveMinimum`, `exclusiveMaximum` |
| required fields | `+optional` | `required` |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe mention in this table those that already exist.

Declarative validation will be performed against versioned APIs. This differs from the hand
written validation, which is evaluated against internal types.

Go IDL tags will be added to support the following declarative validation rules:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is the list exhaustive?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is the intent, yes. Let me know if I missed anything.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did check, I think it has all of it, thanks.

Comment on lines +409 to +434
- When both `DeclarativeValidationEnabled` and `LegacyValidationEnabled` are
enabled, any difference in the validation errors caught by the two validators
will be logged in a special way making them easy to find and analyze, but only
the validation errors caught by legacy validation will be reported to the
client.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we do that without requiring the two separate feature gate?

@apelisse
Copy link
Member

apelisse commented Apr 4, 2023

Also, somewhat related: kubernetes/kubernetes#97733

- Potential causes:
- Validation versioned types introduces extra conversions.
- Why this might be OK: Requests are received as the versioned type, so it
should be feasible to avoid extra conversions.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would also be a minor security win - currently conversion to internal involves use of unsafe operations for performance, and doing validation before conversion reduces the risk of malformed data making it to conversion.

@jpbetz jpbetz force-pushed the declarative-validation branch from 5910a1b to 0e3f9c6 Compare April 5, 2023 00:41
@jpbetz jpbetz force-pushed the declarative-validation branch from 0e3f9c6 to 6aa3b5b Compare April 5, 2023 01:01
```go
// staging/src/k8s.io/api/core/v1/types.go

// +valdiation=rule:"!self.hostNetwork || self.containers.all(c, c.containerPort.all(cp, cp.hostPort == cp.containerPort))"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// +valdiation=rule:"!self.hostNetwork || self.containers.all(c, c.containerPort.all(cp, cp.hostPort == cp.containerPort))"
// +validation=rule:"!self.hostNetwork || self.containers.all(c, c.containerPort.all(cp, cp.hostPort == cp.containerPort))"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a requirement, but I would point/copy the code that this replace to give both an idea of the order of magnitude code that can possibly remove, and how different it is.

}
```

In this example, both `+valdiation`, `+minimum` and `+maximum` are IDL tags.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
In this example, both `+valdiation`, `+minimum` and `+maximum` are IDL tags.
In this example, both `+validation`, `+minimum` and `+maximum` are IDL tags.

😂

[prototype](https://github.com/jpbetz/kubernetes/blob/cel-for-native/staging/src/k8s.io/sample-apiserver/pkg/apis/wardle/validation/README.md)
to get hands on experience with this proposed enhancement.

### Goals
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, v2 is deprecated, I wouldn't care so much about it. We don't plan on using it for any of the new usage. explain uses v3, kube-verify uses v3, everything should use v3.

Declarative validation will be performed against versioned APIs. This differs from the hand
written validation, which is evaluated against internal types.

Go IDL tags will be added to support the following declarative validation rules:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did check, I think it has all of it, thanks.

| Type of valiation | Go IDL tag | OpenAPI validation field |
| ---------------------- | ---------------------------------------------------------------- | ---------------------------------------------------------------------------------- |
| string format | `+format={format name}` | `format` |
| size limits | `+min{Length,Properties,Items}`, `+max{Length,Properties,Items}` | `min{Length,Properties,Items}`, `max{Length,Properties,Items}` |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think mapping directly between the IDL and the OpenAPI is the least confusing for people familiar with openapi. For people who are not, the difference between these can be hard to get initially, we should definitely have some linting if maxLength is used on non-string, maxProperties is used on non-maps, and maxItems is used on non-lists.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1. Also, Kubebuilder has set significant precedent here, and Kuberbuilder uses a direct mapping (it's prefixed with "kubebuilder:validations:", but it otherwise direct).

We will need to extend our CEL libraries to make it possible to migrate all the
validation rules that exist in the Kubernetes API today.

- `isFormat() <bool>` and `validateFormat() <list<string>>` will be added to allow formats to be checked in CEL
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait, is this a CEL rule or is this something that goes in the format part of the openapi, I'm a little confused? It looks like it's not since you have the formats section right below, but still confused :-)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to almost all format validation to be done just with OpenAPI.

ObjectMeta.name has a problem. We have a single shared type for ObjectMeta, but different types have different format requirements for the name (and generateName), so we have to cope with that somehow.

I probably need to explain my thinking in the KEP more, but my current thinking is that we could do something like:

struct Pod type {

  // +objectNameFormat="dns1123subdomain"
  ObjectMeta `json:...`
}

And have this generate openAPI like:

  "io.k8s.api.core.v1.Pod": {
        "properties": {
          "metadata": {
            "allOf": [
              {
                "$ref": "#/components/schemas/io.k8s.apimachinery.pkg.apis.meta.v1.ObjectMeta"
              }
            ],
            "x-kubernetes-validations": [
              {"rule": "!has(self.name) || self.name.isFormat('dsn1123subbdomain')"},
              {"rule": "!has(self.generateName) || self.generateName.replace('_$', '_a').isFormat('dsn1123subbdomain')"},
            ]

If we could avoid this and instead put "format" directly on name, that would be amazing, but I don't know how.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alexzielenski had the idea of using a second schema in allOf to validate the name directly (no CEL needed).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a neat idea.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Somewhat related to that, if we could validate whether namespace can or can not be present, that'd be nice. Also, I've been asked a few times how to know if a kind is namespaced or not, can we re-use that technic to detect it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something like this sounds good to me. Today, CRD resources declare scope, but the OpenAPI does not. A fix that can be applied to both CRDs and native types would be amazing. Of the fields in a CRD that do not appear in OpenAPI (scope, names, conversion, deprecated, deprecationWarning, additionalPrinterColumns), the ones that should probably be in OpenAPI (somehow) are scope, deprecated and deprecationWarning.

wrappers have already been benchmarked and shown to have low overhead.
- Risk: Migration introduces breaking change to API validation.
- Mitigations: See above migration plan, which includes numerous cross checks
to prevent mistakes from slipping through.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Today validation for all the versions are done on internal, we might have discrepencies between versions now (since we'll require N validation instead of 1). We have to be sure that all the validations do the exact same thing.

@jpbetz
Copy link
Contributor Author

jpbetz commented Aug 8, 2023

/close
We're planning to do something along these lines. @alexzielenski will open a replacement KEP

@k8s-ci-robot
Copy link
Contributor

@jpbetz: Closed this PR.

In response to this:

/close
We're planning to do something along these lines. @alexzielenski will open a replacement KEP

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants