Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lenient validation and merging for non-strict listType=map #234

Closed
apelisse opened this issue Jun 5, 2023 · 31 comments
Closed

Lenient validation and merging for non-strict listType=map #234

apelisse opened this issue Jun 5, 2023 · 31 comments

Comments

@apelisse
Copy link
Contributor

apelisse commented Jun 5, 2023

Right now, we have some lists that are marked as listType=map that are not strictly maps in Kubernetes. In that case, everything goes wrong in smd because the list is invalid (and can't be handled), so it's fully rejected. We can't even track the fields, which causes us to fail and return an error which is ignored by Kubernetes, and the managed fields are entirely reset.

Unfortunately, it means that objects that have such a list can't be server-side applied at all. We think it'd be simpler to merely ignore these (fairly rare) occurences and try to do the best thing possibe rather than fail.

This means:

  1. When we validate an object with such a list, we should probably not fail altogether (though we should probably reject a request to server-side apply an object like this).
  2. When tracking the fields, we should give duplicate fields to the same owner.
  3. When server-side applying without the field, we need to be careful about how the field should be removed.
  4. When we merge a list that has such a field, we need to be careful, especially if the server-side apply contains the field (should we remove both other occurences? Probably).
@apelisse
Copy link
Contributor Author

apelisse commented Jun 5, 2023

cc @thockin @jpbetz

@jpbetz
Copy link
Contributor

jpbetz commented Jun 6, 2023

Do we have the list of fields that fall into this "marked as listType=map but not validated as having unique keys" category?

@apelisse
Copy link
Contributor Author

apelisse commented Jun 6, 2023

Not really but, that's a great question:

  1. I'm pretty sure CRDs don't allow that and never will, so there's no chance people start abusing that mechanism there (IOW, actual validation matches SSA validation)
  2. Since it's only in built-ins, and because a lot of these were not designed for that initially (unfortunately), we have:
    environment variables and a few containerports.

I considered adding a new, separate tag, but that would require a KEP and make things impossible to backport. I think we need to make it clear that this is a work-around and shouldn't be used for any new list (again, only for built-ins so we have a lot more control over that).

@thockin
Copy link

thockin commented Jun 6, 2023

There are two distinct failure modes, the one we have usually run into is client-side apply vs. Service.spec.ports, where the keys are looser than validation, and SSA is the SOLUTION (the keys match validation).

The more recent one is where SSA's keys are more strict than validation. So far we've found:

  • Pod.spec.containers[].ports
  • Pod.spec.containers[].env
  • Pod.spec.containers[].envFrom

I admit that I haven't scoured the API for this case, and I don't honestly know how I would in any reasonable amount of time. :( IOW, there's ikely to be more

@jpbetz
Copy link
Contributor

jpbetz commented Jun 6, 2023

I did a scan. Here's what I found:

A second pair of eyes on this would be really helpful. I only looked at the validation code. I haven't actually tried all these cases.

@apelisse
Copy link
Contributor Author

apelisse commented Jun 6, 2023

I wouldn't be surprised if some people think this will automatically write the validation for them. For most people authoring new resources and using this, it should tbh. Maybe we should generate and have an opt-out flag?

@jpbetz
Copy link
Contributor

jpbetz commented Jun 6, 2023

Maybe we should generate and have an opt-out flag?

kubernetes/enhancements#3938 proposes that we automatically validate. An opt-out flag sounds useful.

For these 4 ports cases, should we chnage listMapKey to 'name' to match what is validated? Can SSA handle a map key change?

@apelisse
Copy link
Contributor Author

apelisse commented Jun 6, 2023

For these 4 ports cases

Which 4?

If you mean:

Container.Ports - validates unique name, listMapKey=containerPort, protocol

I don't think that's correct, name is unique but I don't know if that's good enough, name isn't required.

@jpbetz
Copy link
Contributor

jpbetz commented Jun 6, 2023

I don't think that's correct, name is unique but I don't know if that's good enough, name isn't required.

Ah, we don't have a key then, because name is the only field that validation checks when looking for duplicates.

We might need to go with atomic?

@apelisse
Copy link
Contributor Author

apelisse commented Jun 6, 2023

Actually, if we're careful we might be able to make it work with name, especially if it's omitempty, we can default it to "". Then as you mentioned, we would never have two keys with the same name.

@apelisse
Copy link
Contributor Author

apelisse commented Jun 6, 2023

If that works by the way, that might be a great way to fix a lot of these problems. For env, we might be able to make a key with name AND value. Then we can only have one such pair, I think that's fine, we can still merge the list properly.

@jpbetz
Copy link
Contributor

jpbetz commented Jun 6, 2023

If that works by the way, that might be a great way to fix a lot of these problems. For env, we might be able to make a key with name AND value. Then we can only have one such pair, I think that's fine, we can still merge the list properly.

I've love to see a test suite trying out various operations on a listType=map defined this way. For CEL in CRDs we defined map key to be the CEL expression and it works just fine...

@apelisse
Copy link
Contributor Author

apelisse commented Jun 6, 2023

I've love to see a test suite trying out various operations on a listType=map defined this way.

We have a lot of tests in this repo, what do you have in mind? We already have multi-key tests here, that should work fine. We can reason and see what happens (and maybe fix) when we change the keys, I think we had thought about this a while back.

@jpbetz
Copy link
Contributor

jpbetz commented Jun 6, 2023

We can reason and see what happens (and maybe fix) when we change the keys, I think we had thought about this a while back.

Good point.

Maybe open a PR that tries to fix the broken cases and we can manually try various apply requests out?

  • Env: Use "key", "value"? This is still not exactly what we validate-- right now you could repeat A=x{$A} twice... should we go this route or just mark it as atomic?
  • Ports: Use enough fields to ensure uniqueness in each case? E.g. all the fields of ContainerPort?
  • Cases with no validation: Mark them as atomic?

ObjectMeta.OwnerReferences worries me..

@thockin
Copy link

thockin commented Jun 6, 2023

Actually, if we're careful we might be able to make it work with name, especially if it's omitempty, we can default it to "". Then as you mentioned, we would never have two keys with the same name.

For ContainerPort, the name is optional. If specified, it must be unique, but many ports may be named "".

@thockin
Copy link

thockin commented Jun 6, 2023

Ports: Use enough fields to ensure uniqueness in each case? E.g. all the fields of ContainerPort?

Sadly, we allow total duplicates today, as long as hostPort and hostIP are both unspecified. It's meaningless, but allowed.

@thockin
Copy link

thockin commented Jun 6, 2023

PodSpec.TopologySpreadConstraints - validation missing, listMapKey=whenUnsatisfiable

I see topologyKey + whenUnsatisfiable but I find validation code in validateTopologySpreadConstraints()

ServiceSpec.Ports - same as ContainerPorts (except keys are port, protocol)

This is validated in ValidateService()

    // Check for duplicate Ports, considering (protocol,port) pairs
    portsPath = specPath.Child("ports")
    ports := make(map[core.ServicePort]bool)
    for i, port := range service.Spec.Ports {
        portPath := portsPath.Index(i)
        key := core.ServicePort{Protocol: port.Protocol, Port: port.Port}
        _, found := ports[key]
        if found {
            allErrs = append(allErrs, field.Duplicate(portPath, key))
        }    
        ports[key] = true 
    }    

PodSpec.ImagePullSecretes - validation missing, patchMergeKey=name

Agree there's no validation - I think this should be atomic

PodSpec.HostAliases - validation missing, patchMergeKey=ip

Agree there's no validation - I think this should be atomic

ServiceAccount.Secrets - validation missing, patchMergeKey=name

Agree there's no validation - I think this should be atomic

ObjectMeta.OwnerReferences - validation missing, patchMergeKey=uid

Agree this is not validated unique, but it is validated non-empty. This one is hard because I can see people wanting to co-own it.

LabelSelectorRequirement.Key - validation missing, patchMergeKey=key

This field is a string - what does it even mean to have a merge key?

@apelisse
Copy link
Contributor Author

apelisse commented Jun 7, 2023

To be clear, most lists should not be atomic. Atomic should be used for things that can't be a mix of multiple things, or that are very tightly coupled, kind of like a number (wouldn't make sense to let different people change different digits), or a string (we should never let different actors change different parts of a string).

We should try really really hard to not make these lists atomic and give them a good way to be updated by multiple actors with different opinions. Here is a litmus test: if two actors (users and/or controllers) can have different opinions of a list that are not mutually exclusive, then the list shouldn't be atomic.

Can two users want different environment variables to be set? Can two users want different sets of ContainerPorts? Can they each have their secret? Can you have multiple ownerReferences, can you have multiple LabelSelector etc... None of them should be atomic.

Actually, atomic is breaking Ci/CD systems so much that I'd rather find a work-around to make things not completely break than make them atomic.

@apelisse
Copy link
Contributor Author

apelisse commented Jun 7, 2023

LabelSelectorRequirement.Key - validation missing, patchMergeKey=key

This field is a string - what does it even mean to have a merge key?

That has to be a bug 😂

@jpbetz
Copy link
Contributor

jpbetz commented Jun 7, 2023

For StorageVersion: kubernetes/kubernetes#118520

@jpbetz
Copy link
Contributor

jpbetz commented Jun 7, 2023

That has to be a bug 😂

It is. It got added by kubernetes/kubernetes#44121 but then got missed when the bad merge keys were cleaned up by kubernetes/kubernetes#50257.

Here's a fix: kubernetes/kubernetes#118522

@thockin
Copy link

thockin commented Jun 7, 2023 via email

@jpbetz
Copy link
Contributor

jpbetz commented Jun 7, 2023

Fix for container and ephemeralContainer ports: kubernetes/kubernetes#118536

EDIT: container fix will be more challenging than I originally realized. Identical container ports are allowed unless a name or hostPort is set. Problematic example (which I just verified is supported on a cluster):

apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
  - name: nginx
    image: nginx:1.14.2
    ports:
    - containerPort: 80
    - containerPort: 80

@thockin
Copy link

thockin commented Jun 7, 2023

xref: kubernetes/kubernetes#118475

@jpbetz
Copy link
Contributor

jpbetz commented Jun 7, 2023

I've verified that all remaining cases (env, ports, imagePullSecrets, hostAliases, serviceAccount.Secrets, ownerReferences) allow exact duplicates. This means these none of these cases cannot be fixed solely by changing the merge keys.

This simplifies things in my mind because it means we either have to handle exact duplicates in SSA, or we have to use atomic.

It is interesting that ownerReferences is deprecating duplicates. The warning it returns for duplicates is:

Warning: .metadata.ownerReferences contains duplicate entries; API server dedups owner references in 1.20+, and may reject such requests as early as 1.24; please fix your requests; duplicate UID(s) observed: 421f3acc-5bfd-4007-b28a-4b0749c95888

ports and serviceAccount.secrets have no warnings. We should add those.

Should we also consider if some fields should have a warning that we will stop allowing exact duplicates in the future?

@thockin
Copy link

thockin commented Jun 7, 2023

I am skeptical that we can or should ever reject duplicates, but yes to warnings.

@jpbetz
Copy link
Contributor

jpbetz commented Jun 7, 2023

I am skeptical that we can or should ever reject duplicates, but yes to warnings.

SGTM. When adding the warnings for the fields that do not yet have warnings, we could also update the warning for ownerReferences to stop saying that we might start rejecting requests in 1.24 (since it's over a year ago anyway).

@rmohr
Copy link

rmohr commented Jul 20, 2023

From the discussion above it is not entirely clear on what you will consider valid on container ports on pods. Will duplicate portnumbers with different port names still be considered valid? They are useful to aggregate pods to different services for different functions without having to expose a different port (requiring app changes).

@thockin
Copy link

thockin commented Jul 25, 2023

That depends on the operation. create and replace will preserve pod-ports with the same number but different name. apply will merge them based on port number+protocol (which is how the merge-key is defined)

@apelisse
Copy link
Contributor Author

also related to kubernetes/kubernetes#113482

@apelisse
Copy link
Contributor Author

Fixed by #254

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants