-
Notifications
You must be signed in to change notification settings - Fork 79
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add apply-time-mutation feature #400
Add apply-time-mutation feature #400
Conversation
Thanks for your pull request. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). 📝 Please follow instructions at https://git.k8s.io/community/CLA.md#the-contributor-license-agreement to sign the CLA. It may take a couple minutes for the CLA signature to be fully registered; after that, please reply here with a new comment and we'll verify. Thanks.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
Hi @karlkfi. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/hold (still WIP) |
cee78a1
to
5aca3cf
Compare
/check-cla |
/unhold |
|
||
// Interface decouples apply-time-mutation | ||
// from the concrete structs used for applying. | ||
type Interface interface { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: consider a different name, like "Mutator"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Golint complains about stutter on exported functions and types:
- https://github.com/golang/lint/blob/master/lint.go#L483
- https://github.com/golang/lint/blob/master/lint.go#L938
EX: sort.Interface instead of sort.Sort: https://pkg.go.dev/sort#Interface
I think this used to be in Effective Go, but now that I'm looking for it, I can't find it in the latest copy.
It was in older versions tho:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I kinda like the name here since the package is already called mutator
.
newValue = sourceValue | ||
} else { | ||
// token specified, substitute token for source field value in target field value | ||
targetValueString, ok := targetValue.(string) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wdyt about doing some type assertions for templated destinations,?
they should probably all be strings... or just let it fall through
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The impl here errors if the token is specified and the target field isn't a string.
Having token be optional allows for maps, lists, ints, floats, and bools to be set wholesale, without string substitution.
From the description:
I'm curios to learn more about these options. Maybe we can improve (one of) them instead of adding the yq dependency? p.s. This feature is something I've built before in Smith. See the docs. I didn't need a jq/yq library, just used unstructured and helpers (I'm sure it's less sophisticated, but it worked ok). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the direction and structure here.
|
||
// Interface decouples apply-time-mutation | ||
// from the concrete structs used for applying. | ||
type Interface interface { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I kinda like the name here since the package is already called mutator
.
@ash2k I spent most of the week trying to fork and modify jsonpath impls to add setter capability and failed miserably. It may be possible, but it's beyond my ability to do quickly. This apply-time-mutation is a big blocker for using infrastructure operators, like Config Connector, which need a little hand-holding to orchestrate ordering. And it should work well as a fallback for orchestrating multiple operators that haven't been designed to integrate. So I didn't want to block this on writing a new jsonpath impl. My initial implementation here did use [Set]NestedFields from apimachinery, but I realized that while it worked for basic use cases, it wouldn't work for anything related to Pod containers, because it can't index arrays. So I needed something more powerful. yq was my 2nd choice, because everyone loves jq, and the new yq v4 uses a very similar (tho poorly documented) dialect. That said, I also found https://github.com/spyzhov/ajson yesterday and it does support jsonpath setters. But it's less popular and less well maintained than yq. Also, JSONPath has as many variations as implementations. On the plus side, it's syntax is meticulously tested (https://cburgmer.github.io/json-path-comparison/). I'm looking through your field references doc and I don't see any path syntax. Do you have time to jump on a Meet/Zoom and explain what you mean? |
sourceRef := sub.SourceRef | ||
|
||
// lookup source resource from cache or cluster | ||
sourceObj, err := atm.getObject(sourceRef) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe I missed it, but I don't see a check for object status in getObject or here, is this dependent fully on the dependsOn implementation to assert readiness prior to fetching? will that make it more challenging in the future to move to supporting mutation references to objects outside the apply set?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The depends-on impl added a graph.SortObjs
that handles ordering of apply stages with wait tasks in between. I extended the SortObjs impl to inject the source resources from apply-time-mutation into the graph:
https://github.com/karlkfi/cli-utils/blob/karl-apply-time-mutation/pkg/object/graph/sort.go#L33
So by the time the mutator is executed in the apply task, the waits should have already happened.
There might be some edge cases with resources not in the inventory. So I'll need to write some tests that exercise both pieces together. But I'm not sure where to put them yet in cli-utils.
Alright, this is ready for serious review. I've added tests for ApplyTimeMutaiton that I'm satisfied with. I haven't added any Mutate tests to the ApplyTask, but there aren't any for Filter either. So we'll probably need to backfill those later. It might require abstracting/refactoring the ApplyTask some more. The biggest open question is whether we're satisfied with If There's also an option to support JSON Path later, in addition to yq, requiring JSON Path expressions to start with |
I've gone back and reverted a bunch of the minor refactors that are in other PRs or aren't critical to the functionality. |
2dc1652
to
d47f057
Compare
Alright, the general consensus talking to stakeholders seems to be that everyone expects JSONPath, even if no one can really justify why JSONPath is better than yq. So following the principal of least surprise, I've refactored this to use https://github.com/spyzhov/ajson as the expression language, because that's the only existing Go JSONPath lib that supports both retrieval and mutation. The one significant downside of this approach is that ajson does not auto-generate intermediate arrays/maps when setting a value. So the value needs to exist in the input object before it can be mutated. I've filed a feature request for that: spyzhov/ajson#35 |
One benefit of using ajson over yq is that ajson doesn't use any logging library (except in the main.go, which we aren't using). So this means I can drop the logtuil for translation. |
This is ready to review again. The biggest outstanding question, IMO, is the ResourceReference syntax.
Prior Art:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me. I'll give others a chance to review it too.
I think it is ok to use the simple cache for now, but we should eventually think about how we can combine leverage the resources we already fetch from the cluster as part of determining reconcile status. But we might want to look at ways to improve that in other ways too.
// Group is generally preferred, to avoid needing to update the version in lock | ||
// step with the referenced resource. | ||
// If neither is provided, the empty group is used. | ||
type ResourceReference struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that keeping the ObjMetadata as an internal type and define a new one here makes sense. I'm a bit skeptical about supporting apiVersion here, as we are otherwise trying to avoid using the version. If I specify a version here, would we then also require that we fetch the resource from the apiserver with that exact version?
I think we have used just Group
pretty consistently rather than apiGroup
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this looks good and it is ready to merge. I disagree on the two types which are basically the same thing (ObjMetadata
and ResourceReference
), but I will not block the PR on that. The only thing missing appears to be an end-to-end test.
As far as the version in the ResourceReference
, I think it will not break anything as long as there is only one ResourceReference
per mutation. But if we ever have to identify if two ResourceReference
point to the same resource, then we must not use version (the Equal
implements this correctly). I also think the versioned fetch may be a more fragile approach, since fetching with a version has a higher likelihood of returning an error. A version would only be required if the referenced field in the resource is dependent on the version (i.e. the field only exists in a particular version of the resource).
} | ||
|
||
// lookup resource using group api version, if specified | ||
sourceGvk := ref.GroupVersionKind() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the field referenced in the resource depends on the version (e.g. the field is only present for a particular version of the resource), then the version is required. But adding the version makes the field substitution more fragile, since the API Server may return an error if there is no translation for that resource for that particular version. Allowing the API Server to choose the returned version may be more robust. Let's go with this for now--I'm sure we'll run into subtle corner cases as we run it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
adding the version makes the field substitution more fragile
Yes. This is intentional. I want to allow apiVersion without parsing, but you're right that it's more fragile. If users run into problems, they have the option to switch to using group instead.
Let's go with this for now--I'm sure we'll run into subtle corner cases as we run it.
If this becomes a problem users complain about, I'm ok with re-evaluating later and removing apiVersion, or accepting apiVersion and just discarding the version (which would be less disruptive, but also less intuitive).
|
||
// Equal returns true if the ResourceReference sets are eqivelent, ignoring version. | ||
// Fulfills Equal interface from github.com/google/go-cmp | ||
func (r ResourceReference) Equal(b ResourceReference) bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Equal
will mostly work when ignoring version. I think this will probably work for what it is being used for. The only true way to compare resources is by comparing their API Server assigned UID
. For example, the following corner case will incorrectly say the two resources are not the same when they are:
Resource 1:
Group/Version: apiextensions.k8s.io/v1beta1
Kind: Deployment
Namespace: foo
,
Name: bar
Resource 2 (same resource):
Group/Version: apps/v1
Kind: Deployment
Namespace: foo
,
Name: bar
Most of these corner cases have been deprecated by version 1.16 and later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only two uses of Equal right now are for detecting self-references and test comparisons. In these cases, we don't really care about the api version.
But you're right that group mismatch would break the self-reference detection if the group changed between versions. This is an interesting edge case. I suspect that it would cause the resource to be treated like two separate resources, with one depending on the other, and both being added to the dependency graph. The graph doesn't look like it checks for duplicates. I'm not even really sure how we would do client-side deduplication when the group differs like this.
// Group is generally preferred, to avoid needing to update the version in lock | ||
// step with the referenced resource. | ||
// If neither is provided, the empty group is used. | ||
type ResourceReference struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still think we should stick with ObjMetadata
, but I will not block on this. Both of these types should not be part of the API, and they should not be exported. The API is the annotation--not these two reference types.
3eb2202
to
2ccd195
Compare
Alright, I added new e2e tests. I don't have any interesting use cases that use the default K8s resources, so it's a bit of a toy case. But I wasn't sure if it would be a good idea to copy in the Config Connector CRDs. I also made the jsonpath lib a little more useful by adding support for multi-get and multi-set, but I kept the apply-time-mutation requiring exactly one match. We can add fan-out and/or fan-in later if users ask for it. |
Do I need to swash the commits myself? Probably don't want all the revert history in there. |
I usually squash them myself, although I think there is a github merge option to squash them too. We do not want the merge history. Almost always we merge with one commit. |
- Detect `config.kubernetes.io/apply-time-mutation` annotation - Parse annotation string value as YAML - Treat source resource as a dependency - Before applying, apply specified substitutions - Each mutation may include one or more substitution - Substitutions may optionally replace a token in the existing string value, or replace the whole value (e.g. non-strings) - Source and target fields are specified with JSONPath expressions - Using github.com/spyzhov/ajson because it supports mutation, and not just retrieval - ApplyTimeMutator uses an in-memory ResourceCache to reduce GETs
2ccd195
to
02c0fb7
Compare
squashed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: karlkfi, seans3 The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
New feature: apply-time-mutation
Mutation algorithm:
config.kubernetes.io/apply-time-mutation
annotationMutation details:
Expression Language:
Kubernetes is using JSONPath elsewhere, but the existing implementations do not support mutation, just retrieval.
sigs.k8s.io/cli-utils/pkg/jsonpath
library with Get/Set onmap[string]interface{}
.Resource Caching:
Other included changes: