Add WasmPlugin API #1940

dgn · 2021-04-06T14:41:45Z

Implementing Design Doc: Wasm Extensions API

istio-policy-bot · 2021-04-06T14:41:47Z

😊 Welcome @dgn! This is either your first contribution to the Istio api repo, or it's been
awhile since you've been here.

You can learn more about the Istio working groups, code of conduct, and contributing guidelines
by referring to Contributing to Istio.

Thanks for contributing!

Courtesy of your friendly welcome wagon.

istio-testing · 2021-04-06T14:41:49Z

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

dgn · 2021-04-06T14:42:20Z

/test all

dgn · 2021-04-06T14:47:35Z

/test all

linsun · 2021-04-09T20:52:15Z

extensions/v1alpha1/wasm.proto

+  // workloads in the same namespace. If the `WasmExtension` is
+  // present in the config root namespace, it will be applied to all
+  // applicable workloads in any namespace.
+  istio.type.v1beta1.WorkloadSelector workload_selector = 5;


The way the proposed API is only designed for local workloads correct? What is the thought process to selectively select the workload for a or two other clusters?

You may be thinking we could leverage the MCS expose + import service API but I don't think that makes sense as I could be deploying Wasm extensions to workloads that are not exposed outside of the cluster.

do we have such cluster/mesh selectors anywhere else? I would imagine you would label them differently, that would certainly work with this API

@linsun what makes you think its local workloads? Everything else selects across all clusters - since the config only lives in one cluster?

So the reason I am asking this question is we have a similar Solo API called WasmDeployment and that API allows users to specify which cluster this WasmDeployment is for. So I'm trying to understand how this could be mapped to Istio API.

IIUC, @dgn, in this API, the workload selector is only for the local cluster in multi primary user case so basically Istio selects the matching workload based on the selector and apply the Wasm extension to the workload. This means if I have reviews in cluster 1 and cluster 2, and if I want this Wasm extension to be applied to reviews in both of these clusters, I would deploy a Wasm resource to both clusters

@howardjohn I think you are referring to primary-remote scenario, which I agree the Istio config is applied to all clusters.

In all Istio APIs currently, which I would assume this follows suite, workload selectors apply to all clusters. If we were to change it I would want to change it to all APIs which is a major undertaking - but this one should be consistent?

@linsun we do not have any other api that is cluster specific today. This api uses an existing istio.type.v1beta1.WorkloadSelector message. If you want to propose cluster awareness (or location, or anything other than labels) you should do it as a separate proposal.

@mandarjog I am not proposing adding cluster to this API yet. I am trying to understand the semantics of this API... wanted to see if we could translate Solo's Wasmdeployment resource to this new Wasm resource.

Just mentioning that the semantics here are not just for this API, but for all of Istio. We need to be consistent in our configuration application across the project. So maybe uplevel the conversation out of this PR and into a discussion in TOC or something if we want to talk about how config gets applied to particular endpoints and how to take cluster etc. into account.

Agreed, it is across all APIs. The question came up specifically here is because the comments were not clear.

I can try create some doc for this. If anyone else is interested, let me know!

Does k8s intend to standardize a synthetic cluster label just like they have for namespace IIRC

Does anyone know more about this synthetic cluster label? that is interesting. @howardjohn @smawson @nmittler

extensions/v1alpha1/wasm.proto

hzxuzhonghu · 2021-04-13T12:04:29Z

How could i apply this to a specific service port? inbound and outbound?

dgn · 2021-04-13T12:53:55Z

How could i apply this to a specific service port? inbound and outbound?

Very good question. In my current PoC, it will be applied to all inbound HTTP filter chains of the service ports. We might want to limit it to specific ports, but I considered this an extension we can add later.

I don't think we should apply to outbound listeners as it makes the system quite complex to reason about (different behaviour depending on which service is sending the request)

extensions/v1alpha1/wasm.proto

howardjohn · 2021-04-13T14:48:14Z

extensions/v1alpha1/wasm.proto

+  // <namespace>/<name>. Names that do not contain forward-slashes are
+  // interpreted as the names of secrets in the same namespace as the
+  // `WasmExtension`.
+  repeated string image_pull_secrets = 4;


How does this actually work? How does the proxy get the secret to pull the image?

in the current design, the agent is responsible for fetching and unpacking the OCI images, so the agent would have to fetch the secrets using client-go prior to that

I do not think this is viable. Importing client-go into the agent adds major overhead (ie invert this: https://mobile.twitter.com/karlstoney/status/1367134238613663744). Additionally, this would require giving ~every pod RBAC permissions for secrets. The load from every single pod in the cluster opening k8s watches is also likely to kill and non-trivial cluster.

I do not think this is viable. Importing client-go into the agent adds major overhead (ie invert this: https://mobile.twitter.com/karlstoney/status/1367134238613663744).

that would be an argument for moving the pulling of images to a central component, either istiod itself or something else. I don't think it means the API is unimplementable.

Additionally, this would require giving ~every pod RBAC permissions for secrets. The load from every single pod in the cluster opening k8s watches is also likely to kill and non-trivial cluster.

All pods have access to secrets in their namespace by default. For secrets outside the pod namespace, yes, users would have to make sure they have access. I would also argue that watching isn't necessary, as a) pull secrets rarely change and b) we only need it when pulling the image which, depending on pullPolicy, doesn't necessarily happen often. If we get a 403 back, we can still go and re-fetch the Secret

I don't think it means the API is unimplementable.

I agree, but I would not feel comfortable approving an API that doesn't also have a planned implementation?

All pods have access to secrets in their namespace by default.

For mounting yes, but not for reading through the API, right? Since this is dynamic, they would need to read through the API.

Yes, just compiling in the library adds to the RAM in the tweet, and binary size.

It is also fairly incompatible with VMs.

All pods have access to secrets in their namespace by default.

For mounting yes, but not for reading through the API, right? Since this is dynamic, they would need to read through the API.

That's right, nevermind. I had assumed that this was the case by default. Makes sense that it isn't though. So yes, a user would have to guarantee that all pods that need to use an Extension have access to the pullSecret, should a pullSecret be required. It's not a blocker for me personally, but I understand the concern. Pulling from a central component would alleviate this

For istio-agent based fetching, we can make Wasm API only referencing local mounted secrets. That would remove concern of client-go and incompatibility with VMs. Although user would need a way to indicate secret mount, which I original thought ExtensionProvider would be a great fit for this.

Maybe we can separate concerns for accessing the extension provider from the use of the extension? E.g. add a Wasm extension provider with pull secrets that is globally defined in MeshConfig, and reference the Wasm provider from this API. We could support both inlined and referenced Wasm extension provider with the intent to allow using secrets only from the referenced provider.

extensions/v1alpha1/wasm.proto

linsun · 2021-04-13T14:49:53Z

Another q: If I have multiple Wasm images, do I need to create one Wasm resource for each Wasm image/Wasm filter for a given workload?

extensions/v1alpha1/wasm.proto

smawson · 2021-04-13T17:05:57Z

extensions/v1alpha1/wasm.proto

+  // workloads in the same namespace. If the `WasmExtension` is
+  // present in the config root namespace, it will be applied to all
+  // applicable workloads in any namespace.
+  istio.type.v1beta1.WorkloadSelector workload_selector = 5;


Just mentioning that the semantics here are not just for this API, but for all of Istio. We need to be consistent in our configuration application across the project. So maybe uplevel the conversation out of this PR and into a discussion in TOC or something if we want to talk about how config gets applied to particular endpoints and how to take cluster etc. into account.

smawson · 2021-04-13T17:09:33Z

extensions/v1alpha1/wasm.proto

+  istio.type.v1beta1.WorkloadSelector workload_selector = 5;
+
+  // Determines where in the filter chain this `WasmExtension` is to be injected.
+  istio.networking.v1alpha3.EnvoyFilter.Patch.FilterClass phase = 6;


Let's not take a dependency on EnvoyFilter and pull that enum out into its own thing that we can define here.

I think the values/semantics are actually pretty good (which phase to run this was extension during) but this shouldn't depend on EnvoyFilter proto.

Also do we need an option to run an extension before the istio authentication filters?

enum likely also needs refinement @mandarjog

@smawson @louiscryan we will relocate the FilterClass and amend enums in a separate PR after some more discussion here.

extensions/v1alpha1/wasm.proto

bianpengyuan · 2021-04-13T18:52:01Z

I don't think we should apply to outbound listeners as it makes the system quite complex to reason about (different behaviour depending on which service is sending the request)

@dgn I disagree. Manipulate request at outbound (including gateway) and refresh routing accordingly are probably one of the most common use cases that user would want with Wasm extensions. I think for now, at outbound, we can make it only support injecting filters at listener level. Envoy Wasm extension does not yet support per route config anyway.

mandarjog · 2021-07-08T18:51:53Z

@dgn want to make sure that it is our collective intention to only use Wasm api and no longer need EnvoyFilter api to manipulate all wasm use cases.

wasm api will not cover aspects of EnvoyFilter api that are unrelated to wasm.

@mdhume we would love if you can come up with a proposal for migration from EF to the new API since you are running things in prod.

brian-avery · 2021-07-13T02:58:15Z

extensions/v1alpha1/wasm.proto

+// +cue-gen:WasmPlugin:printerColumn:name=Age,type=date,JSONPath=.metadata.creationTimestamp,description="CreationTimestamp is a timestamp
+// representing the server time when this object was created. It is not guaranteed to be set in happens-before order across separate operations.
+// Clients may not set this value. It is represented in RFC3339 form and is in UTC.
+// Populated by the system. Read-only. Null for lists. More info: https://git.k8s.io/community/contributors/devel/api-conventions.md#metadata"


https://git.k8s.io/community/contributors/devel/api-conventions.md#metadata"

is actually a dead link and should point to

https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md#metadata

brian-avery · 2021-07-13T03:12:24Z

This is looking really good to me.

linsun · 2021-07-14T13:10:52Z

extensions/v1alpha1/wasm.proto

+  // If `priority` is not set, or two `WasmPlugins` exist with the same
+  // value, the ordering will be deterministically derived from name and
+  // namespace of the `WasmPlugins`.
+  google.protobuf.Int64Value priority = 10;


What is the default priority value? In your scenario above, what if I specify one Wasm resource without priority and one with a priority of 1000 (or 10) within the same phase, how do I know which is prioritized first?

Defaults to 0, I added a sentence to the comment. Thank you!

linsun · 2021-07-14T13:12:36Z

LGTM overall! would like the following to be addressed:

use workloadSelector to be consistent with other istio resources?
clarify priority default value (when it is not specified) and how it compares with a priority value supplied by user?

mandarjog · 2021-07-14T14:38:35Z

@linsun the new APIs (telemetry) actually uses selector so if we change this, then we need to change the other too. selector is consistent with other k8s api.

@dgn we have used priority in ascending order in EnvoyFilter. I think we should do the same here. The default value is 0.

dgn · 2021-07-19T06:58:45Z

@dgn we have used priority in ascending order in EnvoyFilter. I think we should do the same here. The default value is 0.

So higher priority means later in the filter chain? That sounds confusing

linsun · 2021-07-19T15:00:33Z

@mandarjog thanks, SG!

I agree with @dgn - I had thought priority higher means it is first/earlier in the filter chain. Let's clarify default 0 in the comment.

mandarjog · 2021-07-19T15:48:36Z

@dgn let’s clarify the comment and merge it!

mandarjog · 2021-07-19T15:49:29Z

@linsun please lgtm it since you have reviewed it, John and Brian have already indicated so.

mandarjog · 2021-07-19T15:50:13Z

adding label until we get 3 lgtms.

linsun

LGTM - pending the priority comment clarification of default.

smawson · 2021-07-19T16:26:30Z

extensions/v1alpha1/wasm.proto

+  // Criteria used to select the specific set of pods/VMs on which
+  // this plugin configuration should be applied. If omitted, this
+  // configuration will be applied to all workload instances in the same
+  // namespace.  If omitted, the `WasmPlugin` will be applied to all


Duplicate comment

extensions/v1alpha1/wasm.proto

smawson · 2021-07-19T17:20:35Z

extensions/v1alpha1/wasm.proto

+  // the `url` field contains the URL of an OCI image that is referenced
+  // by tag, the reference will be replaced by the value of the `sha256`
+  // field (using the `@sha256:` notation).
+  optional string sha256 = 3;


This is definitely more confusing than is necessary.

For non-OCI usage, I'm assuming this is treated as a checksum, and used to verify that what you fetch is what you wanted to fetch. For the OCI usage it is used to fetch a specific version right? Not just as a checksum.

My assumptions would be for OCI:

If @sha256 is set in the URL, we raise an error if sha256 is set in the field to something different

If sha is set (either in URL or field), we also verify that the downloaded image has the correct sha.

If sha field is set, but not in the URL, the image is still downloaded by tag and sha verified after download (not sure about this one).

It is confusing to have fields used for both fetch and verify, better to make it more explicit and have the sha field always used just for verification.

smawson · 2021-07-19T17:27:54Z

extensions/v1alpha1/wasm.proto

+  // Insert plugin before Istio authentication filters.      
+  AUTHN = 1;
+
+  // Insert plugin before Istio authorization filters.


Is this also after Istio authentication filters? Maybe specify that if so to be explicit (same with STATS being after authorization filters).

Yes, it's also after the istio auth filters. I made it more explicit, also in the comment for STATS below

smawson · 2021-07-19T17:29:34Z

extensions/v1alpha1/wasm.proto

+// mirroring K8s behaviour.
+//
+// buf:lint:ignore ENUM_VALUE_UPPER_SNAKE_CASE
+enum PullPolicy {


I think we need an UNSPECIFIED = 0 here.

Otherwise I think even explicitly setting it to IfNotPresent would get overridden if latest is in the tag, since we couldn't distinguish between default and explicit set to IfNotPresent.

Good point. I'll add UNSPECIFIED so we can match k8s behaviour

- Clarify default for priority - Add UNSPECIFIED_POLICY value - Change sha256 behaviour

smawson · 2021-07-22T16:24:30Z

I'm happy with this if you want to remove the hold Mandar.

mandarjog · 2021-07-27T21:04:04Z

/cherry-pick release-1.10

istio-testing · 2021-07-27T21:04:05Z

@mandarjog: once the present PR merges, I will cherry-pick it on top of release-1.10 in a new PR and assign it to you.

In response to this:

/cherry-pick release-1.10

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

istio-testing · 2021-07-27T21:06:42Z

@mandarjog: new pull request created: #2067

In response to this:

/cherry-pick release-1.10

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

mandarjog · 2021-07-27T21:23:14Z

/cherry-pick release-1.11

istio-testing · 2021-07-27T21:23:51Z

@mandarjog: new pull request created: #2068

In response to this:

/cherry-pick release-1.11

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

google-cla bot added the cla: yes Set by the Google CLA bot to indicate the author of a PR has signed the Google CLA. label Apr 6, 2021

istio-testing added the do-not-merge/work-in-progress Block merging of a PR because it isn't ready yet. label Apr 6, 2021

istio-testing added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Apr 6, 2021

istio-testing added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Apr 6, 2021

dgn marked this pull request as ready for review April 7, 2021 07:29

dgn requested review from dcberg, duderino, linsun, louiscryan, nrjpoddar and smawson as code owners April 7, 2021 07:29

istio-testing removed the do-not-merge/work-in-progress Block merging of a PR because it isn't ready yet. label Apr 7, 2021

asraa mentioned this pull request Apr 7, 2021

WASM module signature verification istio/istio#32002

Open

linsun reviewed Apr 9, 2021

View reviewed changes

hzxuzhonghu reviewed Apr 13, 2021

View reviewed changes

extensions/v1alpha1/wasm.proto Outdated Show resolved Hide resolved

hzxuzhonghu reviewed Apr 13, 2021

View reviewed changes

extensions/v1alpha1/wasm.proto Outdated Show resolved Hide resolved

dgn force-pushed the wasm-api branch from 7ac22e3 to c8a4c47 Compare April 13, 2021 12:34

linsun reviewed Apr 13, 2021

View reviewed changes

extensions/v1alpha1/wasm.proto Outdated Show resolved Hide resolved

howardjohn reviewed Apr 13, 2021

View reviewed changes

smawson reviewed Apr 13, 2021

View reviewed changes

louiscryan reviewed Apr 13, 2021

View reviewed changes

extensions/v1alpha1/wasm.proto Outdated Show resolved Hide resolved

brian-avery reviewed Jul 13, 2021

View reviewed changes

linsun reviewed Jul 14, 2021

View reviewed changes

mandarjog added the do-not-merge/hold Block automatic merging of a PR. label Jul 19, 2021

howardjohn approved these changes Jul 19, 2021

View reviewed changes

linsun approved these changes Jul 19, 2021

View reviewed changes

smawson reviewed Jul 19, 2021

View reviewed changes

brian-avery approved these changes Jul 20, 2021

View reviewed changes

Review comments

6a8013d

- Clarify default for priority - Add UNSPECIFIED_POLICY value - Change sha256 behaviour

smawson approved these changes Jul 22, 2021

View reviewed changes

dgn changed the title ~~Add WasmExtension API~~ Add WasmPlugin API Jul 27, 2021

mandarjog removed the do-not-merge/hold Block automatic merging of a PR. label Jul 27, 2021

istio-testing merged commit 5e10e5c into istio:master Jul 27, 2021

istio-testing mentioned this pull request Jul 27, 2021

[release-1.10] Add WasmPlugin API #2067

Closed

istio-testing mentioned this pull request Jul 27, 2021

[release-1.11] Add WasmPlugin API #2068

Closed

dgn mentioned this pull request Oct 14, 2021

Wasm API tracking issue istio/istio#35607

Closed

10 tasks

Add WasmPlugin API #1940

Add WasmPlugin API #1940

Conversation

dgn commented Apr 6, 2021 • edited Loading

istio-policy-bot commented Apr 6, 2021

istio-testing commented Apr 6, 2021

dgn commented Apr 6, 2021

dgn commented Apr 6, 2021

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

hzxuzhonghu commented Apr 13, 2021

dgn commented Apr 13, 2021

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

dgn Apr 19, 2021 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

linsun commented Apr 13, 2021

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

bianpengyuan commented Apr 13, 2021

mandarjog commented Jul 8, 2021

Choose a reason for hiding this comment

brian-avery commented Jul 13, 2021

Choose a reason for hiding this comment

Choose a reason for hiding this comment

linsun commented Jul 14, 2021

mandarjog commented Jul 14, 2021

dgn commented Jul 19, 2021

linsun commented Jul 19, 2021

mandarjog commented Jul 19, 2021

mandarjog commented Jul 19, 2021

mandarjog commented Jul 19, 2021

linsun left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

smawson commented Jul 22, 2021

mandarjog commented Jul 27, 2021

istio-testing commented Jul 27, 2021

istio-testing commented Jul 27, 2021

mandarjog commented Jul 27, 2021

istio-testing commented Jul 27, 2021

dgn commented Apr 6, 2021 •

edited

Loading

dgn Apr 19, 2021 •

edited

Loading