-
Notifications
You must be signed in to change notification settings - Fork 44
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Envoy update #18
Envoy update #18
Conversation
Should we just move this out of example/poc, now it's more "official"? EDIT: nvm, I think we can do this as a follow up, in which we should release the new ext proc images. |
+1. |
I agree with this being top level. I was thinking about that when I was writing this and was waiting for feedback from others. What do we think about just a duplication? Similar to how we have done with the ext-proc image refactor. |
Sounds good to me. There are a couple of things: 1. We should build new ext proc from pkg/ext-proc instead of the poc one. 2. The "pods" (still need "podIPs") and "ensureFairness" flags are removed, need to update the yaml accordingly. Feel free to do this in a follow up though. |
# Specifically the field `original_dst_lb_config` allows us to enable | ||
# `use_http_header` and `http_header_name`. | ||
# Source: https://www.envoyproxy.io/docs/envoy/latest/api-v3/config/cluster/v3/cluster.proto | ||
- type: "type.googleapis.com/envoy.config.cluster.v3.Cluster" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🚀
…fests to the top level Ext-Proc implementation
/label tide/merge-method-squash |
A custom GatewayClass `llm-gateway` which is configured with the llm routing ext proc will be installed into the `llm-gateway` namespace. It's configured to listen on port 8081 for traffic through ext-proc (in addition to the default 8080), see the `EnvoyProxy` configuration in `installation.yaml`. When you create Gateways, make sure the `llm-gateway` GatewayClass is used. | ||
1. **Update Envoy Gateway Config to enable Patch Policy** | ||
|
||
Our custom LLM Gateway ext-proc is patched into the existing envoy gateway via `EnvoyPatchPolicy`. To enable this feature, we must extend the Envoy Gateway config map. To do this, simply run: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have instructions for deploying the Envoy gateway controller?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup! The quickstart on line 10 points them there
- name: instance-gateway-ext-proc | ||
image: ghcr.io/tomatillo-and-multiverse/ext-proc:demo | ||
args: | ||
#TODO: specify label selector and dynamically update pods |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should actually pass the name of the LLMServerPool
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, totally agree, I plan to have ext-proc pull the selection from the config on the LSP so there is always a single source of truth (once we have that set up)
@@ -0,0 +1,26 @@ | |||
apiVersion: v1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this and the other yamls under pkg/manifests seem duplicate to the ones in the examples/poc folder, do we need both?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, you're absolutely right. I think eventually we just want to delete the poc directory. But while we are still using forks of vLLM to operate. I think keeping the poc folder and a way to set it up there kind of implies the WIP nature of our project.
Or we can always fast-follow this PR with removing the poc dir
LMKWYT!
/approve |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: ahg-g, kfswain The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
This modifies our Ext-Proc integration to use Envoy best practices, via
EnvoyExtensionPolicy
andEnvoyPatchPolicy
.Here we create the ext-proc integration in EnvoyExtensionPolicy. We then create an
ORIGINAL_DST
cluster in envoy, and create a listener to attach our ext-proc deployment to this newly created cluster.Also update the readme to reflect these changes.
fixes #10