# Gateway API Inference Extension

The Gateway API Inference Extension came out of wg-serving and is sponsored by SIG Network. This repo contains the load-balancing algorithm, ext-proc code, CRDs, and controllers of the extension.

This extension is intended to provide value to multiplexed LLM services on a shared pool of compute. See the proposal for more info.
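As a rough illustration of how a shared pool is declared, the sketch below shows an `InferencePool`-style resource grouping model-server pods behind the extension. This is a hypothetical example only: the names `my-pool`, `my-epp`, the label `app: vllm-llama`, and the exact API version and field names are assumptions, and the authoritative CRD definitions live in this repo.

```yaml
# Hypothetical sketch of an InferencePool resource; consult the CRDs in
# this repo for the actual apiVersion, kind, and field names.
apiVersion: inference.networking.x-k8s.io/v1alpha2
kind: InferencePool
metadata:
  name: my-pool            # assumed name for illustration
spec:
  selector:
    app: vllm-llama        # selects the model-server pods in the shared pool
  targetPortNumber: 8000   # port the model servers listen on
  extensionRef:
    name: my-epp           # assumed name of the endpoint-picker extension
```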

## Status

This project is currently in development.

## Getting Started

Follow this README to get the inference-extension up and running on your cluster!

## End-to-End Tests

Follow this README to learn more about running the inference-extension end-to-end test suite on your cluster.

## Website

Detailed documentation is available on our website: https://gateway-api-inference-extension.sigs.k8s.io/

## Contributing

Our community meeting is held weekly on Thursdays at 10AM PDT (Zoom, Meeting Notes).

We currently use the #wg-serving Slack channel for communications.

Contributions are readily welcomed; follow the dev guide to start contributing!

## Code of conduct

Participation in the Kubernetes community is governed by the Kubernetes Code of Conduct.