
Add Nginx - k8s manifest in CodeTrans #610

Merged

8 commits merged into opea-project:main on Aug 29, 2024

Conversation

letonghan
Collaborator

Description

Add k8s manifest for nginx in CodeTrans
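
A minimal sketch of what such an nginx manifest could look like. The resource names, image tag, and port values below are illustrative assumptions, not the actual values from this PR:

```yaml
# Hypothetical nginx Deployment + Service fronting the CodeTrans UI.
# Names, image, and ports are illustrative, not taken from the PR.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: codetrans-nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: codetrans-nginx
  template:
    metadata:
      labels:
        app: codetrans-nginx
    spec:
      containers:
        - name: nginx
          image: nginx:1.27
          ports:
            - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: codetrans-nginx
spec:
  selector:
    app: codetrans-nginx
  ports:
    - port: 80
      targetPort: 80
```

Applied with `kubectl apply -f <manifest>.yaml`, this exposes nginx inside the cluster as a ClusterIP Service named `codetrans-nginx`.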

Issues

n/a

Type of change

List the type of change like below. Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds new functionality)
  • Breaking change (fix or feature that would break existing design and interface)
  • Others (enhancement, documentation, validation, etc.)

Dependencies

None

Tests

Tested locally.

@letonghan letonghan requested a review from Spycsh as a code owner August 16, 2024 09:42
@chensuyue chensuyue added this to the v0.9 milestone Aug 17, 2024
@chensuyue chensuyue removed this from the v0.9 milestone Aug 19, 2024
@chensuyue chensuyue added this to the v1.0 milestone Aug 19, 2024
@letonghan
Collaborator Author

Merge after v0.9 release.

@chensuyue chensuyue merged commit 6a679ba into opea-project:main Aug 29, 2024
13 checks passed
xuechendi pushed a commit to xuechendi/GenAIExamples that referenced this pull request Sep 9, 2024
dmsuehir pushed a commit to dmsuehir/GenAIExamples that referenced this pull request Sep 11, 2024
@letonghan letonghan deleted the nginx/k8s branch September 23, 2024 03:15
JakubLedworowski pushed a commit to JakubLedworowski/GenAIExamples that referenced this pull request Jan 28, 2025
* Add monitoring support for the vLLM component

Signed-off-by: Eero Tamminen <[email protected]>

* Initial vLLM support for ChatQnA

For now, vLLM replaces only TGI, but since vLLM also supports embedding,
TEI-embed/-rerank may become replaceable later on.

Signed-off-by: Eero Tamminen <[email protected]>

* Fix HPA comments in tgi/tei/tererank values files

Signed-off-by: Eero Tamminen <[email protected]>

* Add HPA scaling support for ChatQnA / vLLM

Signed-off-by: Eero Tamminen <[email protected]>

* Adapt to latest vllm changes

- Remove --enforce-eager on HPU to improve performance
- Adapt to the upstream Docker entrypoint changes

Fixes issue opea-project#631.

Signed-off-by: Lianhao Lu <[email protected]>

* Clean up ChatQnA vLLM Gaudi parameters

Signed-off-by: Eero Tamminen <[email protected]>

---------

Signed-off-by: Eero Tamminen <[email protected]>
Signed-off-by: Lianhao Lu <[email protected]>
Co-authored-by: Lianhao Lu <[email protected]>
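
The HPA scaling support mentioned in the commits above could be sketched roughly as follows. The target Deployment name, replica bounds, and metric are assumptions for illustration, not the chart's actual configuration:

```yaml
# Hypothetical HorizontalPodAutoscaler for a vLLM Deployment.
# Target name, replica range, and CPU threshold are illustrative only.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: vllm
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: vllm
  minReplicas: 1
  maxReplicas: 4
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 75
```

In practice, LLM-serving autoscaling often keys on custom metrics (e.g. queue depth or request latency) rather than raw CPU, which requires a metrics adapter in the cluster.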