Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

vLLM server capabilities #260

Closed
QwertyJack opened this issue Aug 29, 2024 · 4 comments · Fixed by #263
Closed

vLLM server capabilities #260

QwertyJack opened this issue Aug 29, 2024 · 4 comments · Fixed by #263

Comments

@QwertyJack
Copy link

Thank you for your excellent work!

I've noticed that you've implemented a vLLM-based server with some modifications, but I have a few questions:

  1. Why did you omit the served-model parameter? It's quite useful for loading a model from local storage while maintaining the same name.
  2. Do you think including utility endpoints like /metrics and /health would be beneficial?

Thanks in advance.

@jeffreymeetkai
Copy link
Collaborator

Thanks for pointing this out. I've enabled --served-model-name and you should be able to load a local model while accepting requests with another name (e.g.: official model names like meetkai/functionary-small-v3.2).

For question 2, we do not find the need to implement yet as it seem to involve long lines of code to implement as seen from here, and there is also not much demand for it. We also use a monkey-patched AsyncLLMEngine here when grammar sampling is enabled. I'm not sure if monkey-patching other parts of vLLM is needed to get /health to work with this too.

@QwertyJack
Copy link
Author

I see.
Btw, I found an easy way to integrate those two endpoints. PTAL

@nskumz
Copy link

nskumz commented Sep 24, 2024

@QwertyJack , could you please share the process to get the vLLM endpoint /metrics from this model

@QwertyJack
Copy link
Author

@QwertyJack , could you please share the process to get the vLLM endpoint /metrics from this model

After #263 is merged, vLLM style endpoints GET /metrics and GET /health will be available after the service starts.

Using the latest main branch and following the README should ensure it works smoothly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants