Add support for "extra_body" to OpenAILLMConfigEntry #1590


Merged · 9 commits into ag2ai:main · Apr 18, 2025

Conversation

@Hellisotherpeople (Contributor)

This will enable min_p sampling and other vLLM-specific features for AG2.

Why are these changes needed?

OpenAILLMConfigEntry is used for all OpenAI-compatible APIs, and vLLM's server is one of them. vLLM also supports a number of useful features, such as min_p sampling and robust max/min token limits, all of which are specified via "extra_body". Because extra_body was not supported by OpenAILLMConfigEntry, these techniques could not be used with local models.

I've tested this change with several models on vLLM, including Mistral Large, Llama 4 Maverick, and DeepSeek.
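For context, a minimal sketch of how this is intended to be used. The model name, endpoint, and API key below are placeholders, and min_p is a vLLM sampler setting passed through the OpenAI-compatible API via extra_body:

```python
# Hypothetical usage sketch: point an AG2 agent at a local vLLM server and
# forward vLLM-specific sampling options through "extra_body".
from autogen import ConversableAgent

llm_config = {
    "config_list": [
        {
            "api_type": "openai",            # vLLM exposes an OpenAI-compatible API
            "model": "my-local-model",       # placeholder: whatever model vLLM is serving
            "base_url": "http://localhost:8000/v1",  # placeholder endpoint
            "api_key": "not-needed",         # vLLM typically ignores the key
            "extra_body": {"min_p": 0.05},   # forwarded verbatim to the server
        }
    ]
}

agent = ConversableAgent(name="assistant", llm_config=llm_config)
```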

@CLAassistant commented Apr 9, 2025

CLA assistant check
All committers have signed the CLA.

@Hellisotherpeople (Contributor, Author)

I will add documentation and tests to this PR shortly.

@merlintang

This is what we want.

@davorrunje davorrunje self-assigned this Apr 10, 2025
@davorrunje davorrunje self-requested a review April 10, 2025 07:11
@davorrunje davorrunje marked this pull request as draft April 10, 2025 07:12
@davorrunje (Collaborator)

@Hellisotherpeople Thank you for the PR! I am waiting for docs and tests.

@davorrunje davorrunje requested a review from kumaranvpl April 10, 2025 07:13
@davorrunje davorrunje assigned kumaranvpl and unassigned davorrunje Apr 10, 2025
@davorrunje davorrunje removed their request for review April 10, 2025 07:14
@kumaranvpl (Collaborator) left a comment


Looks good to me. If you are adding extra_body, would you like to add extra_headers as well?
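To illustrate the suggestion, a hypothetical config-list entry combining both pass-through fields; the header name, token, model, and endpoint are made-up placeholders:

```python
# Hypothetical entry using both fields: extra_headers for custom HTTP headers
# (e.g. a gateway auth header) and extra_body for server-specific request
# options. All names and values here are placeholders.
entry = {
    "api_type": "openai",
    "model": "my-local-model",
    "base_url": "http://localhost:8000/v1",
    "extra_headers": {"X-Gateway-Token": "placeholder-token"},
    "extra_body": {"min_p": 0.05},
}
```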

@davorrunje davorrunje marked this pull request as ready for review April 10, 2025 09:20
@davorrunje davorrunje enabled auto-merge April 10, 2025 13:39
@Hellisotherpeople (Contributor, Author)

What's the best way to write tests for this? I figured I would need to point at a vLLM server, pass in something like min_p sampling, and verify that I get a (valid) response.

If we do not want to point at or run a vLLM server, what's the ideal way to test this?

@kumaranvpl (Collaborator)

> What's the best way to write tests for this? I figured I would need to point at a vLLM server, pass in something like min_p sampling, and verify that I get a (valid) response.
>
> If we do not want to point at or run a vLLM server, what's the ideal way to test this?

@Hellisotherpeople For now, just set some valid parameters for extra_body and assert that the model_dump is the same as expected, similar to this: https://github.com/ag2ai/ag2/blob/main/test/oai/test_client.py#L384-L399
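A hedged sketch of such a test, assuming OpenAILLMConfigEntry lives in autogen/oai/client.py (as the coverage report below indicates) and that model_dump() round-trips the field; the model name and key are dummies:

```python
# Sketch of the suggested unit test: construct an entry with extra_body set
# and assert it survives model_dump(). The exact required fields may differ
# from the real class definition.
from autogen.oai.client import OpenAILLMConfigEntry

def test_extra_body_is_preserved_in_model_dump():
    entry = OpenAILLMConfigEntry(
        model="gpt-4o-mini",     # dummy model name
        api_key="dummy-key",     # dummy credential
        extra_body={"min_p": 0.05},
    )
    assert entry.model_dump()["extra_body"] == {"min_p": 0.05}
```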

@kumaranvpl (Collaborator)

> Looks good to me. If you are adding extra_body, would you like to add extra_headers as well?

@Hellisotherpeople Please take a look at my earlier comment.

Merged via the queue into ag2ai:main with commit f8e5565 on Apr 18, 2025. 12 checks passed.

codecov bot commented Apr 18, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

❗ There is a different number of reports uploaded between BASE (9732e25) and HEAD (8711815): HEAD has 75 fewer uploads than BASE.
| Flag | BASE (9732e25) | HEAD (8711815) |
| --- | --- | --- |
| core-without-llm | 14 | 1 |
| 3.9 | 11 | 0 |
| ubuntu-latest | 14 | 1 |
| 3.13 | 3 | 0 |
| macos-latest | 4 | 0 |
| 3.11 | 3 | 1 |
| 3.10 | 3 | 0 |
| 3.12 | 3 | 0 |
| windows-latest | 5 | 0 |
| core-llm | 9 | 0 |
| cerebras | 1 | 0 |
| ollama | 1 | 0 |
| bedrock | 1 | 0 |
| gemini | 1 | 0 |
| openai-realtime | 1 | 0 |
| gemini-realtime | 1 | 0 |
| anthropic | 1 | 0 |
| deepseek | 1 | 0 |
| openai | 1 | 0 |

| Files with missing lines | Coverage Δ |
| --- | --- |
| autogen/oai/client.py | 53.60% <100.00%> (-23.54%) ⬇️ |

... and 64 files with indirect coverage changes


Labels: None yet
Projects: None yet
Participants: 5