Wasted Good models #758
I don't understand what you're asking; please can you provide more details? Also, please no all caps.
We have several promising models, such as Phi and DeepSeek R1, that excel in various aspects. However, these models do not natively support tool use, as they lack the necessary token structures for delimiting tool-use queries and responses. That said, with a well-designed prompt, these models can be adapted to produce structured outputs suitable for tool use. Given that Pydantic AI is heavily optimized for tool use and works best with structured outputs, we are currently underutilizing these high-quality models by not integrating them effectively. The key challenge is designing the right prompting and parsing strategy to bridge this gap. With the right approach, we can make non-tool-use models work seamlessly within a tool-using framework like Pydantic AI, maximizing their potential.
The adapter could live at the API level, as a generic tool that lets anyone map model answers to tool calls and back, or it could be built into the flows of Pydantic AI itself.
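An API-level adapter might look roughly like the following sketch: wrap any plain text-completion callable so it behaves like a tool-calling model. Everything here (`run_with_tools`, the prompt wording, the `complete` callable) is a hypothetical illustration, not an existing Pydantic AI API.

```python
import json
from typing import Callable

def run_with_tools(
    complete: Callable[[str], str],
    tools: dict[str, Callable[..., str]],
    user_request: str,
) -> str:
    """Adapt a non-tool-use model: prompt for JSON, parse, dispatch."""
    tool_list = ", ".join(tools)
    prompt = (
        "Respond ONLY with JSON of the form "
        '{"tool_name": <one of: ' + tool_list + '>, "arguments": {...}}\n'
        "Request: " + user_request
    )
    raw = complete(prompt)
    # Pull the JSON span out of the raw reply (models often add prose).
    start, end = raw.find("{"), raw.rfind("}")
    call = json.loads(raw[start : end + 1])
    # Map the parsed call onto the real Python function and return its result.
    return tools[call["tool_name"]](**call["arguments"])
```

Because `complete` is just a callable, the same adapter works over any HTTP completion endpoint or local model without touching the framework's internals.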
Would a "well-designed prompt" be as reliable as
@kerolos-gadalla I think the main issue you are having is probably how to use the Agent class correctly. I don't believe these models (DeepSeek R1, Phi-3, Phi-4, etc.) are wasted. I am able to use the Agent class with pretty much any model I choose. You just have to be conscious of the capabilities and limitations of the LLM you are using.
The smaller models have clear limitations, yet developers keep trying to make them sing soprano when they can't even get out of the bass/baritone range. We are seeing a barrage of GitHub issues about structured output and tool calling with these small models, and I think it comes from not understanding the models' capabilities and how to use them with the Agent class. I hope this explains it better.
Duplicate of #582; we intend to support structured outputs as well as tool calls for structured result types.
Can we find a way to adapt non-tool-use models to Pydantic AI's tool use? It should mostly be a matter of prompting and parsing.