Structured outputs as an alternative to Tool Calling #582
FYI, here's an example where this works via `instructor`'s `MD_JSON` mode:

```python
import instructor
from datetime import date
from typing import TypedDict

from litellm import completion


class UserProfile(TypedDict, total=False):
    name: str
    dob: date
    bio: str


# Switching to `instructor.Mode.TOOLS` would result in the same error mentioned earlier.
client = instructor.from_litellm(completion, mode=instructor.Mode.MD_JSON)

user = client.chat.completions.create(
    model="gemini/gemini-2.0-flash-exp",
    messages=[
        {"role": "user", "content": "Generate a synthetic data"},
    ],
    response_model=UserProfile,
)
```

`user` yields `UserProfile(name='Alice Wonderland', dob=datetime.date(1990, 3, 15), bio='A curious individual who loves to explore and discover new things.')`.
@samuelcolvin, could you please take a look at this? My understanding is that we're already using the JSON schemas of models to guide coercing outputs to certain types...
Yes, but my proposal is actually to have a mode that, instead of using model providers' tool-calling APIs, parses the raw text response as JSON for a given `result_type`. Here is the current implementation for OpenAI models, which parses the model's raw response and tool calls separately:

pydantic-ai/pydantic_ai_slim/pydantic_ai/models/openai.py, lines 209 to 213 in c53c4e1

Under the proposed JSON mode, the code may look something like:

```python
if choice.message.content is not None:
    items.append(result_type.model_validate_json(choice.message.content))
```

and if the model fails to output JSON text, or the output does not pass validation, retry.
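A minimal sketch of that validate-and-retry loop, assuming a hypothetical `request_completion` callable that returns the model's raw completion text:

```python
from pydantic import BaseModel, ValidationError


def parse_with_retries(
    request_completion,  # hypothetical: () -> str, returns raw completion text
    result_type: type[BaseModel],
    max_retries: int = 3,
) -> BaseModel:
    """Validate the raw completion as JSON against `result_type`, retrying on failure."""
    last_error: ValidationError | None = None
    for _ in range(max_retries):
        raw = request_completion()
        try:
            # Pydantic parses the JSON string and validates field types in one step.
            return result_type.model_validate_json(raw)
        except ValidationError as exc:
            # The validation error could also be fed back into the next prompt.
            last_error = exc
    raise RuntimeError(f"model never produced valid JSON: {last_error}")
```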
See #514, which is related. You could implement this now in a custom model. I don't think there's any reason to move or copy that logic into `pydantic-ai` itself.
I'd be open to proposals/PRs with tweaks to the current model implementation that would make it easier to subclass/override and add functionality like this. However, I will note that we can probably improve the handling of schemas with `format` in their fields independently; I'll open a PR to do that shortly.
Thanks. I'll explore what can be done, as I still believe this is a crucial feature missing from many frameworks. Its implementation should not introduce significant complexity to the project, as it primarily involves prompting and validating string content using Pydantic models. Moreover, it's broadly applicable across all LLMs.
Currently, the open source model serving project I'm using does not support tool calls for this model, but OpenAI-style structured output is supported:

Request:

```json
{
  "model": "qwen2.5-32b",
  "temperature": 0.1,
  "messages": [
    {
      "role": "user",
      "content": "North city in the US"
    }
  ],
  "extra_body": {
    "guided_json": {
      "properties": {
        "city": {
          "title": "City",
          "type": "string"
        },
        "country": {
          "title": "Country",
          "type": "string"
        },
        "reason": {
          "title": "Reason",
          "type": "string"
        }
      },
      "required": ["city", "country", "reason"],
      "title": "MyModel",
      "type": "object"
    }
  }
}
```

Output:

```json
{
  "id": "chatcmpl-3d629978021b407d8163add87355a758",
  "created": 1736494263,
  "model": "qwen2.5-32b-awq",
  "object": "chat.completion",
  "system_fingerprint": null,
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "content": "{\"city\": \"Seattle\", \"country\": \"US\", \"reason\": \"Seattle is often referred to as the 'Emerald City' and is located in the northern part of the United States.\"}",
        "role": "assistant",
        "tool_calls": null,
        "function_call": null
      }
    }
  ],
  "usage": {
    "completion_tokens": 42,
    "prompt_tokens": 192,
    "total_tokens": 234,
    "completion_tokens_details": null,
    "prompt_tokens_details": null
  },
  "service_tier": null,
  "prompt_logprobs": null
}
```
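A request like the one above can be sent through the OpenAI Python SDK's `extra_body` parameter. This is a sketch, assuming an OpenAI-compatible server at `http://localhost:8000/v1` that understands `guided_json`:

```python
from openai import OpenAI
from pydantic import BaseModel


class MyModel(BaseModel):
    city: str
    country: str
    reason: str


# Assumed endpoint; any OpenAI-compatible server that accepts `guided_json` works.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="unused")

response = client.chat.completions.create(
    model="qwen2.5-32b",
    temperature=0.1,
    messages=[{"role": "user", "content": "North city in the US"}],
    # Non-standard fields such as `guided_json` pass through via `extra_body`.
    extra_body={"guided_json": MyModel.model_json_schema()},
)

# The constrained output lands in `message.content` as a plain JSON string.
result = MyModel.model_validate_json(response.choices[0].message.content)
```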
We should support structured outputs as well as tool calls.
Looks like #242 is also related? Seems like structured output is the way to go, since many providers support it natively.
Please note that the Structured Output APIs from these two providers both have limitations: they only support a subset of JSON Schema. Attributes like `maxProperties`, `propertyNames`, `exclusiveMinimum`, and `minLength` in the schema below are not supported. For example:

```python
from typing import Annotated

from pydantic import BaseModel, Field


class User(BaseModel):
    details: dict[
        Annotated[str, Field(description="User name", min_length=1)],
        Annotated[int, Field(description="User ID", gt=3)],
    ] = Field(max_length=1)
```

Its corresponding JSON schema:

```json
{
  "properties": {
    "details": {
      "additionalProperties": {
        "description": "User ID",
        "exclusiveMinimum": 3,
        "type": "integer"
      },
      "maxProperties": 1,
      "propertyNames": {
        "description": "User name",
        "minLength": 1
      },
      "title": "Details",
      "type": "object"
    }
  },
  "required": ["details"],
  "title": "User",
  "type": "object"
}
```
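The schema above is exactly what Pydantic emits for that model; a quick way to check:

```python
import json
from typing import Annotated

from pydantic import BaseModel, Field


class User(BaseModel):
    details: dict[
        Annotated[str, Field(description="User name", min_length=1)],
        Annotated[int, Field(description="User ID", gt=3)],
    ] = Field(max_length=1)


# Prints the schema shown above, including `maxProperties`, `propertyNames`,
# `exclusiveMinimum`, and `minLength`.
print(json.dumps(User.model_json_schema(), indent=2))
```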
Issue Description:

Currently, `pydantic-ai` implements structured output solely using tool-calling APIs from model providers. While this works in most cases, certain schemas supported by `pydantic` exhibit inconsistencies between model providers. For instance, the following schema from the documentation does not work with Gemini models:

This results in the following error:

In this example, the inconsistency stems from the model provider's limitations. However, based on my observations working with tools like `instructor`, modern LLMs are increasingly proficient at adhering to JSON-format prompts in their text responses. In fact, they often produce better-formed JSON in standard completion mode than in tool-calling mode. The Berkeley Function-Calling Leaderboard may provide further evidence of this trend.

Feature Request

Would it be possible for `pydantic-ai` to implement an alternative mode akin to `instructor`'s `MD_JSON` mode? This mode could use prompt engineering to guide the LLM's output and parse the resulting JSON as raw text rather than relying on tool-calling APIs. Such a feature would, among other things, preserve `pydantic`'s full schema flexibility.

Thank you for considering this suggestion!
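For illustration, a minimal sketch of the two pieces such a mode would need: prompt construction and fenced-JSON extraction. The names `build_prompt` and `parse_md_json` are hypothetical, and this is not necessarily how `instructor` implements `MD_JSON` internally:

```python
import json
import re

from pydantic import BaseModel


def build_prompt(question: str, result_type: type[BaseModel]) -> str:
    """Embed the result type's JSON schema in the prompt."""
    schema = json.dumps(result_type.model_json_schema())
    return (
        f"{question}\n\n"
        "Answer ONLY with a JSON object inside a markdown code fence, "
        f"conforming to this JSON schema:\n{schema}"
    )


def parse_md_json(raw: str, result_type: type[BaseModel]) -> BaseModel:
    """Extract the fenced JSON block from the raw completion and validate it."""
    match = re.search(r"`{3}(?:json)?\s*(\{.*\})\s*`{3}", raw, re.DOTALL)
    payload = match.group(1) if match else raw  # fall back to the whole response
    return result_type.model_validate_json(payload)
```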