Support Streaming MultiTask with response_model #221
Merged

Commits
All 13 commits are by Anmol6:

- 25e8e32 first commit
- da0b56e update warning
- 065c754 support all modes
- 4c76121 smol change
- 17c99ea formatting
- 1a27b0f cleanup
- a0fc78b update docs
- 73045b3 improve readability
- 16b443c Merge branch 'main' into add_multitask_stream
- 7ae58fb update docs
- 03d3fc7 bump to 0.4.0
- eb30b7e Merge branch 'main' into add_multitask_stream
- 5bf716e make non stream multitask iterable
Diff view
````diff
@@ -1,4 +1,4 @@
-# Streaming and MultiTask
+# Multi-task and Streaming
 
 A common use case of structured extraction is defining a single schema class and then making another schema to create a list to do multiple extraction
 
@@ -13,40 +13,44 @@ class Users(BaseModel):
     users: List[User]
 ```
 
-Defining a task and creating a list of classes is a common enough pattern that we define a helper function `MultiTask` It procides a function to dynamically create a new class that:
+Defining a task and creating a list of classes is a common enough pattern that we make this convenient by making use of `Iterable[T]`. This lets us dynamically create a new class that:
 
-1. Dynamic docstrings and class name baed on the task
-2. Helper method to support streaming by collectin function_call tokens until a object back out.
+1. Has dynamic docstrings and a class name based on the task
+2. Supports streaming by collecting tokens until a task is received back out.
 
-## Extracting Tasks using MultiTask
+## Extracting Tasks using Iterable
 
-By using multitask you get a very convient class with prompts and names automatically defined. You get `from_response` just like any other `BaseModel` you're able to extract the list of objects data you want with `MultTask.tasks`.
+By using `Iterable` you get a very convenient class with prompts and names automatically defined:
 
 ```python
 import instructor
 from openai import OpenAI
+from typing import Iterable
 from pydantic import BaseModel
 
-client = instructor.patch(OpenAI())
+client = instructor.patch(OpenAI(), mode=instructor.function_calls.Mode.JSON)
 
 class User(BaseModel):
     name: str
     age: int
 
-MultiUser = instructor.MultiTask(User)
+Users = Iterable[User]
 
-completion = client.chat.completions.create(
-    model="gpt-4-0613",
+users = client.chat.completions.create(
+    model="gpt-3.5-turbo-1106",
     temperature=0.1,
+    response_model=Users,
     stream=False,
-    functions=[MultiUser.openai_schema],
-    function_call={"name": MultiUser.openai_schema["name"]},
     messages=[
         {
             "role": "user",
-            "content": f"Consider the data below: Jason is 10 and John is 30",
+            "content": "Consider this data: Jason is 10 and John is 30.\
+                Correctly segment it into entities\
+                Make sure the JSON is correct",
         },
     ],
 )
+users.model_dump_json()
 ```
 
 ```json
````
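For readability, here is the updated non-streaming example assembled from the `+` lines of the hunk above. The JSON patch mode, model name, and prompt come straight from the diff; the closing `for` loop is my own substitution for the `users.model_dump_json()` call shown in the diff, leaning on the "make non stream multitask iterable" commit in this PR, so treat it as a sketch whose exact return type may vary by instructor version.

```python
import instructor
from openai import OpenAI
from typing import Iterable
from pydantic import BaseModel

# JSON mode is what the diff selects; the PR's "support all modes" commit
# suggests the function-calling modes work as well.
client = instructor.patch(OpenAI(), mode=instructor.function_calls.Mode.JSON)


class User(BaseModel):
    name: str
    age: int


# Iterable[User] replaces the old instructor.MultiTask(User) helper.
Users = Iterable[User]

users = client.chat.completions.create(
    model="gpt-3.5-turbo-1106",
    temperature=0.1,
    response_model=Users,
    stream=False,
    messages=[
        {
            "role": "user",
            "content": "Consider this data: Jason is 10 and John is 30. "
            "Correctly segment it into entities. "
            "Make sure the JSON is correct.",
        },
    ],
)

# Assumption: after the "make non stream multitask iterable" commit, the
# non-streaming result can be iterated directly as User objects.
for user in users:
    print(user)
```

The remaining hunks, below, make the same `MultiTask` to `Iterable` switch in the streaming example.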
````diff
@@ -60,18 +64,20 @@ completion = client.chat.completions.create(
 
 ## Streaming Tasks
 
-Since a `MultiTask(T)` is well contrained to `tasks: List[T]` we can make assuptions on how tokens are used and provide a helper method that allows you generate tasks as the the tokens are streamed in
+We can also generate tasks as the tokens are streamed in by defining an `Iterable[T]` type.
 
 Lets look at an example in action with the same class
 
 ```python hl_lines="6 26"
-MultiUser = instructor.MultiTask(User)
+from typing import Iterable
+
+Users = Iterable[User]
 
-completion = client.chat.completions.create(
+users = client.chat.completions.create(
     model="gpt-4",
     temperature=0.1,
     stream=True,
-    response_model=MultiUser,
+    response_model=Users,
     messages=[
         {
             "role": "system",
@@ -89,7 +95,7 @@ completion = client.chat.completions.create(
     max_tokens=1000,
 )
 
-for user in MultiUser.from_streaming_response(completion):
+for user in users:
     assert isinstance(user, User)
     print(user)
````
Review comment on lines 91 to 96: The note about streaming being a prototype is important and should be highlighted or made more prominent to ensure users are aware of its experimental status.
Review comment: The streaming example uses an undefined variable `input`, which may lead to confusion. It should be defined, or the example should be clarified to indicate that `input` is a placeholder for actual data.
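One way the docs could resolve this (a sketch, not the PR's actual wording): define `input` explicitly as placeholder text before the streaming call. The system and user messages below are assumptions on my part, since the real prompt lives in unchanged lines the diff does not show, and `client` and `User` are reused from the earlier example. Renaming the variable would also avoid shadowing Python's built-in `input`.

```python
from typing import Iterable

# Hypothetical placeholder for the undefined `input` the reviewer points out;
# any raw text to segment would do here.
input = "Jason is 10 and John is 30. Sarah is 25 and Mike is 40."

Users = Iterable[User]

users = client.chat.completions.create(
    model="gpt-4",
    temperature=0.1,
    stream=True,
    response_model=Users,
    messages=[
        # Assumed messages: the actual prompt is elided from the diff.
        {"role": "system", "content": "Extract all users from the text."},
        {"role": "user", "content": input},
    ],
    max_tokens=1000,
)

# Each User is yielded as soon as its tokens have streamed in.
for user in users:
    assert isinstance(user, User)
    print(user)
```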