Support Streaming MultiTask with response_model #221
Conversation
Walkthrough: The changes revolve around enhancing the streaming functionality to work with `response_model`.
Review Status
Actionable comments generated: 3
Configuration used: CodeRabbit UI
Files selected for processing (4)
- examples/streaming_multitask/streaming_multitask.py (2 hunks)
- instructor/function_calls.py (2 hunks)
- instructor/patch.py (6 hunks)
- tests/openai/test_multitask.py (2 hunks)
Additional comments: 12
examples/streaming_multitask/streaming_multitask.py (3)

16-17: The `User` model now includes an `age` field. Ensure that all uses of the `User` model throughout the codebase are updated to handle this new field.

21-26: The `stream_extract` function has been successfully refactored to return an `Iterable[User]` and no longer requires the `cls` parameter. This change simplifies the function's interface.

45-50: The usage of `stream_extract` in the loop has been updated to match its new signature, and the loop correctly processes the iterable of `User` objects.
instructor/function_calls.py (3)

190-194: The new parameter `stream_multitask` has been added to the `from_response` method. Ensure that all calls to this method have been updated to include this new parameter where necessary, or that the default behavior is acceptable in existing calls.

206-209: The conditional handling for `stream_multitask` is implemented correctly. However, ensure that the `from_streaming_response` method is defined and properly handles the streaming response as expected.

211-213: The `message` variable is accessed with dot notation, which assumes that `completion.choices[0].message` is an object with attributes rather than a dictionary. This is inconsistent with the previous usage in the `from_response` method, where `completion` is treated as a dictionary. Verify that `completion` is indeed an object with the expected structure, or adjust the code to use dictionary accessors.
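A minimal sketch of the kind of normalization that concern points at; the helper name is hypothetical and not part of the PR:

```python
from typing import Any

def get_message(completion: Any) -> Any:
    """Read the first message from either a dict-shaped or object-shaped completion."""
    if isinstance(completion, dict):
        return completion["choices"][0]["message"]
    return completion.choices[0].message
```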
instructor/patch.py (5)

3-6: The imports for `MultiTask`, `MultiTaskBase`, `get_origin`, `get_args`, and `Iterable` are correctly added to support the new streaming multitask functionality. These imports are necessary for the changes described in the summary.

52-60: The `handle_response_model` function has been updated to handle the `Iterable` type in the `response_model`. The logic to create a `MultiTask` instance based on the iterable element class is correct and aligns with the summary. However, there is a potential issue with the type ignore comment on line 59. It's important to ensure that the `openai_schema` function can handle the `response_model` correctly and that the type ignore is justified.
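For readers unfamiliar with how an `Iterable[User]` annotation gets unwrapped, here is a standalone demonstration of `get_origin`/`get_args`; the `User` class is a stand-in, not the PR's model:

```python
import collections.abc
from typing import Iterable, get_args, get_origin

class User:
    """Stand-in element model, not the PR's pydantic model."""

Users = Iterable[User]

# get_origin unwraps the generic alias; get_args returns the element class
# that would feed the MultiTask wrapper.
print(get_origin(Users) is collections.abc.Iterable)  # True
print(get_args(Users) == (User,))                     # True
```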
126-137: > Note: This review was outside the patches, so it was mapped to the patch with the greatest overlap. Original lines [112-136]
The `process_response` function now includes a `stream` parameter and handles the response differently if `stream` is set to `True` and the `response_model` is a subclass of `MultiTaskBase`. This is consistent with the summary and allows for the correct processing of streaming multitask responses.
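In outline, the branch being described looks something like this rough sketch (not the PR's exact code; the stub class stands in for instructor's `MultiTaskBase`):

```python
from typing import Any, Iterable, Type

class MultiTaskBase:
    """Stub standing in for instructor's MultiTaskBase."""

    @classmethod
    def from_streaming_response(cls, response: Iterable[Any], mode: str) -> Iterable[Any]:
        # In instructor this assembles tasks from streamed JSON chunks.
        raise NotImplementedError

def process_response(response: Any, response_model: Type[Any], stream: bool, mode: str) -> Any:
    # Streaming only applies to multitask wrappers, which know how to
    # build complete objects from partial JSON chunks as they arrive.
    if stream and issubclass(response_model, MultiTaskBase):
        return response_model.from_streaming_response(response, mode=mode)
    # Otherwise fall through to the ordinary single-completion parsing path.
    ...
```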
150-163: The `retry_async` function has been updated to include the `stream` parameter and pass it to the `process_response` function. This ensures that the streaming behavior is consistently applied across retries. However, there is a potential issue with the error handling in the except block. The `dump_message` function is called with `response.choices[0].message`, but there is no check to ensure that `response.choices` and `response.choices[0]` are not empty, which could lead to an `IndexError` if the response does not contain any choices.
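One way to guard that access, as an illustration of the fix being requested (not code from the PR):

```python
# Guard against a missing or empty choices list before reaching into it.
choices = getattr(response, "choices", None)
if not choices:
    raise ValueError("completion response contained no choices")
message = choices[0].message
```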
187-200: Similarly, the `retry_sync` function has been updated to handle the `stream` parameter. The same potential issue with error handling in the except block applies here as well. It is necessary to verify that `response.choices` and `response.choices[0]` are not empty before attempting to access `response.choices[0].message`.

tests/openai/test_multitask.py (1)

52-55: The `stream_extract` function in `test_multi_user_legacy` is correctly left unchanged as per the summary. However, the `cls` parameter is no longer used in the new `stream_extract` function, and the summary states that it was removed. If the intention was to remove it from the legacy function as well, this needs to be addressed. Otherwise, the legacy function is fine as is.
Well, this is amazing. Do you mind adding a little bit to the documentation?
Also, does this work for JSON mode or tool calls too?
just merged from main, updating it right now!
```python
Users = Iterable[User]


def stream_extract(input: str) -> Iterable[User]:
```
Suggested change:

```diff
-def stream_extract(input: str) -> Iterable[User]:
+def stream_extract(input: str) -> Users:
```
this would work too, right? or even `response_model=Iterable[User]`
love it
yup both work
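A tiny illustration of the two equivalent spellings discussed above (sketch only; the function names are hypothetical):

```python
from typing import Iterable

from pydantic import BaseModel

class User(BaseModel):
    name: str
    age: int

Users = Iterable[User]

# Both annotations describe the same response_model:
def stream_extract_a(input: str) -> Iterable[User]: ...
def stream_extract_b(input: str) -> Users: ...
```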
at least do something that just says only functions mode is allowed, etc.
Review Status
Actionable comments generated: 5
Configuration used: CodeRabbit UI
Files selected for processing (5)
- docs/concepts/patching.md (1 hunks)
- instructor/dsl/multitask.py (2 hunks)
- instructor/function_calls.py (2 hunks)
- instructor/patch.py (7 hunks)
- tests/openai/test_multitask.py (2 hunks)
Additional comments: 11
docs/concepts/patching.md (1)

27-31: The change from `Mode.TOOL_CALL` to `Mode.TOOLS` in the `client` instantiation is consistent with the updates mentioned in the pull request summary and reflects the new operational mode for streaming multitask operations.

instructor/dsl/multitask.py (5)

3-5: The import of `Mode` from `instructor.function_calls` is correct and necessary for the new `mode` parameter in `MultiTaskBase` methods.

12-14: The `from_streaming_response` method correctly accepts the `mode` parameter and uses it to extract JSON chunks based on the mode.

34-51: The `extract_json` method has been updated to handle different modes, but there is a potential issue with the assignment of `json_chunk` inside the if-else blocks. The `:=` operator is used, which is known as the walrus operator and is only available in Python 3.8 and later. If the codebase is intended to support earlier versions of Python, this will cause a syntax error.

31-31: The use of `yield` in `tasks_from_chunks` is correct and aligns with the streaming functionality described in the summary.

47-49: The `NotImplementedError` is raised correctly for unsupported modes, ensuring that the function fails fast and clearly when encountering an unexpected mode.
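Putting those observations together, a hedged sketch of what a mode-aware `extract_json` can look like (illustrative only; the chunk shapes are assumptions about the OpenAI streaming format, not the PR's exact code):

```python
from typing import Any, Generator, Iterable

def extract_json(completion: Iterable[Any], mode: str) -> Generator[str, None, None]:
    """Yield raw JSON fragments from streaming chunks, per mode."""
    for chunk in completion:
        delta = chunk.choices[0].delta
        if mode == "functions":
            # walrus operator (Python 3.8+), the construct the review flags
            if (json_chunk := getattr(delta, "function_call", None)) is not None:
                yield json_chunk.arguments
        elif mode == "json":
            if (json_chunk := getattr(delta, "content", None)) is not None:
                yield json_chunk
        else:
            # fail fast on unexpected modes, as the review recommends
            raise NotImplementedError(f"Mode {mode} is not supported")
```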
instructor/function_calls.py (4)

190-194: The `from_response` method has been updated to include a new parameter `stream_multitask`. This parameter is used to determine if the response should be processed as a streaming multitask operation. Ensure that all calls to this method are updated to handle the new parameter appropriately.

206-209: The conditional logic to handle streaming multitask operations is correctly implemented. When `stream_multitask` is `True`, the method delegates to `from_streaming_response` with the `completion` and `mode` parameters. This aligns with the changes described in the summary.

211-213: The `message` variable is accessed using dot notation (`completion.choices[0].message`), which assumes that `completion` is an object with attribute access. However, the previous version of `from_response` treated `completion` as a dictionary (`completion["choices"][0]["message"]`). This discrepancy could lead to an `AttributeError` if `completion` is not an object with the expected attributes. Verify that the `completion` parameter will always be an object that supports attribute access in the updated codebase.

213-213: The assertion within the `Mode.FUNCTIONS` branch checks the function name against the schema. This is a good practice to ensure that the response corresponds to the expected function. However, the code assumes that `message.function_call` and `message.function_call.name` are accessible, which may not be the case if the response format is different. It's important to verify that the response format from OpenAI's API will always match this structure.
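A defensive variant of that check, purely as an illustration of the reviewer's point (the function and `expected_name` parameter are hypothetical, not the PR's code):

```python
from typing import Any

def validate_function_call(completion: Any, expected_name: str) -> None:
    """Assert on the function name only after verifying the structure exists."""
    message = completion.choices[0].message
    function_call = getattr(message, "function_call", None)
    assert function_call is not None, "response contained no function_call"
    assert function_call.name == expected_name, (
        f"expected function {expected_name!r}, got {function_call.name!r}"
    )
```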
tests/openai/test_multitask.py (1)

36-39: The same typo identified in the previous review is still present in the string concatenation. The word "entitites" should be corrected to "entities", and spaces should be added at the end of lines 37 and 38 to ensure proper spacing in the concatenated string.

```diff
-"Correctly segment it into entitites"
+"Correctly segment it into entities "
```
Review Status
Actionable comments generated: 0
Configuration used: CodeRabbit UI
Files selected for processing (1)
- examples/streaming_multitask/streaming_multitask.py (2 hunks)
Additional comments: 3
examples/streaming_multitask/streaming_multitask.py (3)

16-17: The addition of the `age` field to the `User` model is consistent with the summary and should be reflected in the documentation and any serialization/deserialization logic that involves the `User` model.

21-26: The refactoring of the `stream_extract` function to return `Users` and the removal of the `cls` parameter is consistent with the summary. Ensure that all calls to `stream_extract` are updated to reflect these changes.

45-50: The call to `stream_extract` in the example usage correctly reflects the updated function signature without the `cls` parameter. This is consistent with the changes described in the summary.
Review Status
Actionable comments generated: 2
Configuration used: CodeRabbit UI
Files selected for processing (3)
- instructor/function_calls.py (2 hunks)
- instructor/patch.py (7 hunks)
- tests/openai/test_multitask.py (2 hunks)
Additional comments: 5
instructor/patch.py (4)

99-100: The previous comment chain indicates that the `else` block at lines 99-100 is unnecessary and should be removed. This change has not been reflected in the provided hunk. If the codebase has not changed in a way that would require this `else` block, it should be removed as previously suggested.

```diff
-        else:
-            raise ValueError(f"Invalid patch mode: {mode}")
```

107-134: The `process_response` function has been updated to handle streaming responses. The `stream` parameter is now used to determine if the response should be processed as a stream and whether the `_raw_response` attribute should be set. This change aligns with the pull request's goal to enhance streaming and multitasking capabilities. Ensure that all calls to `process_response` have been updated to pass the `stream` parameter correctly.

148-158: The `retry_async` function now accepts and passes the `stream` parameter to `process_response`. This change is consistent with the pull request's goal to support streaming responses. Ensure that all calls to `retry_async` have been updated to include the `stream` parameter where necessary.

185-195: Similarly, the `retry_sync` function has been updated to handle the `stream` parameter, ensuring that synchronous retries are compatible with the new streaming functionality. Ensure that all calls to `retry_sync` have been updated to include the `stream` parameter where necessary.

tests/openai/test_multitask.py (1)

34-36: The same issue from the previous review comments about missing spaces and a typo in the word "entities" is still present in the new code. This needs to be corrected to ensure proper string formatting and spelling.

```diff
-"Correctly segment it into entitites"
+"Correctly segment it into entities "
```
Review Status
Actionable comments generated: 2
Configuration used: CodeRabbit UI
Files selected for processing (2)
- docs/concepts/lists.md (2 hunks)
- instructor/patch.py (6 hunks)
Additional comments: 4
docs/concepts/lists.md (2)

65-68: The type hint for `Users` as `Iterable[User]` is appropriate for the streaming context described in the documentation.

89-94: The streaming example provided is a simplified demonstration of the new streaming multitask functionality. It is assumed that the reader will understand the context and that a more detailed example will be provided in a more comprehensive documentation section or with additional code comments.

instructor/patch.py (2)

145-155: The `stream` parameter is correctly passed to the `process_response` function within the `retry_async` function. This aligns with the new streaming functionality and ensures that the streaming behavior is consistent across asynchronous retries.

182-192: Similarly, the `stream` parameter is correctly passed to the `process_response` function within the `retry_sync` function. This ensures that the streaming behavior is consistent across synchronous retries as well.
Looks awesome! Let me know whenever you're down to merge.
Looks like coverage fell, but that's because the OpenAI tests are not run; no need to mock that.
Review Status
Actionable comments generated: 2
Configuration used: CodeRabbit UI
Files selected for processing (1)
- docs/concepts/lists.md (4 hunks)
Additional comments: 2
docs/concepts/lists.md (2)
1-4: The title and introduction seem appropriate for the document's content.
13-54: The use of `Iterable` for defining the `Users` class and its implementation in the `client.chat.completions.create` call is correct and follows the Python typing conventions.
The diff under review in `docs/concepts/lists.md`:

````diff
 ## Streaming Tasks

-Since a `MultiTask(T)` is well contrained to `tasks: List[T]` we can make assuptions on how tokens are used and provide a helper method that allows you generate tasks as the the tokens are streamed in
+We can also generate tasks as the tokens are streamed in by defining an `Iterable[T]` type.

 Lets look at an example in action with the same class

 ```python hl_lines="6 26"
-MultiUser = instructor.MultiTask(User)
+from typing import Iterable
+
+Users = Iterable[User]

-completion = client.chat.completions.create(
+users = client.chat.completions.create(
     model="gpt-4",
     temperature=0.1,
     stream=True,
-    response_model=MultiUser,
+    response_model=Users,
     messages=[
         {
             "role": "system",
````
> Note: This review was outside the patches, so it was mapped to the patch with the greatest overlap. Original lines [67-100]

The streaming example uses an undefined variable `input`, which may lead to confusion. It should be defined, or the example should be clarified to indicate that `input` is a placeholder for actual data.
```diff
     max_tokens=1000,
 )

-for user in MultiUser.from_streaming_response(completion):
+for user in users:
     assert isinstance(user, User)
     print(user)
```
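Consolidating the two hunks above, a self-contained version of the updated docs example might read as follows (the prompt text and the `instructor.patch` setup are assumptions based on the surrounding discussion, not copied from the PR):

```python
from typing import Iterable

import instructor
from openai import OpenAI
from pydantic import BaseModel

client = instructor.patch(OpenAI())

class User(BaseModel):
    name: str
    age: int

users = client.chat.completions.create(
    model="gpt-4",
    temperature=0.1,
    stream=True,
    response_model=Iterable[User],
    messages=[
        {"role": "system", "content": "Extract users from the text."},
        {"role": "user", "content": "Jason is 10 and Sarah is 34."},
    ],
    max_tokens=1000,
)

# Each User is yielded as soon as its tokens have streamed in.
for user in users:
    assert isinstance(user, User)
    print(user)
```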
The note about streaming being a prototype is important and should be highlighted or made more prominent to ensure users are aware of its experimental status.
Review Status
Actionable comments generated: 1
Configuration used: CodeRabbit UI
Files ignored due to filter (1)
- pyproject.toml
Files selected for processing (2)
- instructor/function_calls.py (2 hunks)
- tests/openai/test_multitask.py (2 hunks)
Files skipped from review due to trivial changes (1)
- instructor/function_calls.py
Additional comments: 1
tests/openai/test_multitask.py (1)

127-133: > Note: This review was outside the patches, so it was mapped to the patch with the greatest overlap. Original lines [105-133]
Verify that the `MultiUser.from_streaming_response` method is compatible with the `Mode.FUNCTIONS` parameter as used in the `test_multi_user_legacy` function.
Review Status
Actionable comments generated: 0
Configuration used: CodeRabbit UI
Files selected for processing (2)
- docs/concepts/lists.md (3 hunks)
- instructor/patch.py (6 hunks)
Additional comments: 9
docs/concepts/lists.md (4)

1-4: The title and introduction seem appropriate for the document's content.

67-67: Verify if the markdown renderer supports `hl_lines` for highlighting lines in code blocks, as this feature is not standard in all markdown renderers.

79-79: The use of the undefined variable `input` in the example code is still present and should be addressed.

91-97: The note about streaming being a prototype should be made more prominent to ensure users are aware of its experimental status.

instructor/patch.py (5)

62-62: The error message in the `NotImplementedError` is misleading because the check is for whether the `response_model` is a subclass of `MultiTaskBase`, not whether it is iterable. The error message should be updated to accurately reflect the condition being checked.

```diff
-raise NotImplementedError("stream=True is not supported when using response_model parameter for non-iterables")
+raise NotImplementedError("stream=True is not supported unless the response_model is a subclass of MultiTaskBase")
```

97-98: The `else` block at lines 97-98 is redundant since all possible values of `mode` are already handled by the preceding `if` and `elif` statements. If a new mode is added in the future, it should be handled explicitly. The `else` block can be removed to avoid confusion and potential dead code.

```diff
-        else:
-            raise ValueError(f"Invalid patch mode: {mode}")
```

123-133: The `process_response` function has been updated to handle multitask models and return tasks if `stream=False`. Ensure that the logic for handling the stream and multitask models is correct and that the raw response is returned when `stream` is `False`.

147-157: The `retry_async` function has been updated to include a `stream` parameter and pass it to the `process_response` function. Ensure that the `stream` parameter is being handled correctly and passed to the `process_response` function as intended.

184-194: The `retry_sync` function has been updated to include a `stream` parameter and pass it to the `process_response` function. Ensure that the `stream` parameter is being handled correctly and passed to the `process_response` function as intended.
Resolve #213