blog: Third party models #284

Anmol6 · 2023-12-16T23:36:48Z

Summary by CodeRabbit

New Features
- Introduced a new blog post on integrating Anyscale with Instructor.
- Enhanced Anyscale's OpenAI client with response_model and max_retries options.
Documentation
- Updated blog post links and corrected filenames.
- Made textual and formatting improvements to various documentation files.
- Removed "API Usage Monitoring" feature from CLI documentation.
- Adjusted content related to "Model Fine-Tuning" in CLI documentation.
Bug Fixes
- Fixed minor textual issues in CLI documentation, including spacing, punctuation, and capitalization.
Refactor
- Streamlined from_response_async function by removing throw_error parameter.

coderabbitai · 2023-12-16T23:36:52Z

Walkthrough

The recent updates involve a mix of content and functionality changes. A new blog post about Anyscale's integration with Instructor has been added, and a typo in a blog post link was corrected. The documentation saw minor text adjustments, including the removal of a feature related to API usage monitoring and a change in the error handling behavior of a function in the instructor module. These edits refine user documentation and code functionality without introducing major shifts in logic.

Changes

| File Path | Change Summary |
| --- | --- |
| `docs/blog/.../index.md` | Added a new blog post; corrected a blog post link filename. |
| `docs/blog/.../anyscale.md` | Introduced Anyscale's Mistral model support for structured outputs. |
| `docs/blog/.../introduction.md` | Minor textual changes; removed whitespace. |
| `docs/cli/.../finetune.md` | Textual adjustments; fixed spacing and punctuation. |
| `docs/cli/index.md`, `docs/cli/usage.md` | Removed "API Usage Monitoring" feature; minor text modifications. |
| `instructor/function_calls.py` | Removed `throw_error` parameter from `from_response_async` function. |

🐇✨
To celebrate the code and prose,
We hopped through docs, leaving those
Small tweaks and fixes, clear as day,
Now let's hop on, hip hip hooray! 🎉
🐇✨

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on X?

Tips

Chat with CodeRabbit Bot (`@coderabbitai`)

You can reply to a review comment made by CodeRabbit.
You can tag CodeRabbit on specific lines of code or files in the PR by tagging @coderabbitai in a comment.
You can tag @coderabbitai in a PR comment and ask one-off questions about the PR and the codebase. Use quoted replies to pass the context for follow-up questions.

CodeRabbit Commands (invoked as PR comments)

`@coderabbitai pause` to pause the reviews on a PR.
`@coderabbitai resume` to resume the paused reviews.
`@coderabbitai review` to trigger a review. This is useful when automatic reviews are disabled for the repository.
`@coderabbitai resolve` to resolve all the CodeRabbit review comments.
`@coderabbitai help` to get help.

Additionally, you can add `@coderabbitai ignore` anywhere in the PR description to prevent this PR from being reviewed.

CodeRabbit Configuration File (`.coderabbit.yaml`)

You can programmatically configure CodeRabbit by adding a `.coderabbit.yaml` file to the root of your repository.
The JSON schema for the configuration file is available here.
If your editor has a YAML language server enabled, you can add the following line at the top of the file to enable auto-completion and validation: `# yaml-language-server: $schema=https://coderabbit.ai/integrations/coderabbit-overrides.v2.json`

coderabbitai

Review Status

Actionable comments generated: 3

Configuration used: CodeRabbit UI

Commits

Files that changed from the base of the PR and between f0d76ac and 744240c.

Files selected for processing (7)

docs/blog/index.md (1 hunks)
docs/blog/posts/anyscale.md (1 hunks)
docs/blog/posts/introduction.md (9 hunks)
docs/cli/finetune.md (6 hunks)
docs/cli/index.md (1 hunks)
docs/cli/usage.md (2 hunks)
instructor/function_calls.py (1 hunks)

Files skipped from review due to trivial changes (3)

docs/blog/index.md
docs/cli/finetune.md
docs/cli/index.md

Additional comments: 7

docs/blog/posts/anyscale.md (3)
15-17: The introduction provides a clear overview of the blog post's topic and sets expectations for the reader.

21-24: The description of the patching features is concise and informative, explaining the enhancements to the OpenAI API client.

75-75: Verify that the provided documentation link is correct and leads to the relevant information about Anyscale's output mode support.

Verification successful

The provided documentation link https://docs.endpoints.anyscale.com/ is correct and leads to an accessible webpage, as indicated by the HTTP status code 200 which means OK. Therefore, the link is valid and should lead to the relevant information about Anyscale's output mode support as mentioned in the blog post.
Scripts Executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Verify that the documentation link is valid and accessible.
curl -o /dev/null -s -w "%{http_code}\n" https://docs.endpoints.anyscale.com/
Length of output: 83
docs/blog/posts/introduction.md (2)

2-2: The change of the draft field from True to False indicates that this blog post is now published. If this is intentional, please ensure that all content is finalized and ready for public viewing.

188-188: The addition of a call-to-action for readers to star the instructor GitHub repository is a good practice for community engagement.

docs/cli/usage.md (1)

54-56: The addition of the "Contributions" section is a good practice for encouraging community engagement in open-source projects.
instructor/function_calls.py (1)
174-179: The removal of the throw_error parameter from the from_response_async method changes the function signature. Ensure that all calls to this method throughout the codebase have been updated to match the new signature.

Verification successful

The previous script contained an error that led to a misinterpretation of the regex pattern as a file path. I will correct the script to properly search for the function calls that include the throw_error parameter.

The script has identified calls to from_response_async in tests/test_function_calls.py and instructor/patch.py. To ensure that the calls to this method have been updated to match the new signature without the throw_error parameter, I will generate a script to inspect the parameters passed in these calls.

The script output does not provide the necessary details to confirm if the calls to from_response_async have been updated to match the new signature. I will generate a script to extract the lines of code where from_response_async is called along with the parameters passed to it.

Based on the provided context, the calls to from_response_async in tests/test_function_calls.py and instructor/patch.py have been updated to match the new signature without the throw_error parameter. No further action is required regarding these calls.
Scripts Executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Search for the old method signature to find potential calls that need updating.
rg 'from_response_async\(' --pcre2 '.*throw_error.*'
Length of output: 110

Script:
#!/bin/bash
# Search for the old method signature to find potential calls that need updating.
rg 'from_response_async\(' --pcre2
Length of output: 287

Script:
#!/bin/bash
# Inspect the calls to `from_response_async` in the identified files to ensure they match the updated signature.
rg 'from_response_async\(' tests/test_function_calls.py instructor/patch.py --pcre2
Length of output: 272

Script:
#!/bin/bash
# Extract the lines of code where `from_response_async` is called along with the parameters passed to it.
rg 'await .*\.from_response_async\(' tests/test_function_calls.py instructor/patch.py -C 3
Length of output: 889
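
The text searches above find every call site, but they cannot tell which calls still pass `throw_error` without reading each match by hand. A syntax-tree scan is one way to narrow that down. The following is a minimal sketch, not part of the review itself: the helper name `find_stale_keyword_calls` and the sample source are hypothetical, and the scan only catches the keyword form (`throw_error=...`), not a value passed positionally.

```python
import ast


def find_stale_keyword_calls(source: str, func_name: str, keyword: str) -> list[int]:
    """Return line numbers of calls to `func_name` that still pass `keyword`."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        target = node.func
        # Handle both `obj.from_response_async(...)` and a bare `from_response_async(...)`.
        if isinstance(target, ast.Attribute):
            name = target.attr
        else:
            name = getattr(target, "id", None)
        if name == func_name and any(kw.arg == keyword for kw in node.keywords):
            hits.append(node.lineno)
    return sorted(hits)


# Hypothetical call sites, written only for this illustration.
sample = '''
async def run(model, completion):
    a = await model.from_response_async(completion, throw_error=True)
    b = await model.from_response_async(completion, strict=True)
    return a, b
'''

stale_lines = find_stale_keyword_calls(sample, "from_response_async", "throw_error")
print(stale_lines)  # [3]
```

In a real repository the same helper could be run over the contents of each file that the `rg` search reported.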

coderabbitai · 2023-12-17T03:03:49Z

docs/blog/posts/introduction.md

 class UserDetail(pydantic.BaseModel):
    name: str
    age: int
-    
+
    def introduce(self):
        return f"Hello I'm {self.name} and I'm {self.age} years old"


The UserDetail class is defined multiple times with different attributes throughout the document. If these are meant to be separate examples, consider renaming the classes to avoid confusion. If they are supposed to be the same class, ensure consistency in the class definition.

Also applies to: 93-94, 107-107, 123-123, 139-139
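
To illustrate why the repeated class name matters when a reader runs the snippets in one session: a second definition with the same name silently replaces the first. This is a minimal sketch that uses standard-library dataclasses as a stand-in for `pydantic.BaseModel`; the `role` field and the `UserDetailWithRole` name are hypothetical and do not come from the blog post.

```python
from dataclasses import dataclass, fields


@dataclass
class UserDetail:
    name: str
    age: int


first_definition = UserDetail


# Reusing the name rebinds `UserDetail`; the earlier class is no longer reachable by name.
@dataclass
class UserDetail:
    name: str
    age: int
    role: str = "unknown"  # hypothetical extra attribute


shadowed = UserDetail is not first_definition
field_names = [f.name for f in fields(UserDetail)]
print(shadowed, field_names)  # True ['name', 'age', 'role']


# Giving each example its own name keeps both definitions usable side by side.
@dataclass
class UserDetailWithRole:
    name: str
    age: int
    role: str = "unknown"
```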

coderabbitai · 2023-12-17T03:03:49Z

docs/blog/posts/anyscale.md

+```python
+from openai import OpenAI
+from pydantic import BaseModel
+
+import instructor
+
+
+class UserDetails(BaseModel):
+    name: str
+    age: int
+
+# enables `response_model` in create call
+client = instructor.patch(
+    OpenAI(
+        base_url="https://api.endpoints.anyscale.com/v1",
+        api_key="<YOUR_ANYSCALE_API_KEY>
+    ),
+    # This uses Anyscale's json schema output mode
+    mode=instructor.Mode.JSON_SCHEMA
+)
+
+resp = client.chat.completions.create(
+    model="mistralai/Mixtral-8x7B-Instruct-v0.1",
+    messages=[
+        {
+            "role": "system",
+            "content": "You are a world class extractor"
+        },
+        {
+            "role": "user",
+            "content": 'Extract the following entities: "Jason is 20"'
+        },
+    ],
+    response_model=UserDetails,
+)
+print(resp)
+>>> name='Jason' age=20


The code snippet is clear and demonstrates the usage of the instructor.patch method with the OpenAI client, including the new response_model and max_retries features. However, there is a syntax error in the api_key string on line 51, which is missing the closing quotation mark.

- api_key="<YOUR_ANYSCALE_API_KEY>
+ api_key="<YOUR_ANYSCALE_API_KEY>"
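
A quick way to confirm this kind of error before publishing a snippet is to ask Python to compile it. The sketch below uses only the built-in `compile` function; the helper name `parses` is hypothetical, and the two strings are the before and after versions of the line in question.

```python
def parses(snippet: str) -> bool:
    """Return True if `snippet` is syntactically valid Python."""
    try:
        compile(snippet, "<snippet>", "exec")
    except SyntaxError:
        return False
    return True


broken = 'api_key="<YOUR_ANYSCALE_API_KEY>\n'   # closing quote missing
fixed = 'api_key="<YOUR_ANYSCALE_API_KEY>"\n'   # closing quote restored

print(parses(broken), parses(fixed))  # False True
```

Compiling checks syntax only; it does not run the snippet or verify that the API key or endpoint is valid.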

Committable suggestion

❗ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change

```python
from openai import OpenAI
from pydantic import BaseModel

import instructor


class UserDetails(BaseModel):
    name: str
    age: int

# enables `response_model` in create call
client = instructor.patch(
    OpenAI(
        base_url="https://api.endpoints.anyscale.com/v1",
        api_key="<YOUR_ANYSCALE_API_KEY>"
    ),
    # This uses Anyscale's json schema output mode
    mode=instructor.Mode.JSON_SCHEMA
)

resp = client.chat.completions.create(
    model="mistralai/Mixtral-8x7B-Instruct-v0.1",
    messages=[
        {
            "role": "system",
            "content": "You are a world class extractor"
        },
        {
            "role": "user",
            "content": 'Extract the following entities: "Jason is 20"'
        },
    ],
    response_model=UserDetails,
)
print(resp)
>>> name='Jason' age=20
```

coderabbitai · 2023-12-17T03:03:50Z

instructor/function_calls.py

            completion (openai.ChatCompletion): The response from an openai chat completion
-            throw_error (bool): Whether to throw an error if the function call is not detected
            validation_context (dict): The validation context to use for validating the response
            strict (bool): Whether to use strict json parsing
            mode (Mode): The openai completion mode


The docstring for from_response_async still references the throw_error parameter, which has been removed from the method signature. The docstring should be updated to accurately reflect the current parameters.

coderabbitai

Review Status

Actionable comments generated: 2

Configuration used: CodeRabbit UI

Commits

Files that changed from the base of the PR and between 78a8905 and 1ebc9eb.

Files selected for processing (7)

docs/blog/index.md (1 hunks)
docs/blog/posts/anyscale.md (1 hunks)
docs/blog/posts/introduction.md (9 hunks)
docs/cli/finetune.md (6 hunks)
docs/cli/index.md (1 hunks)
docs/cli/usage.md (2 hunks)
instructor/function_calls.py (1 hunks)

Files skipped from review due to trivial changes (5)

docs/blog/index.md
docs/blog/posts/introduction.md
docs/cli/finetune.md
docs/cli/index.md
docs/cli/usage.md

Additional comments: 1

docs/blog/posts/anyscale.md (1)

15-75: The blog post content appears to be well-structured and informative, explaining the use of Anyscale's Mistral model and structured outputs with JSON schema. However, there are a few points to consider:

The code snippet provided in lines 36-72 should be verified for accuracy and tested to ensure it works as intended. This is crucial since users will likely try to run the example code.

The link provided in line 28 should be verified to ensure it correctly points to the intended documentation. Broken links in documentation can lead to a poor user experience.

The API key placeholder in line 51 should be clearly marked as such to prevent any confusion. It might be beneficial to include a note or comment indicating that users need to replace <YOUR_ANYSCALE_API_KEY> with their actual API key.

The output example in line 72 should be verified to ensure it accurately represents the expected output of the code snippet. This helps set the right expectations for users.

The link in line 75 should also be verified for correctness and to ensure it provides additional relevant information about Anyscale's output mode support.

coderabbitai · 2023-12-17T03:05:19Z

instructor/function_calls.py


        Parameters:
            completion (openai.ChatCompletion): The response from an openai chat completion
-            throw_error (bool): Whether to throw an error if the function call is not detected
            validation_context (dict): The validation context to use for validating the response
            strict (bool): Whether to use strict json parsing
            mode (Mode): The openai completion mode


💡 NOTE
Codebase verification is a beta feature.

The search results confirm that there is only one instance of from_response_async being called in the tests/test_function_calls.py file, and it is with a single argument. Since no other definition of a from_response_async method within the TestModel class was found, it appears that the test has not been updated to reflect the new function signature from instructor/function_calls.py, which now requires four arguments.

This could be an oversight and might need to be addressed to ensure the test is functioning correctly. All other calls to from_response_async in the codebase have been verified and are consistent with the updated function signature.

The call to from_response_async in tests/test_function_calls.py has not been updated to match the new function signature and may require correction.
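
Whether a single-argument call is actually a mismatch depends on whether the remaining parameters have default values, which this thread does not show. The sketch below shows one way to check a call against a signature without running the function. The stand-in `from_response_async` and its defaults are assumptions made for illustration (the parameter names come from the docstring quoted above); the real definition in `instructor/function_calls.py` may differ.

```python
import inspect


# Hypothetical stand-in: parameter names follow the quoted docstring,
# but the default values are assumed, not taken from the library.
async def from_response_async(completion, validation_context=None, strict=None, mode="functions"):
    return completion


def call_matches(func, *args, **kwargs) -> bool:
    """Return True if `func` could be called with these arguments."""
    try:
        inspect.signature(func).bind(*args, **kwargs)
    except TypeError:
        return False
    return True


one_arg_ok = call_matches(from_response_async, {"id": 1})
stale_kwarg_ok = call_matches(from_response_async, {"id": 1}, throw_error=False)
print(one_arg_ok, stale_kwarg_ok)  # True False
```

Under these assumed defaults, the one-argument call in the test would be valid, and only a call that still passes `throw_error` would fail.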

Analysis chain

The removal of the throw_error parameter from the from_response_async function signature is a breaking change. Ensure that all calls to this function have been updated to reflect this change, and that error handling is correctly implemented where needed.

Based on the output, it seems there are no calls to from_response_async with five arguments, which suggests that the throw_error parameter has been removed from all calls. However, to be thorough, I will now check for any calls to from_response_async with any number of arguments to confirm that it is being used correctly with the new signature.

Given that the ast-grep searches have not produced any results, I will now use a text-based search tool to look for the string pattern of the function usage. This will help to confirm whether the function is being called with the correct number of arguments in the codebase.

The rg search has identified two calls to from_response_async in tests/test_function_calls.py and instructor/patch.py, in addition to the function definition itself. To verify that these calls have been updated to match the new function signature, I will now extract the actual lines of code to ensure that the correct number of arguments is being passed.

The context provided shows partial calls to from_response_async in tests/test_function_calls.py and instructor/patch.py. However, to confirm that these calls have been correctly updated to match the new function signature, I need to see the full lines of code for each call. I will now extract the full lines to count the arguments.

The full lines of code extracted show that in tests/test_function_calls.py, from_response_async is called with a single argument, which does not match the expected four arguments. This could be an issue unless the test_model class defines a different from_response_async method that accepts one argument. In instructor/patch.py, the function is called with three arguments, and there might be a fourth argument that is not shown.

To complete the verification, I will now check for the presence of a fourth argument in the call within instructor/patch.py and also verify if the test_model class in tests/test_function_calls.py has a different from_response_async method.

The verification of the call in instructor/patch.py confirms that it has been correctly updated to match the new function signature with four arguments, which is consistent with the changes made to from_response_async.

However, the call in tests/test_function_calls.py shows from_response_async being used with a single argument. Without additional context about the test_model class, it's not clear whether this is an issue or if test_model has its own from_response_async method that accepts one argument.

To resolve this, further inspection of the test_model class definition is needed to determine if there is a discrepancy.

The search for the test_model class definition in tests/test_function_calls.py did not yield any results, which suggests that the class may not be defined in that file or that the search pattern was not accurate. To ensure a thorough verification, I will now perform a broader search to match any class definition in the file, which might include test_model.

The search results have identified a TestModel class that inherits from OpenAISchema with attributes name and data. However, there is no indication of a from_response_async method within the provided context. To confirm whether the TestModel class has its own from_response_async method or if the single-argument call is incorrect, I need to search the entire file for any definition of this method within the TestModel class.

Scripts Executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Search for usages of `from_response_async` to verify that the `throw_error` parameter has been removed and error handling is adjusted.
ast-grep --lang python --pattern $'from_response_async($_, $_, $_, $_, $_)'
Length of output: 76

Script:
#!/bin/bash
# Search for any usages of `from_response_async` to verify that it is being called with the correct number of arguments.
ast-grep --lang python --pattern $'from_response_async($$$)'
Length of output: 61

Script:
#!/bin/bash
# Use ripgrep to search for the string pattern of `from_response_async` to confirm its usage.
rg 'from_response_async\('
Length of output: 279

Script:
#!/bin/bash
# Extract the actual lines of code where `from_response_async` is called to verify the number of arguments passed.
rg 'from_response_async\(' --context 2
Length of output: 966

Script:
#!/bin/bash
# Extract the full lines of code where `from_response_async` is called to verify the number of arguments passed.
rg 'from_response_async\(' -A 3 -B 1
Length of output: 938

Script:
#!/bin/bash
# Verify the full call in `instructor/patch.py` and check if `test_model` in `tests/test_function_calls.py` has a different `from_response_async` method.
rg 'from_response_async\(' -A 4 -B 1 instructor/patch.py
rg 'from_response_async' tests/test_function_calls.py -A 1 -B 2
Length of output: 610

Script:
#!/bin/bash
# Inspect the `test_model` class definition to see if it has its own `from_response_async` method.
rg 'class .*test_model' tests/test_function_calls.py -A 10
Length of output: 60

Script:
#!/bin/bash
# Perform a broader search to match any class definition in the `tests/test_function_calls.py` file.
rg 'class ' tests/test_function_calls.py -A 10
Length of output: 1032

Script:
#!/bin/bash
# Search within the entire `tests/test_function_calls.py` file for any definition of `from_response_async` within the `TestModel` class.
rg 'from_response_async' tests/test_function_calls.py
Length of output: 133

With the throw_error parameter removed, it's important to confirm that the function's error handling logic is still correctly implemented and that any necessary adjustments have been made to handle errors that were previously managed by the throw_error functionality.

bloggg

6003aa5

jxnl added 2 commits December 16, 2023 21:50

format

9fd97d8

formatting

744240c

jxnl marked this pull request as ready for review December 17, 2023 03:00

Merge branch 'main' into open-source-blog

1ebc9eb

jxnl merged commit 9cf9e53 into instructor-ai:main Dec 17, 2023
6 checks passed

jxnl changed the title ~~Third party models~~ blog: Third party models Dec 17, 2023

coderabbitai bot reviewed Dec 17, 2023
