Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: update and allow strict mode #618

Merged
merged 1 commit into from
Apr 28, 2024
Merged

feat: update and allow strict mode #618

merged 1 commit into from
Apr 28, 2024

Conversation

jxnl
Copy link
Collaborator

@jxnl jxnl commented Apr 21, 2024

addresses #612


🚀 This description was created by Ellipsis for commit 291e3e5

Summary:

The pull request introduces a new strict parameter, defaulting to True, to several methods in the Instructor and AsyncInstructor classes and to the new_create_async and new_create_sync functions.

Key points:

  • Added a strict parameter to several methods in the Instructor and AsyncInstructor classes in instructor/client.py.
  • Added a strict parameter to the new_create_async and new_create_sync functions in instructor/patch.py.
  • The strict parameter is a boolean that defaults to True.

Generated with ❤️ by ellipsis.dev

@dosubot dosubot bot added size:S This PR changes 10-29 lines, ignoring generated files. enhancement New feature or request labels Apr 21, 2024
Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❌ Changes requested.

  • Reviewed the entire pull request up to 291e3e5
  • Looked at 170 lines of code in 2 files
  • Took 35 seconds to review
More info
  • Skipped 0 files when reviewing.
  • Skipped posting 0 additional comments because they didn't meet confidence threshold of 85%.

Workflow ID: wflow_DhD3Vg3aHOAfDbEP


Want Ellipsis to fix these issues? Tag @ellipsis-dev in a comment. We'll respond in a few minutes. Learn more here.

@@ -67,6 +67,7 @@ def create(
messages: List[ChatCompletionMessageParam],
max_retries: int = 3,
validation_context: dict | None = None,
strict: bool = True,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new parameter 'strict' is not documented. Please add a comment or update the function docstring to explain what this parameter does and how it should be used.

@@ -116,6 +116,7 @@ async def new_create_async(
response_model: Type[T_Model] = None,
validation_context: dict = None,
max_retries: int = 1,
strict: bool = True,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new parameter 'strict' is not documented. Please add a comment or update the function docstring to explain what this parameter does and how it should be used. This comment is also applicable to other instances of the 'strict' parameter in this file.

Copy link

Deploying instructor with  Cloudflare Pages  Cloudflare Pages

Latest commit: 291e3e5
Status: ✅  Deploy successful!
Preview URL: https://6128e958.instructor.pages.dev
Branch Preview URL: https://allow-strict-in-create.instructor.pages.dev

View logs

@voberoi
Copy link
Contributor

voberoi commented Apr 22, 2024

I think this is a good change and should be merged but it doesn't fulfill my intent behind #612.

#612 is about allowing control characters in JSON strings because this happens so commonly with Claude's models.

Pydantic's model_validate_json(..., strict=False) does not allow control characters in strings, but does all this which might be desirable to clients in some cases.

The standard library's json.loads(... strict=False) does one thing: it allows control characters in JSON strings, which is what I want in #612.

If you want to merge these non-strict semantics, the change looks like this for the JSON-parsing functions in function_calls.py:

    @classmethod
    def parse_anthropic_json(
        cls: Type[BaseModel],
        completion,
        validation_context: Optional[Dict[str, Any]] = None,
        strict: Optional[bool] = None,
    ) -> BaseModel:
        from anthropic.types import Message

        assert isinstance(completion, Message)

        text = completion.content[0].text
        extra_text = extract_json_from_codeblock(text)

        if strict:
            return cls.model_validate_json(
                extra_text, context=validation_context, strict=strict
            )
        else:
            # Allow control characters.
            parsed = json.loads(extra_text, strict=False)
            # Pydantic non-strict: https://docs.pydantic.dev/latest/concepts/strict_mode/
            return cls.model_validate(parsed, context=validation_context, strict=strict)

Maybe you don't want to merge these semantics in instructor's strict, in which case there would need to be two separate arguments to toggle these different capabilities.

If this is functionality you want in instructor I'm happy to submit a PR subject to however you want to design this.

@voberoi
Copy link
Contributor

voberoi commented Apr 22, 2024

This functionality was made possible at some point, not sure when it was removed: #75

@jxnl
Copy link
Collaborator Author

jxnl commented Apr 28, 2024

I think this is a good change and should be merged but it doesn't fulfill my intent behind #612.

#612 is about allowing control characters in JSON strings because this happens so commonly with Claude's models.

Pydantic's model_validate_json(..., strict=False) does not allow control characters in strings, but does all this which might be desirable to clients in some cases.

The standard library's json.loads(... strict=False) does one thing: it allows control characters in JSON strings, which is what I want in #612.

If you want to merge these non-strict semantics, the change looks like this for the JSON-parsing functions in function_calls.py:

    @classmethod
    def parse_anthropic_json(
        cls: Type[BaseModel],
        completion,
        validation_context: Optional[Dict[str, Any]] = None,
        strict: Optional[bool] = None,
    ) -> BaseModel:
        from anthropic.types import Message

        assert isinstance(completion, Message)

        text = completion.content[0].text
        extra_text = extract_json_from_codeblock(text)

        if strict:
            return cls.model_validate_json(
                extra_text, context=validation_context, strict=strict
            )
        else:
            # Allow control characters.
            parsed = json.loads(extra_text, strict=False)
            # Pydantic non-strict: https://docs.pydantic.dev/latest/concepts/strict_mode/
            return cls.model_validate(parsed, context=validation_context, strict=strict)

Maybe you don't want to merge these semantics in instructor's strict, in which case there would need to be two separate arguments to toggle these different capabilities.

If this is functionality you want in instructor I'm happy to submit a PR subject to however you want to design this.

lets allow this too, I'll merge this first. sorry for delay was on vacation!

@jxnl jxnl merged commit 38bd2d9 into main Apr 28, 2024
7 of 13 checks passed
@jxnl jxnl deleted the allow-strict-in-create branch April 28, 2024 00:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request size:S This PR changes 10-29 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants