Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Refactor reasking logic #1071

Merged
merged 13 commits into from
Oct 12, 2024
Merged

fix: Refactor reasking logic #1071

merged 13 commits into from
Oct 12, 2024

Conversation

jxnl
Copy link
Collaborator

@jxnl jxnl commented Oct 12, 2024

Important

Refactor reasking and retry logic by introducing reask.py for mode-specific handling and updating retry.py for improved exception handling and modularity.

  • Reask Logic:
    • Introduced reask.py with functions like reask_anthropic_tools, reask_cohere_tools, reask_gemini_tools for mode-specific reasking.
    • Added handle_reask_kwargs() to select appropriate reask function based on mode.
  • Retry Logic:
    • Refactored retry_sync() and retry_async() in retry.py to use handle_reask_kwargs() for exception handling.
    • Added initialize_retrying() and initialize_usage() to set up retrying and usage tracking.
  • Exceptions:
    • Updated InstructorRetryException in exceptions.py to include create_kwargs.
  • Misc:
    • Minor changes in patch.py and templating.py for logging and templating adjustments.
    • Fixed JSON extraction logic in iterable.py and partial.py.

This description was created by Ellipsis for 28f7513. It will automatically update as commits are pushed.

@jxnl jxnl requested a review from ivanleomk October 12, 2024 02:59
Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me! Reviewed everything up to d56bc5e in 27 seconds

More details
  • Looked at 800 lines of code in 3 files
  • Skipped 0 files when reviewing.
  • Skipped posting 10 drafted comments based on config settings.
1. instructor/reask.py:24
  • Draft comment:
    Unnecessary copying of kwargs. Consider modifying kwargs directly unless there's a specific reason to keep the original unchanged. This comment applies to other similar instances in this file.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The code in reask.py and retry.py has several instances where kwargs is copied unnecessarily. This can lead to performance issues due to unnecessary memory usage.
2. instructor/reask.py:71
  • Draft comment:
    Unnecessary copying of kwargs. Consider modifying kwargs directly unless there's a specific reason to keep the original unchanged. This comment applies to other similar instances in this file.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The code in reask.py and retry.py has several instances where kwargs is copied unnecessarily. This can lead to performance issues due to unnecessary memory usage.
3. instructor/reask.py:153
  • Draft comment:
    Unnecessary copying of kwargs. Consider modifying kwargs directly unless there's a specific reason to keep the original unchanged. This comment applies to other similar instances in this file.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The code in reask.py and retry.py has several instances where kwargs is copied unnecessarily. This can lead to performance issues due to unnecessary memory usage.
4. instructor/reask.py:169
  • Draft comment:
    Unnecessary copying of kwargs. Consider modifying kwargs directly unless there's a specific reason to keep the original unchanged. This comment applies to other similar instances in this file.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The code in reask.py and retry.py has several instances where kwargs is copied unnecessarily. This can lead to performance issues due to unnecessary memory usage.
5. instructor/reask.py:192
  • Draft comment:
    Unnecessary copying of kwargs. Consider modifying kwargs directly unless there's a specific reason to keep the original unchanged. This comment applies to other similar instances in this file.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The code in reask.py and retry.py has several instances where kwargs is copied unnecessarily. This can lead to performance issues due to unnecessary memory usage.
6. instructor/reask.py:214
  • Draft comment:
    Unnecessary copying of kwargs. Consider modifying kwargs directly unless there's a specific reason to keep the original unchanged. This comment applies to other similar instances in this file.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The code in reask.py and retry.py has several instances where kwargs is copied unnecessarily. This can lead to performance issues due to unnecessary memory usage.
7. instructor/reask.py:236
  • Draft comment:
    Unnecessary copying of kwargs. Consider modifying kwargs directly unless there's a specific reason to keep the original unchanged. This comment applies to other similar instances in this file.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The code in reask.py and retry.py has several instances where kwargs is copied unnecessarily. This can lead to performance issues due to unnecessary memory usage.
8. instructor/reask.py:253
  • Draft comment:
    Unnecessary copying of kwargs. Consider modifying kwargs directly unless there's a specific reason to keep the original unchanged. This comment applies to other similar instances in this file.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The code in reask.py and retry.py has several instances where kwargs is copied unnecessarily. This can lead to performance issues due to unnecessary memory usage.
9. instructor/reask.py:27
  • Draft comment:
    Assertions should always have an error message. Please provide a descriptive error message for this assertion.
assert isinstance(response, Message), "Response must be a valid Anthropic Message instance."
  • Reason this comment was not posted:
    Confidence changes required: 80%
    The code has multiple instances where assertions lack error messages. Assertions should always have a clear error message to aid in debugging.
10. instructor/reask.py:74
  • Draft comment:
    Assertions should always have an error message. Please provide a descriptive error message for this assertion.
assert isinstance(response, Message), "Response must be a valid Anthropic Message instance."
  • Reason this comment was not posted:
    Confidence changes required: 80%
    The code has multiple instances where assertions lack error messages. Assertions should always have a clear error message to aid in debugging.

Workflow ID: wflow_1PWK3csLQP2DGuCQ


You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me! Incremental review on 15a5cb8 in 15 seconds

More details
  • Looked at 60 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 3 drafted comments based on config settings.
1. instructor/retry.py:99
  • Draft comment:
    Good use of Python 3.10+ syntax with type[T_Model] | None. Ensure this is consistent across the codebase.
  • Reason this comment was not posted:
    Confidence changes required: 20%
    The change from Optional[Type] to Type | None is a good update for Python 3.10+ syntax, but it should be consistent across the file.
2. instructor/retry.py:178
  • Draft comment:
    Good use of Python 3.10+ syntax with type[T_Model] | None. Ensure this is consistent across the codebase.
  • Reason this comment was not posted:
    Confidence changes required: 20%
    The change from Optional[Type] to Type | None is a good update for Python 3.10+ syntax, but it should be consistent across the file.
3. instructor/retry.py:97
  • Draft comment:
    Function names retry_sync and retry_async should follow a consistent naming pattern, such as initialize_retrying and initialize_usage. Consider renaming them to initialize_retry_sync and initialize_retry_async.
  • Reason this comment was not posted:
    Confidence changes required: 80%
    The function names retry_sync and retry_async are not following a consistent naming pattern with other functions in the file, such as initialize_retrying and initialize_usage.

Workflow ID: wflow_MJAqbw421Q7boybS


You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me! Incremental review on 1f3e3fe in 12 seconds

More details
  • Looked at 15 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 2 drafted comments based on config settings.
1. instructor/reask.py:88
  • Draft comment:
    The attempt_number parameter was moved to the end and given a default value. Ensure that this change does not affect any existing calls to reask_cohere_tools that might have used positional arguments.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The attempt_number parameter in reask_cohere_tools is set to a default value of 1, which is a good practice for optional parameters. However, the placement of this parameter was changed, which might affect the function's usage if it was called with positional arguments.
2. instructor/reask.py:84
  • Draft comment:
    Function names should follow a consistent pattern. Consider renaming functions to maintain consistency, such as reask_<platform>_<type> or reask_<platform>. This applies to other functions in this file as well.
  • Reason this comment was not posted:
    Confidence changes required: 80%
    The function names in this file are inconsistent. Some use the pattern reask_<platform>_<type> while others use reask_<platform>. This inconsistency can lead to confusion.

Workflow ID: wflow_xr3WKTLZVoLC94HA


You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

Copy link

cloudflare-workers-and-pages bot commented Oct 12, 2024

Deploying instructor-py with  Cloudflare Pages  Cloudflare Pages

Latest commit: 28f7513
Status: ✅  Deploy successful!
Preview URL: https://3e4407d0.instructor-py.pages.dev
Branch Preview URL: https://refactor-reask.instructor-py.pages.dev

View logs

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me! Incremental review on 8f789a4 in 18 seconds

More details
  • Looked at 68 lines of code in 2 files
  • Skipped 0 files when reviewing.
  • Skipped posting 5 drafted comments based on config settings.
1. instructor/retry.py:102
  • Draft comment:
    The addition of the context parameter is consistent with the refactoring goals. Ensure that process_response and process_response_async handle this parameter correctly.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The PR introduces a new parameter context in the retry_sync and retry_async functions, which is used in the process_response and process_response_async functions. This change aligns with the refactoring described in the PR description.
2. instructor/retry.py:155
  • Draft comment:
    Ensure handle_reask_kwargs is correctly handling the new context parameter if applicable. This is crucial for the refactored reasking logic.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The handle_reask_kwargs function is used in both retry_sync and retry_async functions. The refactoring seems to focus on improving exception handling and modularity, which is consistent with the use of this function.
3. instructor/retry.py:228
  • Draft comment:
    The addition of the context parameter in retry_async is consistent with the refactoring goals. Ensure that process_response_async handles this parameter correctly.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The retry_sync and retry_async functions have been refactored to include the context parameter, which is used in the process_response and process_response_async functions. This aligns with the refactoring goals mentioned in the PR description.
4. instructor/retry.py:179
  • Draft comment:
    The error message in the InstructorRetryException should be formatted for clarity. Consider using a consistent format for error messages throughout the codebase.
  • Reason this comment was not posted:
    Confidence changes required: 80%
    The error messages in assertions should be formatted well for clarity and consistency.
5. instructor/retry.py:225
  • Draft comment:
    The error message in the InstructorRetryException should be formatted for clarity. Consider using a consistent format for error messages throughout the codebase.
  • Reason this comment was not posted:
    Confidence changes required: 80%
    The error messages in assertions should be formatted well for clarity and consistency.

Workflow ID: wflow_3WHaseC8h2heyY7f


You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me! Incremental review on d2c2f06 in 12 seconds

More details
  • Looked at 10 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 2 drafted comments based on config settings.
1. instructor/reask.py:288
  • Draft comment:
    Copying kwargs is unnecessary if not modified outside the function. Consider removing kwargs = kwargs.copy() for performance improvement. This applies to other functions as well.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The code has a consistent pattern of copying the kwargs dictionary at the start of each function. This is unnecessary if the kwargs are not modified in a way that affects the original dictionary outside the function. It can be removed to improve performance.
2. instructor/reask.py:288
  • Draft comment:
    Assertions should always have an error message that is formatted well. Ensure that all assertions in the codebase, including those in other functions, have descriptive error messages.
  • Reason this comment was not posted:
    Confidence changes required: 80%
    The code has multiple instances of assertions without error messages.

Workflow ID: wflow_m6pkTtz2erkQZTdh


You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me! Incremental review on d8d521f in 21 seconds

More details
  • Looked at 36 lines of code in 2 files
  • Skipped 0 files when reviewing.
  • Skipped posting 4 drafted comments based on config settings.
1. instructor/reask.py:92
  • Draft comment:
    Consider using kwargs.setdefault("chat_history", []).append(...) to simplify the code and avoid unnecessary reassignment.
  • Reason this comment was not posted:
    Confidence changes required: 80%
    The code in reask.py and templating.py has some issues that need to be addressed. In reask.py, the reask_cohere_tools function has a potential issue with handling chat_history. In templating.py, the handle_templating function has a logical error in handling new_kwargs when it is a list.
2. instructor/templating.py:91
  • Draft comment:
    Return an empty dictionary instead of None to maintain consistency in return type.
  • Reason this comment was not posted:
    Confidence changes required: 80%
    The handle_templating function in templating.py has a logical error when new_kwargs is a list. The function should return an empty dictionary instead of None to maintain consistency in return type.
3. instructor/templating.py:83
  • Draft comment:
    Function names should follow a consistent naming pattern across the codebase. Consider using snake_case for all function names for consistency.
  • Reason this comment was not posted:
    Confidence changes required: 80%
    The function reask_cohere_tools in reask.py and handle_templating in templating.py have inconsistent naming conventions. The former uses snake_case while the latter uses camelCase. Consistent naming conventions should be followed across the codebase.
4. instructor/reask.py:91
  • Draft comment:
    Good refactoring to avoid repetition by using get and append methods for chat_history. This adheres to the DRY principle.
  • Reason this comment was not posted:
    Confidence changes required: 0%
    The code in reask_cohere_tools was refactored to avoid repetition by using get and append methods. This change adheres to the DRY principle.

Workflow ID: wflow_HivP1NIYTqEGVJ7f


You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❌ Changes requested. Incremental review on f7a6659 in 23 seconds

More details
  • Looked at 60 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 3 drafted comments based on config settings.
1. instructor/templating.py:104
  • Draft comment:
    The handle_templating function should return a consistent type. Currently, it may return None if messages is not found, which can lead to unexpected behavior. Consider returning new_kwargs instead.
  • Reason this comment was not posted:
    Confidence changes required: 80%
    The handle_templating function should return a consistent type. Currently, it may return None in some cases, which can lead to unexpected behavior.
2. instructor/templating.py:13
  • Draft comment:
    The process_message function returns different types based on conditions (e.g., gm.Content or None). Consider having a consistent return type or document the behavior clearly.
  • Reason this comment was not posted:
    Confidence changes required: 80%
    The function process_message is returning different types based on conditions, which can be confusing. It should have a consistent return type or be documented if it intentionally returns different types.
3. instructor/templating.py:112
  • Draft comment:
    The process_message function is used here, but its return value is not consistently utilized across different usages. Ensure consistent handling of its return value or document the intended behavior.
  • Reason this comment was not posted:
    Confidence changes required: 80%
    The process_message function is used in multiple places, and its return value is being used in some cases but not others. This inconsistency should be addressed or documented.

Workflow ID: wflow_7dlNMj6YYMRMWGWO


Want Ellipsis to fix these issues? Tag @ellipsis-dev in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

@@ -48,7 +53,7 @@ def process_message(message: dict[str, Any], context: dict[str, Any]) -> None:
apply_template(part, context) if isinstance(part, str) else part
for part in message["parts"]
]
return
return message
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function process_message is expected to return None, but it returns a value in some cases (e.g., for Gemini format). This inconsistency can lead to unexpected behavior. Consider standardizing the return type.

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me! Incremental review on 6f4f069 in 15 seconds

More details
  • Looked at 27 lines of code in 2 files
  • Skipped 0 files when reviewing.
  • Skipped posting 3 drafted comments based on config settings.
1. instructor/dsl/iterable.py:95
  • Draft comment:
    Avoid using # type:ignore without a clear reason. Consider addressing the underlying type issue instead.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The type:ignore comment is used to suppress type checking errors, but it should be used sparingly and with a clear reason. In this case, it seems to be used to ignore a potential type mismatch, but it's not clear why it's necessary.
2. instructor/dsl/partial.py:166
  • Draft comment:
    Avoid using # type:ignore without a clear reason. Consider addressing the underlying type issue instead.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The type:ignore comment is used to suppress type checking errors, but it should be used sparingly and with a clear reason. In this case, it seems to be used to ignore a potential type mismatch, but it's not clear why it's necessary.
3. instructor/dsl/iterable.py:93
  • Draft comment:
    Add a comment explaining why # type:ignore is necessary here and in partial.py on line 166 to improve code clarity.
  • Reason this comment was not posted:
    Confidence changes required: 60%
    The # type:ignore comments are used without explanation, which can be confusing for future developers.

Workflow ID: wflow_P6ZhQft4WUHu24Tp


You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

@ivanleomk
Copy link
Collaborator

ivanleomk commented Oct 12, 2024

Vertex AI Tests

tests/llm/test_vertexai/test_format.py ....                                                                                        [ 14%]
tests/llm/test_vertexai/test_message_parser.py ....                                                                                [ 28%]
tests/llm/test_vertexai/test_modes.py ......                                                                                       [ 50%]
tests/llm/test_vertexai/test_retries.py ....                                                                                       [ 64%]
tests/llm/test_vertexai/test_simple_types.py ......                                                                                [ 85%]
tests/llm/test_vertexai/test_stream.py ....                                                                                        [100%]

Gemini Tests

tests/llm/test_gemini/evals/test_classification_enums.py ..........                                                                [ 15%]
tests/llm/test_gemini/evals/test_classification_literals.py ..........                                                             [ 31%]
tests/llm/test_gemini/evals/test_entities.py ..                                                                                    [ 34%]
tests/llm/test_gemini/evals/test_extract_users.py ......                                                                           [ 44%]
tests/llm/test_gemini/evals/test_sentiment_analysis.py ......                                                                      [ 53%]
tests/llm/test_gemini/test_format.py ....                                                                                          [ 60%]
tests/llm/test_gemini/test_list_content.py .                                                                                       [ 61%]
tests/llm/test_gemini/test_modes.py ....                                                                                           [ 68%]
tests/llm/test_gemini/test_multimodal_content.py ..                                                                                [ 71%]
tests/llm/test_gemini/test_patch.py ....                                                                                           [ 77%]
tests/llm/test_gemini/test_retries.py ....                                                                                         [ 84%]
tests/llm/test_gemini/test_roles.py .                                                                                              [ 85%]
tests/llm/test_gemini/test_simple_types.py ...                                                                                     [ 90%]
tests/llm/test_gemini/test_stream.py ......                                                                                        [100%]

Cerebras Tests

tests/llm/test_cerebras/modes.py ...........                                                                                       [100%]

Cohere Tests

tests/llm/test_cohere/test_json_schema.py ..........                             [ 62%]
tests/llm/test_cohere/test_none_response.py ..                                   [ 75%]
tests/llm/test_cohere/test_retries.py ....                                       [100%

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me! Incremental review on 28f7513 in 13 seconds

More details
  • Looked at 40 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 2 drafted comments based on config settings.
1. instructor/reask.py:278
  • Draft comment:
    Remove redundant comment for clarity.
  • Reason this comment was not posted:
    Confidence changes required: 10%
    The comment on line 278 is redundant and can be removed for clarity.
2. instructor/reask.py:278
  • Draft comment:
    The comment can be more concise:
        Mode.COHERE_JSON_SCHEMA: reask_cohere_tools,  # Same function
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The comment on line 278 is not concise and can be improved by removing unnecessary words.

Workflow ID: wflow_weNtQAQRc4Yfn0dC


You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

@ivanleomk ivanleomk merged commit 8ee9c5d into main Oct 12, 2024
14 of 15 checks passed
@ivanleomk ivanleomk deleted the refactor-reask branch October 12, 2024 16:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants