fix(agents-api): refactor get and list executions queries to not use latest_executions #1188

Merged
1 commit merged on Feb 26, 2025

Conversation

Ahmad-mtos
Contributor

@Ahmad-mtos Ahmad-mtos commented Feb 26, 2025

PR Type

Bug fix, Enhancement


Description

  • Refactored get_execution_query to replace latest_executions with a new query structure.

  • Updated list_executions_query to remove reliance on latest_executions and improve query logic.

  • Introduced detailed status and error handling in both queries.

  • Enhanced query structure with better joins and ordering mechanisms.


Changes walkthrough 📝

Relevant files
Enhancement
get_execution.py
Refactor `get_execution_query` to remove `latest_executions`

agents-api/agents_api/queries/executions/get_execution.py

  • Replaced latest_executions with a new query structure.
  • Added detailed status and error handling logic.
  • Introduced a LEFT JOIN with transitions table for enhanced data
    retrieval.
  • Improved ordering by created_at and updated_at.
    +37/-3

list_executions.py
Refactor `list_executions_query` to remove `latest_executions`

agents-api/agents_api/queries/executions/list_executions.py

  • Replaced latest_executions with a new query structure.
  • Added a lateral join for fetching latest transitions.
  • Simplified ordering logic and removed unused sort options.
  • Enhanced query to include detailed status and error handling.
    +51/-9
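
The lateral join for the latest transition amounts to a per-execution "pick the newest row" step. A minimal Python sketch of that selection follows, assuming transitions are plain dicts with `execution_id`, `type`, and `created_at` keys. The function name and the row shape are hypothetical and are not taken from the project's code.

```python
from typing import Any


def latest_transition_per_execution(
    transitions: list[dict[str, Any]],
) -> dict[str, dict[str, Any]]:
    """Return the newest transition for each execution_id (illustrative only)."""
    latest: dict[str, dict[str, Any]] = {}
    for t in transitions:
        current = latest.get(t["execution_id"])
        # Keep the row with the greatest created_at per execution.
        if current is None or t["created_at"] > current["created_at"]:
            latest[t["execution_id"]] = t
    return latest


# Hypothetical sample rows; created_at is an integer here for brevity.
rows = [
    {"execution_id": "a", "type": "init", "created_at": 1},
    {"execution_id": "a", "type": "step", "created_at": 2},
    {"execution_id": "b", "type": "init", "created_at": 5},
]
result = latest_transition_per_execution(rows)
```

An execution with no transitions simply has no entry in the result, which corresponds to the NULL row a LEFT JOIN would produce.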

  • EntelligenceAI PR Summary

    Purpose:

    • Enhance the accuracy and detail of execution data retrieval.

    Changes:

    • Enhancement: Updated get_execution query to include additional fields (developer_id, task_id, task_version) and improved status mapping.
    • Enhancement: Modified list_executions query to join with the latest transition data, offering a detailed execution state snapshot and adjusted ordering by created_at.

    Impact:

    • Provides a comprehensive view of execution states, facilitates task tracking, and improves execution data retrieval performance.


    PR Reviewer Guide 🔍

    Here are some key observations to aid the review process:

    ⏱️ Estimated effort to review: 4 🔵🔵🔵🔵⚪
    🧪 No relevant tests
    🔒 Security concerns

    SQL Injection:
    While the code uses parameterized queries ($1, $2, etc.), which is good practice, the ORDER BY clause in list_executions_query uses the $3 value directly in the CASE statement without validation, potentially allowing SQL injection if input validation is not properly handled elsewhere.
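
A bind parameter such as $3 is sent to the database as a value and is only compared against string literals in the CASE expression, so the more likely practical problem is an unexpected value rather than injected SQL. Either way, validating the direction before binding is cheap. The sketch below assumes a hypothetical helper named `normalize_direction`; it is not part of the PR.

```python
# Hypothetical allowlist; the PR itself only compares $3 against 'asc' and 'desc'.
ALLOWED_DIRECTIONS = {"asc", "desc"}


def normalize_direction(direction: str) -> str:
    """Return a lowercase direction, rejecting anything outside the allowlist."""
    value = direction.strip().lower()
    if value not in ALLOWED_DIRECTIONS:
        raise ValueError(f"direction must be one of {sorted(ALLOWED_DIRECTIONS)}")
    return value
```

Calling `normalize_direction(" ASC ")` yields `"asc"`, while any other string raises `ValueError` before the query is built.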

    ⚡ Recommended focus areas for review

    Incomplete Implementation

    The commented out sorting by updated_at and FIXME comments indicate incomplete functionality that needs to be properly implemented or removed

    # FIXME: order by updated_at as well
    # FIXME: return to latest_executions view once latest_transitions is fixed
    list_executions_query = """
    Parameter Mismatch

    The query parameters ($3, $4, $5) don't match with the commented out sort_by parameter in the function arguments, which could cause parameter binding issues

        CASE WHEN $3 = 'asc' THEN e.created_at END ASC NULLS LAST,
        CASE WHEN $3 = 'desc' THEN e.created_at END DESC NULLS LAST
        -- CASE WHEN $3 = 'updated_at' AND $4 = 'asc' THEN e.updated_at END ASC NULLS LAST,
        -- CASE WHEN $3 = 'updated_at' AND $4 = 'desc' THEN e.updated_at END DESC NULLS LAST
    LIMIT $4 OFFSET $5;
    Status Handling

    The status CASE statement contains complex logic that might benefit from being moved to a separate function or view for better maintainability and reuse

    CASE
        WHEN lt.type::text IS NULL THEN 'queued'
        WHEN lt.type::text = 'init' THEN 'starting'
        WHEN lt.type::text = 'init_branch' THEN 'running'
        WHEN lt.type::text = 'wait' THEN 'awaiting_input'
        WHEN lt.type::text = 'resume' THEN 'running'
        WHEN lt.type::text = 'step' THEN 'running'
        WHEN lt.type::text = 'finish' THEN 'succeeded'
        WHEN lt.type::text = 'finish_branch' THEN 'running'
        WHEN lt.type::text = 'error' THEN 'failed'
        WHEN lt.type::text = 'cancelled' THEN 'cancelled'
        ELSE 'queued'
    END AS status,
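
For readers who want to check the mapping outside SQL, the CASE expression quoted above can be mirrored as a lookup table. This is only a sketch: `TRANSITION_TO_STATUS` and `status_for` are hypothetical names, and the project may keep this logic in SQL only.

```python
from typing import Optional

# Mirrors the quoted CASE expression: latest transition type -> execution status.
TRANSITION_TO_STATUS = {
    "init": "starting",
    "init_branch": "running",
    "wait": "awaiting_input",
    "resume": "running",
    "step": "running",
    "finish": "succeeded",
    "finish_branch": "running",
    "error": "failed",
    "cancelled": "cancelled",
}


def status_for(transition_type: Optional[str]) -> str:
    """Map the latest transition type to a status; None or unknown maps to 'queued'."""
    if transition_type is None:
        return "queued"
    return TRANSITION_TO_STATUS.get(transition_type, "queued")
```

The `None` branch corresponds to `lt.type IS NULL` (no transitions yet), and the dictionary default corresponds to the `ELSE 'queued'` arm.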


    PR Code Suggestions ✨

    Explore these optional code suggestions:

    Category | Suggestion | Impact
    Possible issue
    Fix mismatched SQL parameter order

    The query parameters in the function don't match the SQL query's parameter
    order. The SQL expects $3 for sorting direction, but the code passes sort_by
    (which is commented out) before direction.

    agents-api/agents_api/queries/executions/list_executions.py [124-133]

     return (
         list_executions_query,
         [
             developer_id,
             task_id,
    -        # sort_by,
             direction,
             limit,
             offset,
         ],
    Suggestion importance[1-10]: 9

    Why: The suggestion correctly identifies a critical parameter mismatch between SQL query and function arguments that could cause runtime errors or incorrect query execution. The fix properly aligns the parameters with the SQL query's expectations.

    High
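
One way to catch this kind of mismatch early is to compare the highest `$n` placeholder in the SQL text with the number of bound values. The sketch below uses an abridged, hypothetical query (the `SELECT * FROM executions e` part is a stand-in, not the PR's full statement) and a hypothetical helper `max_placeholder`.

```python
import re


def max_placeholder(query: str) -> int:
    """Return the highest $n placeholder index used in the SQL text."""
    indexes = [int(n) for n in re.findall(r"\$(\d+)", query)]
    return max(indexes, default=0)


# Abridged stand-in for list_executions_query; only the placeholders matter here.
query = """
SELECT * FROM executions e
WHERE e.developer_id = $1 AND e.task_id = $2
ORDER BY
    CASE WHEN $3 = 'asc' THEN e.created_at END ASC NULLS LAST,
    CASE WHEN $3 = 'desc' THEN e.created_at END DESC NULLS LAST
LIMIT $4 OFFSET $5;
"""

# Positional order: developer_id, task_id, direction, limit, offset (sample values).
params = ["dev-id", "task-id", "desc", 10, 0]
placeholders = max_placeholder(query)
```

With `sort_by` removed from the list, five values line up with `$1` through `$5`. Note that a regex check like this would also count placeholders inside SQL comments, so commented-out clauses should be stripped first.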
    General
    Improve SQL sort logic

    The ORDER BY clause uses $3 for both ASC and DESC conditions which could lead to
    incorrect sorting when direction is neither 'asc' nor 'desc'.

    agents-api/agents_api/queries/executions/list_executions.py [61-63]

     ORDER BY
    -    CASE WHEN $3 = 'asc' THEN e.created_at END ASC NULLS LAST,
    -    CASE WHEN $3 = 'desc' THEN e.created_at END DESC NULLS LAST
    +    e.created_at CASE WHEN $3 = 'asc' THEN ASC ELSE DESC END NULLS LAST
    Suggestion importance[1-10]: 7

    Why: The suggestion offers a more robust and cleaner way to handle sorting direction, preventing potential issues when direction parameter has unexpected values. The improved version is more maintainable and less prone to errors.

    Medium
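
Note that `ASC` and `DESC` are keywords in the ORDER BY grammar rather than values, so a CASE expression cannot produce them, and the `e.created_at CASE ... END` form suggested above would not parse. A possible alternative that still defaults to descending order is to keep two CASE sort keys and make the second one the fallback. The sketch below demonstrates the idea with Python's built-in `sqlite3` module standing in for PostgreSQL, with `?` in place of `$3` and a hypothetical two-column `executions` table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE executions (id INTEGER PRIMARY KEY, created_at INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO executions (id, created_at) VALUES (?, ?)",
    [(1, 100), (2, 200), (3, 300)],
)

# First key applies only for 'asc'; second key applies for every other value.
QUERY = """
SELECT id FROM executions
ORDER BY
    CASE WHEN ? = 'asc' THEN created_at END ASC,
    CASE WHEN ? <> 'asc' THEN created_at END DESC
"""


def list_ids(direction: str) -> list[int]:
    # The direction is bound twice because plain '?' placeholders are positional.
    return [row[0] for row in conn.execute(QUERY, (direction, direction))]
```

Here `list_ids("asc")` sorts oldest first, and any other direction, including an unexpected one, falls back to newest first.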

    Walkthrough

    The recent update focuses on improving the execution queries by refining selection criteria and enhancing status determination logic. The get_execution query now includes additional fields and a more detailed status mapping, while the list_executions query has been modified to join with the latest transition data, offering a comprehensive view of execution states. These changes aim to enhance the accuracy and detail of execution data retrieval.

    Changes

    File(s) | Summary
    agents-api/agents_api/queries/executions/get_execution.py | Enhanced query to include additional fields (developer_id, task_id, task_version) and refined status mapping. Improved error handling and output data logic.
    agents-api/agents_api/queries/executions/list_executions.py | Updated query to join with latest transition data using lateral join, providing detailed status mapping. Adjusted ordering logic to prioritize created_at timestamps.

    Comment on lines +46 to +49
    executions e
    LEFT JOIN transitions lt ON e.execution_id = lt.execution_id
    WHERE e.execution_id = $1
    ORDER BY lt.created_at DESC

    The ORDER BY lt.created_at DESC with LEFT JOIN can return incorrect status when there are no transitions, since NULL values from lt.created_at will be ordered last. Should use COALESCE(lt.created_at, e.created_at) in ORDER BY.

    📝 Committable Code Suggestion

    ‼️ Ensure you review the code suggestion before committing it to the branch. Make sure it replaces the highlighted code, contains no missing lines, and has no issues with indentation.

    Suggested change
    -executions e
    -LEFT JOIN transitions lt ON e.execution_id = lt.execution_id
    -WHERE e.execution_id = $1
    -ORDER BY lt.created_at DESC
    +executions e
    +LEFT JOIN transitions lt ON e.execution_id = lt.execution_id
    +WHERE e.execution_id = $1
    +ORDER BY COALESCE(lt.created_at, e.created_at) DESC
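
Whether the COALESCE is needed depends on the data. When the query filters to a single execution, the LEFT JOIN yields either only matched transition rows or exactly one row with NULL transition columns, so NULL ordering would matter only if `transitions.created_at` itself could be NULL. The sketch below illustrates this with Python's built-in `sqlite3` module standing in for PostgreSQL. The table definitions are hypothetical, and the `LIMIT 1` is an assumption based on the draft comments later in this thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE executions (execution_id TEXT PRIMARY KEY, created_at INTEGER NOT NULL);
    CREATE TABLE transitions (
        execution_id TEXT NOT NULL,
        type TEXT NOT NULL,
        created_at INTEGER NOT NULL
    );
    INSERT INTO executions VALUES ('no-transitions', 1), ('two-transitions', 2);
    INSERT INTO transitions VALUES
        ('two-transitions', 'init', 10),
        ('two-transitions', 'step', 20);
    """
)

# Same join shape as the quoted query, with '?' in place of $1.
QUERY = """
SELECT lt.type
FROM executions e
LEFT JOIN transitions lt ON e.execution_id = lt.execution_id
WHERE e.execution_id = ?
ORDER BY lt.created_at DESC
LIMIT 1
"""


def latest_type(execution_id: str):
    """Return the newest transition type, or None when there is none."""
    rows = conn.execute(QUERY, (execution_id,)).fetchall()
    return rows[0][0] if rows else None
```

For `'no-transitions'` the single joined row has a NULL type, which the status CASE would map to `'queued'`; for `'two-transitions'` the newest row (`'step'`) comes first.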

    Comment on lines 126 to 132
     [
     developer_id,
     task_id,
    -sort_by,
    +# sort_by,
     direction,
     limit,
     offset,

    Parameter binding mismatch after removing sort_by - $3 now binds to direction but query assumes it's for sort field comparison, leading to incorrect sorting.

    📝 Committable Code Suggestion

    ‼️ Ensure you review the code suggestion before committing it to the branch. Make sure it replaces the highlighted code, contains no missing lines, and has no issues with indentation.

    Suggested change
    -[
    -developer_id,
    -task_id,
    -sort_by,
    -# sort_by,
    -direction,
    -limit,
    -offset,
    +[
    +developer_id,
    +task_id,
    +direction,
    +limit,
    +offset,

    @ellipsis-dev ellipsis-dev bot left a comment

    ❌ Changes requested. Reviewed everything up to 0678317 in 2 minutes and 29 seconds

    More details
    • Looked at 132 lines of code in 2 files
    • Skipped 0 files when reviewing.
    • Skipped posting 5 drafted comments based on config settings.
    1. agents-api/agents_api/queries/executions/get_execution.py:12
    • Draft comment:
      Consider using SELECT * FROM latest_executions WHERE execution_id = $1 instead of duplicating the status derivation logic.

    • view latest_executions (000013_executions_continuous_view.up.sql)

    • Reason this comment was not posted:
      Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 30% vs. threshold = 50%
      The suggestion seems reasonable at first since it would reduce code duplication. However, there are subtle differences between the implementations. The query specifically orders by created_at and takes the latest transition, while the view uses a latest_transitions table which may have different semantics. Without understanding the full context of how latest_transitions works vs ordering by created_at, we can't be certain these approaches are equivalent.
      The implementations may have intentionally different semantics - the direct query with ORDER BY could be more reliable or performant than using the view in this specific case.
      While code duplication is generally bad, in this case the slight differences and lack of context about the equivalence of these approaches means we can't be confident the suggestion is correct.
      We should not keep this comment since we don't have enough context to be certain that using the view would be equivalent and correct.

    2. agents-api/agents_api/queries/executions/get_execution.py:47
    • Draft comment:
      Consider using a LEFT JOIN LATERAL (as in the list query) to guarantee only the latest transition per execution is selected. Without it, multiple matching transitions may lead to non-deterministic results.
    • Reason this comment was not posted:
      Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 30% vs. threshold = 50%
      The comment raises a valid point about query determinism. Both approaches (LEFT JOIN + ORDER BY + LIMIT 1 and LEFT JOIN LATERAL) would work to get the latest transition, but LEFT JOIN LATERAL is more explicit and potentially more performant as it clearly scopes the subquery per execution. However, since we're querying for a single execution_id, the current approach is already deterministic - there's no risk of mixing transitions from different executions.
      Since we're querying by a specific execution_id, the current approach is already deterministic. The suggested change might be a premature optimization.
      While the current approach works correctly, LEFT JOIN LATERAL would make the intent clearer and could be more maintainable if the query is modified in the future.
      The comment suggests a valid alternative approach, but the current implementation is already correct and deterministic for this specific use case.
    3. agents-api/agents_api/queries/executions/list_executions.py:129
    • Draft comment:
      The 'sort_by' parameter is validated and present in the function signature but commented out in the parameters passed to the query. Either remove it or pass it along to ensure consistent behavior.
    • Reason this comment was not posted:
      Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 50%
      The code shows clear signs of an intentional refactoring in progress. The SQL query has been modified to only support created_at sorting, with updated_at support commented out. The FIXME comment indicates this is a known limitation. The commented out sort_by parameter aligns with this temporary simplification. This appears to be an intentional intermediate state rather than an oversight.
      I could be wrong about this being intentional - maybe the developer forgot to update the function signature and validation to match the new query structure. The inconsistency between signature and implementation could cause confusion.
      The presence of the FIXME comment and the carefully commented-out sections in both the SQL and parameters strongly suggests this is an intentional temporary state rather than an oversight.
      The comment should be deleted as it's highlighting an intentional temporary state that's already acknowledged by the FIXME comment, not a bug that needs fixing.
    4. agents-api/agents_api/queries/executions/get_execution.py:47
    • Draft comment:
      Consider using a LATERAL join (as in list_executions) to consistently select the latest transition, reducing ambiguity when multiple transitions exist.
    • Reason this comment was not posted:
      Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 30% vs. threshold = 50%
      Both approaches (LATERAL join vs ORDER BY + LIMIT) can work to get the latest transition. While LATERAL join might be slightly more performant, the current approach is clear and functional. Without seeing performance metrics or understanding the scale of data, it's hard to say if this optimization would have meaningful impact. The comment is somewhat speculative about potential ambiguity issues.
      I might be underestimating the performance impact of LATERAL joins. There could be edge cases where the current approach leads to race conditions or inconsistencies.
      While those concerns are valid, the current implementation is clear and functional. Without concrete evidence of performance issues or actual ambiguity problems, this feels like premature optimization.
      The comment should be removed as it suggests an alternative implementation without clear evidence that it would be better than the current approach.
    5. agents-api/agents_api/queries/executions/list_executions.py:129
    • Draft comment:
      The 'sort_by' parameter is validated but commented out in the parameter list and not used in the ORDER BY clause. This ignores ordering by updated_at.
    • Reason this comment was not posted:
      Marked as duplicate.

    Workflow ID: wflow_eFDXgc75TetqN6yG


    -developer_id = $1 AND
    -task_id = $2
    +e.developer_id = $1 AND
    +e.task_id = $2
     ORDER BY

    The ORDER BY only handles 'created_at' based on direction, ignoring the validated 'sort_by' (which includes 'updated_at'). Update the ordering logic or remove 'sort_by' if not needed.

    @Ahmad-mtos Ahmad-mtos merged commit 5c95a4c into dev Feb 26, 2025
    14 checks passed
    @Ahmad-mtos Ahmad-mtos deleted the x/executions-status branch February 26, 2025 13:05