Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix explain action on query rewrite #17277

Closed
wants to merge 2 commits into from

Conversation

fen-qin
Copy link

@fen-qin fen-qin commented Feb 6, 2025

Description

There is an exception when call explain in "by_doc_id" mode. Response looks like this:

{
    "error": {
        "root_cause": [
            {
                "type": "query_shard_exception",
                "reason": "failed to create query: async actions are left after rewrite",
                "index": "my-nlp-index-1",
                "index_uuid": "I6_pzGG3QBWw0B6KeOJ1pA"
            }
        ],
        "type": "query_shard_exception",
        "reason": "failed to create query: async actions are left after rewrite",
        "index": "my-nlp-index-1",
        "index_uuid": "I6_pzGG3QBWw0B6KeOJ1pA",
        "caused_by": {
            "type": "illegal_state_exception",
            "reason": "async actions are left after rewrite"
        }
    },
    "status": 400
}

The error were caught from TransportExplainAction -> Rewriteable.java because there still be asyncActions left when the flag assertNoAsyncTasks returned as true.

Screenshot 2025-01-29 at 8 29 43 PM

The conflict is down to:

  1. The NeuralQueryBuilder doRewrite method introduces asynchronous actions via queryRewriteContext.registerAsyncAction. So it then returns a context with potentially unresolved async tasks.

  2. Rewritable.rewrite then checksassertNoAsyncTasks flag, throws the exception if there exists any async tasks left.

Fix in this PR is resolve this conflict:

  • by moving the query rewrite to coordinator - TransportExplainAction.
  • verify the changes from local
    • Before fix
    • Screenshot 2025-01-30 at 9 30 29 PM
    • After fix
    • Screenshot 2025-01-30 at 9 24 10 PM

Related Issues

Resolves #1126

Check List

  • Functionality includes testing.
  • API changes companion pull request created, if applicable.
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@@ -101,7 +104,21 @@ public TransportExplainAction(
@Override
protected void doExecute(Task task, ExplainRequest request, ActionListener<ExplainResponse> listener) {
request.nowInMillis = System.currentTimeMillis();
super.doExecute(task, request, listener);
// super.doExecute(task, request, listener);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you want to remove this line.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure, will remove this line in next revision

super.doExecute(task, request, listener);
// super.doExecute(task, request, listener);
LongSupplier timeProvider = () -> request.nowInMillis;
ActionListener<QueryBuilder> rewriteListener = ActionListener.wrap(rewrittenQuery -> {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why don't we simply switch from rewrite to rewriteAndFetch since now know this process can have async actions? Now basically we are doing rewriteAndFetch here and later I think we will still will do rewrite again.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here is Rewriteable javadoc - https://artifacts.elastic.co/javadoc/org/elasticsearch/elasticsearch/8.17.1/org.elasticsearch.server/org/elasticsearch/index/query/Rewriteable.html

  • rewrite -
    Rewrites each element of the list until it doesn't change and returns a new list iff there is at least one element of the list that changed during it's rewrite.

  • rewriteAndFetch -
    Rewrites the given rewriteable and fetches pending async tasks for each round before rewriting again.

  • so we would need rewrite to be executed here

Copy link
Contributor

github-actions bot commented Feb 6, 2025

❌ Gradle check result for b1f53c0: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Contributor

github-actions bot commented Feb 6, 2025

❌ Gradle check result for 11ebc0d: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@fen-qin fen-qin closed this Feb 6, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Enhancement or improvement to existing feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add instrumentation for inode usage by Opensearch process
2 participants