# [Concurrent Segment Search] Add support for query profiler with concurrent aggregation #8330
### Background

The search profile provides detailed timing information about the execution of the individual components of a search request, helping users understand why a request is slow.
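For reference, the profile responses shown throughout this issue are produced by setting `profile` to `true` on a search request, for example:

```json
POST /index/_search
{
  "profile": true,
  "query": { "match_all": {} }
}
```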
Currently, with the model of concurrent search, the query portion of the profile response sums each timing and count stat across all slices into a single cumulative value, as implemented in `toBreakdownMap()`:

```java
@Override
public Map<String, Long> toBreakdownMap() {
final Map<String, Long> map = new HashMap<>(buildBreakdownMap(this));
for (final AbstractProfileBreakdown<QueryTimingType> context : contexts.values()) {
for (final Map.Entry<String, Long> entry : buildBreakdownMap(context).entrySet()) {
map.merge(entry.getKey(), entry.getValue(), Long::sum);
}
}
return map;
}
```

### The Proposed Solution

In the concurrent search case, keep the cumulative totals but additionally expose slice-level summary statistics: `max_slice_time_in_nanos`, `min_slice_time_in_nanos`, and `avg_slice_time_in_nanos` for the overall query time, plus `max_`/`min_`/`avg_` variants for each breakdown stat and count, as shown in the sketch and sample responses below.
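A minimal sketch of how such statistics could be derived from the per-slice breakdown maps; the class and method names here are illustrative, not the actual OpenSearch implementation:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch: combine per-slice breakdown maps into cumulative
// totals plus the proposed max/min/avg statistics per breakdown stat.
final class SliceBreakdownAggregator {

    static Map<String, Long> aggregate(final List<Map<String, Long>> sliceBreakdowns) {
        final Map<String, Long> totals = new HashMap<>();
        final Map<String, Long> result = new HashMap<>();

        for (final Map<String, Long> slice : sliceBreakdowns) {
            for (final Map.Entry<String, Long> entry : slice.entrySet()) {
                final String key = entry.getKey();
                final long value = entry.getValue();
                totals.merge(key, value, Long::sum);          // cumulative, as today
                result.merge("max_" + key, value, Long::max); // slowest slice
                result.merge("min_" + key, value, Long::min); // fastest slice
            }
        }

        final int sliceCount = sliceBreakdowns.size();
        for (final Map.Entry<String, Long> entry : totals.entrySet()) {
            result.put(entry.getKey(), entry.getValue());
            if (sliceCount > 0) {
                // average across slices, matching the avg_* fields below
                result.put("avg_" + entry.getKey(), entry.getValue() / sliceCount);
            }
        }
        return result;
    }
}
```

Note that shard-level stats such as `create_weight`, which are recorded once per shard rather than once per slice, would be excluded from the max/min/avg treatment, consistent with the sample response below where `create_weight` has no `max_`/`min_`/`avg_` variants.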
### Sample query response

**Non-concurrent search**

```json
{
...
"profile":
{
"shards":
[
{
"id": "[aggqMTQ4QbOShmjsbcXtrQ][index][0]",
"inbound_network_time_in_millis": 0,
"outbound_network_time_in_millis": 0,
"searches":
[
{
"query":
[
{
"type": "MatchAllDocsQuery",
"description": "*:*",
"time_in_nanos": 382251,
"breakdown":
{
"set_min_competitive_score_count": 0,
"match_count": 0,
"shallow_advance_count": 0,
"set_min_competitive_score": 0,
"next_doc": 6708,
"match": 0,
"next_doc_count": 11,
"score_count": 7,
"compute_max_score_count": 0,
"compute_max_score": 0,
"advance": 1875,
"advance_count": 3,
"score": 2959,
"build_scorer_count": 6,
"create_weight": 102875,
"shallow_advance": 0,
"create_weight_count": 1,
"build_scorer": 267834
}
}
],
"rewrite_time": 8501,
"collector": [...]
}
],
"aggregations": [...]
}
]
}
}
```

**Concurrent search with 3 segment slices**

```json
{
...
"profile":
{
"shards":
[
{
"id": "[09K0cflDSwO6kVx95Gj-eA][index][0]",
"inbound_network_time_in_millis": 0,
"outbound_network_time_in_millis": 0,
"searches":
[
{
"query":
[
{
"type": "MatchAllDocsQuery",
"description": "*:*",
"time_in_nanos": 271500,
"max_slice_time_in_nanos": 271500,
"min_slice_time_in_nanos": 232340,
"avg_slice_time_in_nanos": 258877,
"breakdown":
{
"create_weight": 171556,
"create_weight_count": 1,
"build_scorer": 1691375,
"max_build_scorer": 576136,
"min_build_scorer": 551446,
"avg_build_scorer": 563791,
"build_scorer_count": 6,
"max_build_scorer_count": 2,
"min_build_scorer_count": 2,
"avg_build_scorer_count": 2,
"next_doc": 3834,
"max_next_doc": 1334,
"min_next_doc": 1222,
"avg_next_doc": 1278,
"next_doc_count": 8,
"max_next_doc_count": 3,
"min_next_doc_count": 2,
"avg_next_doc_count": 3,
...
}
}
],
"rewrite_time": 10083,
"collector": [...]
}
],
"aggregations": [...]
}
]
}
}
```

### Alternatives

**1. Expand query profile stats across all slices with concurrent execution**

Instead of exposing max/min/avg summary statistics, emit a separate query profile entry for each slice, so the response carries the full timing breakdown of every slice.

Sample query response for a concurrent search with 3 segment slices:

```json
{
...
"profile":
{
"shards":
[
{
"id": "[09K0cflDSwO6kVx95Gj-eA][index][0]",
"inbound_network_time_in_millis": 0,
"outbound_network_time_in_millis": 0,
"searches":
[
{
"query":
[
{
"type": "MatchAllDocsQuery",
"description": "*:*",
"time_in_nanos": 90500,
"breakdown":
{
"set_min_competitive_score_count": 0,
"match_count": 0,
"shallow_advance_count": 0,
"set_min_competitive_score": 0,
"next_doc": 3834,
"match": 0,
"next_doc_count": 8,
"score_count": 8,
"compute_max_score_count": 0,
"compute_max_score": 0,
"advance": 2958,
"advance_count": 3,
"score": 5459,
"build_scorer_count": 6,
"create_weight": 271500,
"shallow_advance": 0,
"create_weight_count": 1,
"build_scorer": 1691375
}
},
{
"type": "MatchAllDocsQuery",
"description": "*:*",
"time_in_nanos": 90500,
"breakdown":
{
"set_min_competitive_score_count": 0,
"match_count": 0,
"shallow_advance_count": 0,
"set_min_competitive_score": 0,
"next_doc": 2334,
"match": 0,
"next_doc_count": 8,
"score_count": 5,
"compute_max_score_count": 0,
"compute_max_score": 0,
"advance": 1587,
"advance_count": 3,
"score": 5125,
"build_scorer_count": 2,
"create_weight": 271500,
"shallow_advance": 0,
"create_weight_count": 1,
"build_scorer": 1691234
}
},
{...}
],
"rewrite_time": 10083,
"collector": [...]
}
],
"aggregations": [...]
}
]
}
}
```

**2. Keep the current cumulative summary version of the timing breakdown**

No code changes are required for this approach. However, it is important to note that we would lose the slice-level profile details.

Sample query response for a concurrent search with 3 segment slices:

```json
{
...
"profile":
{
"shards":
[
{
"id": "[OHv1ByGtTHqVf55eR4Bvsg][index][0]",
"inbound_network_time_in_millis": 0,
"outbound_network_time_in_millis": 0,
"searches":
[
{
"query":
[
{
"type": "MatchAllDocsQuery",
"description": "*:*",
"time_in_nanos": 129584,
"breakdown":
{
"set_min_competitive_score_count": 0,
"match_count": 0,
"shallow_advance_count": 0,
"set_min_competitive_score": 0,
"next_doc": 15418,
"match": 0,
"next_doc_count": 26,
"score_count": 16,
"compute_max_score_count": 0,
"compute_max_score": 0,
"advance": 3042,
"advance_count": 3,
"score": 7876,
"build_scorer_count": 6,
"create_weight": 129584,
"shallow_advance": 0,
"create_weight_count": 1,
"build_scorer": 765164
}
}
],
"rewrite_time": 11333,
"collector": [...]
}
],
"aggregations": [...]
}
]
}
}
```
---

@ticheng-aws thanks for the proposal.
This is correct, the timing is cumulative. When I originally worked on the feature, the ideal picture for the profiling I had in mind was "perceived time", not avg/min/max. What I mean by that: we know that some slices will be processed concurrently, and some will be queued (and hopefully processed concurrently later on). By "perceived time" I envisioned the time between the first chunk of slices being picked up and the last chunk of slices being completed. That is the time consumers will see as the search execution. The proposal by itself adds useful information; I think what is missed is the fact that there is concurrency involved (so the numbers don't add up; rather, they interleave, in the general case).
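A toy illustration of that distinction, using hypothetical slice timings (not taken from the responses above):

```java
/**
 * Toy illustration with hypothetical timings: why the cumulative sum of
 * slice times differs from the "perceived" wall-clock time when slices
 * run concurrently or sit in a queue.
 */
public class PerceivedTimeDemo {
    public static void main(String[] args) {
        // {startNanos, endNanos} relative to query start: slices A and B run
        // concurrently on two threads; slice C is queued until B finishes.
        long[][] slices = { { 0, 250_000 }, { 0, 230_000 }, { 230_000, 460_000 } };

        long cumulative = 0;
        long firstStart = Long.MAX_VALUE;
        long lastEnd = Long.MIN_VALUE;
        for (long[] s : slices) {
            cumulative += s[1] - s[0];
            firstStart = Math.min(firstStart, s[0]);
            lastEnd = Math.max(lastEnd, s[1]);
        }

        System.out.println("cumulative slice time: " + cumulative);      // 710000
        System.out.println("perceived time: " + (lastEnd - firstStart)); // 460000
    }
}
```

The cumulative figure overstates what a consumer actually waits for whenever slices overlap.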
Placeholder issue to support the query profiler with the concurrent aggregation flow (both global and non-global aggregation). This is a subtask of #7354.