Unify precomputation of aggregations behind a common API #16733

Merged: 5 commits into opensearch-project:main from agg_precomputation_API on Jan 30, 2025

@msfroh msfroh commented Nov 27, 2024

Description

We've had a series of aggregation speedups that use the same strategy: instead of iterating through documents that match the query one-by-one, we can look at a Lucene segment and compute the aggregation directly (if some particular conditions are met).

In every case, we've hooked that into custom logic that hijacks the getLeafCollector method and throws CollectionTerminatedException. This creates the illusion that we're implementing a custom LeafCollector, when really we're not collecting at all (which is the whole point).

With this refactoring, the mechanism (hijacking getLeafCollector) is moved into AggregatorBase. Aggregators that have a strategy to precompute their answer can override tryPrecomputeAggregationForLeaf, which is expected to return true if they managed to precompute.

This should also make it easier to keep track of which aggregations have precomputation approaches (since they override this method).
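The pattern described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual OpenSearch code: `Segment`, `CollectionTerminated`, `LeafCollector`, `AggregatorBaseSketch`, `newLeafCollector`, `MaxSketch`, and `PrecomputeDemo` are hypothetical stand-ins (the real classes work with Lucene's `LeafReaderContext` and `CollectionTerminatedException`), and "the segment has no deletions" is only an assumed example of a condition under which precomputation is safe. Only the method names `getLeafCollector` and `tryPrecomputeAggregationForLeaf` come from the description.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for a Lucene segment (LeafReaderContext). */
final class Segment {
    final long[] liveValues;    // values of the documents that are still live
    final long storedMax;       // segment-level statistic; may reflect deleted docs
    final boolean hasDeletions;

    Segment(long[] liveValues, long storedMax, boolean hasDeletions) {
        this.liveValues = liveValues;
        this.storedMax = storedMax;
        this.hasDeletions = hasDeletions;
    }
}

/** Stand-in for Lucene's CollectionTerminatedException. */
final class CollectionTerminated extends RuntimeException {}

/** Simplified per-segment collector: receives one doc ID at a time. */
interface LeafCollector {
    void collect(int doc);
}

/** Sketch of the base class that owns the "hijack getLeafCollector" mechanism. */
abstract class AggregatorBaseSketch {
    /** Subclasses override this and return true if they handled the whole segment. */
    protected boolean tryPrecomputeAggregationForLeaf(Segment segment) {
        return false;
    }

    /** Hypothetical name for the normal, doc-by-doc collection path. */
    protected abstract LeafCollector newLeafCollector(Segment segment);

    public final LeafCollector getLeafCollector(Segment segment) {
        if (tryPrecomputeAggregationForLeaf(segment)) {
            // Nothing left to collect for this segment.
            throw new CollectionTerminated();
        }
        return newLeafCollector(segment);
    }
}

/** Example aggregator: max, precomputed only when the segment has no deletions. */
final class MaxSketch extends AggregatorBaseSketch {
    long max = Long.MIN_VALUE;
    int docsVisited = 0;

    @Override
    protected boolean tryPrecomputeAggregationForLeaf(Segment segment) {
        if (segment.hasDeletions) {
            return false; // storedMax might belong to a deleted document
        }
        max = Math.max(max, segment.storedMax);
        return true;
    }

    @Override
    protected LeafCollector newLeafCollector(Segment segment) {
        return doc -> {
            docsVisited++;
            max = Math.max(max, segment.liveValues[doc]);
        };
    }
}

final class PrecomputeDemo {
    /** Simplified search loop: a terminated segment is skipped entirely. */
    static MaxSketch run(List<Segment> segments) {
        MaxSketch agg = new MaxSketch();
        for (Segment segment : segments) {
            LeafCollector leafCollector;
            try {
                leafCollector = agg.getLeafCollector(segment);
            } catch (CollectionTerminated e) {
                continue; // answer for this segment was precomputed
            }
            for (int doc = 0; doc < segment.liveValues.length; doc++) {
                leafCollector.collect(doc);
            }
        }
        return agg;
    }

    public static void main(String[] args) {
        MaxSketch agg = run(Arrays.asList(
            new Segment(new long[] {3, 9, 4}, 9, false),
            new Segment(new long[] {7, 12}, 50, true)));
        System.out.println("max=" + agg.max + ", docsVisited=" + agg.docsVisited);
    }
}
```

In this sketch only the second segment is iterated document by document; the first contributes its stored maximum directly. An aggregator opts in simply by overriding `tryPrecomputeAggregationForLeaf`, which is what makes the set of precomputing aggregators easy to find.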

Related Issues

N/A

Check List

  • Functionality includes testing.
  • API changes companion pull request created, if applicable.
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Contributor

❌ Gradle check result for 4d5c32b: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@sandeshkr419
Contributor

Regarding the implementation, I have one more alternative that I think is worth discussing: how about bringing this abstraction into ContextIndexSearcher itself?

            weight = wrapWeight(weight);
            // See please https://github.com/apache/lucene/pull/964
            collector.setWeight(weight);
            leafCollector = collector.getLeafCollector(ctx);

Basically, if the aggregations are already precomputed, we assign an EarlyTerminationCollector as the leaf collector.

What I have in mind is cases with sub-aggregations that we can precompute, which is highly relevant to star-tree precomputation (for example, #16674), and whether a dedicated abstraction for star-tree precomputation in ContextIndexSearcher would make more sense.
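One possible reading of this alternative, sketched under heavy assumptions: the search loop itself asks the collector whether a segment can be precomputed, so no exception is needed for control flow. Everything here (`Leaf`, `PrecomputingCollector`, `tryPrecompute`, `SearcherSketch`, `CountingCollector`, and the `precomputable` flag standing in for something like star-tree availability) is hypothetical and does not reflect the real ContextIndexSearcher API.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for one index segment. */
final class Leaf {
    final int docCount;
    final boolean precomputable; // assumed flag, e.g. precomputed data covers this segment

    Leaf(int docCount, boolean precomputable) {
        this.docCount = docCount;
        this.precomputable = precomputable;
    }
}

/** Hypothetical contract: the searcher asks before it starts collecting. */
interface PrecomputingCollector {
    boolean tryPrecompute(Leaf leaf);

    void collect(Leaf leaf, int doc);
}

final class SearcherSketch {
    /** Runs the per-segment loop and returns how many docs were iterated. */
    static int search(List<Leaf> leaves, PrecomputingCollector collector) {
        int visited = 0;
        for (Leaf leaf : leaves) {
            if (collector.tryPrecompute(leaf)) {
                continue; // segment answered without visiting documents
            }
            for (int doc = 0; doc < leaf.docCount; doc++) {
                collector.collect(leaf, doc);
                visited++;
            }
        }
        return visited;
    }
}

/** Example collector: counts documents, using the segment total when allowed. */
final class CountingCollector implements PrecomputingCollector {
    long count = 0;

    @Override
    public boolean tryPrecompute(Leaf leaf) {
        if (!leaf.precomputable) {
            return false;
        }
        count += leaf.docCount;
        return true;
    }

    @Override
    public void collect(Leaf leaf, int doc) {
        count++;
    }
}

final class SearcherDemo {
    public static void main(String[] args) {
        CountingCollector collector = new CountingCollector();
        int visited = SearcherSketch.search(
            Arrays.asList(new Leaf(5, true), new Leaf(3, false)), collector);
        System.out.println("count=" + collector.count + ", visited=" + visited);
    }
}
```

The trade-off compared with the AggregatorBase approach is where the decision lives: here the searcher owns it, which may suit precomputation that spans sub-aggregations, at the cost of widening the collector contract.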

Contributor

✅ Gradle check result for 4d5c32b: SUCCESS


codecov bot commented Dec 12, 2024

Codecov Report

Attention: Patch coverage is 82.05128% with 14 lines in your changes missing coverage. Please review.

Project coverage is 72.32%. Comparing base (cd149a9) to head (1c3c990).
Report is 11 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...rch/search/aggregations/metrics/MinAggregator.java | 60.00% | 2 Missing and 2 partials ⚠️ |
| ...ket/terms/GlobalOrdinalsStringTermsAggregator.java | 84.21% | 2 Missing and 1 partial ⚠️ |
| ...rch/aggregations/metrics/ValueCountAggregator.java | 66.66% | 2 Missing and 1 partial ⚠️ |
| ...rch/search/aggregations/metrics/MaxAggregator.java | 80.00% | 1 Missing and 1 partial ⚠️ |
| ...rch/search/aggregations/metrics/AvgAggregator.java | 85.71% | 1 Missing ⚠️ |
| ...rch/search/aggregations/metrics/SumAggregator.java | 87.50% | 1 Missing ⚠️ |
Additional details and impacted files
@@             Coverage Diff              @@
##               main   #16733      +/-   ##
============================================
- Coverage     72.41%   72.32%   -0.09%     
- Complexity    65626    65712      +86     
============================================
  Files          5306     5319      +13     
  Lines        304927   305722     +795     
  Branches      44257    44348      +91     
============================================
+ Hits         220804   221107     +303     
- Misses        66007    66573     +566     
+ Partials      18116    18042      -74     


@msfroh
Collaborator Author

msfroh commented Jan 10, 2025

@jainankitk -- you're probably the maintainer (other than me) with the most context on this change. What do you think?

@msfroh msfroh force-pushed the agg_precomputation_API branch from c3897a0 to 19a40cc Compare January 29, 2025 20:06
@msfroh
Collaborator Author

msfroh commented Jan 29, 2025

@expani, @sandeshkr419 -- I resolved conflicts with your recent star-tree changes. Can you please take a look?

Contributor

❌ Gradle check result for 19a40cc: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@sandeshkr419
Contributor

One high-level class I see missing among the metric aggregators is AvgAggregator.java, which involves similar precomputation.

@msfroh msfroh force-pushed the agg_precomputation_API branch from 4ac8bcb to caceb62 Compare January 29, 2025 23:13
Contributor

✅ Gradle check result for caceb62: SUCCESS

Contributor

✅ Gradle check result for 1c3c990: SUCCESS

@jainankitk jainankitk merged commit 2847695 into opensearch-project:main Jan 30, 2025
30 checks passed
@jainankitk jainankitk added the backport 2.x label (Backport to 2.x branch) Jan 30, 2025
opensearch-trigger-bot bot pushed a commit that referenced this pull request Jan 30, 2025
* Unify precomputation of aggregations behind a common API

We've had a series of aggregation speedups that use the same strategy:
instead of iterating through documents that match the query
one-by-one, we can look at a Lucene segment and compute the
aggregation directly (if some particular conditions are met).

In every case, we've hooked that into custom logic that hijacks the
getLeafCollector method and throws CollectionTerminatedException. This
creates the illusion that we're implementing a custom LeafCollector,
when really we're not collecting at all (which is the whole point).

With this refactoring, the mechanism (hijacking getLeafCollector) is
moved into AggregatorBase. Aggregators that have a strategy to
precompute their answer can override tryPrecomputeAggregationForLeaf,
which is expected to return true if they managed to precompute.

This should also make it easier to keep track of which aggregations
have precomputation approaches (since they override this method).

Signed-off-by: Michael Froh <[email protected]>

* Remove subaggregator check from CompositeAggregator

Not sure why I added this, when the existing implementation didn't have it.

That said, we *should* call finishLeaf() before precomputing the current leaf.

Signed-off-by: Michael Froh <[email protected]>

* Resolve conflicts with star-tree changes

Signed-off-by: Michael Froh <[email protected]>

* Skip precomputation when valuesSource is null

Signed-off-by: Michael Froh <[email protected]>

* Add comment as suggested by @bowenlan-amzn

Signed-off-by: Michael Froh <[email protected]>

---------

Signed-off-by: Michael Froh <[email protected]>
(cherry picked from commit 2847695)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
msfroh pushed a commit that referenced this pull request Jan 30, 2025
…7197)

(cherry picked from commit 2847695)

Signed-off-by: Michael Froh <[email protected]>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
@sandeshkr419
Contributor

sandeshkr419 commented Jan 30, 2025

@msfroh Since this change is not a feature update, should we create a backport to 2.19 as well?

One major advantage I see in backporting to 2.19 is that any critical bug fixes we need in 2.19 later can be backported easily, without too many manual changes. Thoughts?

cc - @rishabh6788 (2.19 Release Manager)

@msfroh
Collaborator Author

msfroh commented Jan 30, 2025

> @msfroh Since this change is not a feature update, should we create a backport to 2.19 as well?
>
> One major advantage I see in backporting to 2.19 is that any critical bug fixes we need in 2.19 later can be backported easily, without too many manual changes. Thoughts?
>
> cc - @rishabh6788 (2.19 Release Manager)

That's a good question. Part of me says, "Well, I missed the 2.19 cut-off, so too bad". On the other hand, your argument about avoiding merge conflicts is also relevant. I'll defer to @rishabh6788's judgement.

opensearch-trigger-bot bot pushed a commit that referenced this pull request Jan 30, 2025
@sandeshkr419
Contributor

sandeshkr419 commented Jan 30, 2025

Discussed with @rishabh6788 offline. We agreed to include this for the aforementioned reason. Adding the backport 2.19 label so that the bot creates a backport PR.

@sandeshkr419 sandeshkr419 added the v2.19.0 label (Issues and PRs related to version 2.19.0) Jan 30, 2025
Labels
backport 2.x (Backport to 2.x branch), backport 2.19, skip-changelog, v2.19.0 (Issues and PRs related to version 2.19.0)
Projects
Status: Done
6 participants