ESQL: Speed up VALUES for many buckets (elastic#123073)
Speeds up the VALUES agg when collecting from many buckets.
Specifically, this speeds up the algorithm used to `finish` the
aggregation, making it more tolerant of large numbers of groups
being collected. The old algorithm was `O(n^2)` in the number of
groups. The new one is `O(n)`.

```
 groups   old algorithm         ->  new algorithm
      1     219.683 ±    1.069  ->   223.477 ±    1.990 ms/op
   1000     426.323 ±   75.963  ->   463.670 ±    7.275 ms/op
 100000   36690.871 ± 4656.350  ->  7800.332 ± 2775.869 ms/op
 200000   89422.113 ± 2972.606  -> 21920.288 ± 3427.962 ms/op
 400000 timed out at 10 minutes -> 40051.524 ± 2011.706 ms/op
```

The `1` group version was not changed at all. The difference there is
just noise in the measurement. The small bump in the `1000` case is
almost certainly real and worth it. The huge drop in the `100000` case
is quite real.
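
The `O(n^2)` versus `O(n)` difference described above can be illustrated with a minimal sketch. This is not the Elasticsearch implementation: the class and method names are hypothetical, and it assumes the collected values are held as a flat, already deduplicated list of (group, value) pairs. The first method rescans every pair once per group, which is quadratic when each group holds at least one value. The second uses a counting sort over the group ids, so the work is proportional to the number of groups plus the number of pairs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration only; none of these names come from the Elasticsearch code base. */
public class ValuesFinishSketch {

    /** Rescans every collected pair once per group: O(groups * pairs). */
    public static List<List<Long>> finishByRescanning(int[] groups, long[] values, int groupCount) {
        List<List<Long>> result = new ArrayList<>();
        for (int g = 0; g < groupCount; g++) {
            List<Long> bucket = new ArrayList<>();
            for (int i = 0; i < groups.length; i++) {
                if (groups[i] == g) {
                    bucket.add(values[i]);
                }
            }
            result.add(bucket);
        }
        return result;
    }

    /** Counting sort of pair indices by group id: O(groups + pairs). */
    public static List<List<Long>> finishByCountingSort(int[] groups, long[] values, int groupCount) {
        // starts[g] becomes the number of pairs whose group id is less than g.
        int[] starts = new int[groupCount + 1];
        for (int g : groups) {
            starts[g + 1]++;
        }
        for (int g = 0; g < groupCount; g++) {
            starts[g + 1] += starts[g];
        }
        // order lists the pair indices sorted by group, keeping insertion order within a group.
        int[] next = Arrays.copyOf(starts, groupCount);
        int[] order = new int[groups.length];
        for (int i = 0; i < groups.length; i++) {
            order[next[groups[i]]++] = i;
        }
        // Each group's values now sit in one contiguous slice of order.
        List<List<Long>> result = new ArrayList<>();
        for (int g = 0; g < groupCount; g++) {
            List<Long> bucket = new ArrayList<>();
            for (int p = starts[g]; p < starts[g + 1]; p++) {
                bucket.add(values[order[p]]);
            }
            result.add(bucket);
        }
        return result;
    }

    public static void main(String[] args) {
        int[] groups = {1, 0, 1, 2, 0};
        long[] values = {10, 20, 30, 40, 50};
        System.out.println(finishByRescanning(groups, values, 3));
        System.out.println(finishByCountingSort(groups, values, 3)); // prints [[20, 50], [10, 30], [40]]
    }
}
```

Both methods return the same buckets. Only the amount of work per group differs, which would be consistent with the two approaches performing similarly at small group counts and diverging sharply at `100000` groups and beyond.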
nik9000 committed Feb 23, 2025
1 parent b7da5d9 commit af006bd
Showing 8 changed files with 698 additions and 183 deletions.
@@ -60,6 +60,9 @@
import java.util.stream.LongStream;
import java.util.stream.Stream;

/**
* Benchmark for many different kinds of aggregator and groupings.
*/
@Warmup(iterations = 5)
@Measurement(iterations = 7)
@BenchmarkMode(Mode.AverageTime)
5 changes: 5 additions & 0 deletions docs/changelog/123073.yaml
@@ -0,0 +1,5 @@
pr: 123073
summary: Speed up VALUES for many buckets
area: ES|QL
type: bug
issues: []
