Allocation free DataBlockCache lookups #8140

Merged

richardstartin
Member

@richardstartin richardstartin commented Feb 5, 2022

ColumnTypePair instances show up just behind int[]/double[] in allocation profiles of query execution.
[Screenshot: allocation profile]

This change avoids allocating them by using the DataType as a first-level lookup into an EnumMap (a small array behind the scenes) whose values are each a Set<String>: the set of columns for that data type. This reduces the allocation rate and improves (or at least does not regress) performance for a range of queries:

master

Benchmark                                                (_intBaseValue)  (_numRows)                                                                                                                                                                                                                                                         (_query)  Mode  Cnt        Score         Error   Units
BenchmarkQueries.query                                                 0     1500000                                                                                                                                                                                                                             SELECT SUM(RAW_INT_COL) FROM MyTable  avgt    5    17463.734 ±     355.827   us/op
BenchmarkQueries.query:·gc.alloc.rate.norm                             0     1500000                                                                                                                                                                                                                             SELECT SUM(RAW_INT_COL) FROM MyTable  avgt    5   608548.162 ± 1016513.369    B/op
BenchmarkQueries.query                                                 0     1500000                                                        SELECT SUM(INT_COL) FILTER(WHERE INT_COL > 123 AND INT_COL < 599999),MAX(INT_COL) FILTER(WHERE INT_COL > 123 AND INT_COL < 599999) FROM MyTable WHERE NO_INDEX_INT_COL > 5 AND NO_INDEX_INT_COL < 1499999  avgt    5    23784.153 ±    1163.643   us/op
BenchmarkQueries.query:·gc.alloc.rate.norm                             0     1500000                                                        SELECT SUM(INT_COL) FILTER(WHERE INT_COL > 123 AND INT_COL < 599999),MAX(INT_COL) FILTER(WHERE INT_COL > 123 AND INT_COL < 599999) FROM MyTable WHERE NO_INDEX_INT_COL > 5 AND NO_INDEX_INT_COL < 1499999  avgt    5   426997.395 ±  420169.858    B/op
BenchmarkQueries.query                                                 0     1500000  SELECT SUM(CASE WHEN (INT_COL > 123 AND INT_COL < 599999) THEN INT_COL ELSE 0 END) AS total_sum,MAX(CASE WHEN (INT_COL > 123 AND INT_COL < 599999) THEN INT_COL ELSE 0 END) AS total_avg FROM MyTable WHERE NO_INDEX_INT_COL > 5 AND NO_INDEX_INT_COL < 1499999  avgt    5    52328.177 ±    2807.010   us/op
BenchmarkQueries.query:·gc.alloc.rate.norm                             0     1500000  SELECT SUM(CASE WHEN (INT_COL > 123 AND INT_COL < 599999) THEN INT_COL ELSE 0 END) AS total_sum,MAX(CASE WHEN (INT_COL > 123 AND INT_COL < 599999) THEN INT_COL ELSE 0 END) AS total_avg FROM MyTable WHERE NO_INDEX_INT_COL > 5 AND NO_INDEX_INT_COL < 1499999  avgt    5   998184.088 ± 1572072.289    B/op

branch

Benchmark                                                (_intBaseValue)  (_numRows)                                                                                                                                                                                                                                                         (_query)  Mode  Cnt       Score         Error   Units
BenchmarkQueries.query                                                 0     1500000                                                                                                                                                                                                                             SELECT SUM(RAW_INT_COL) FROM MyTable  avgt    5   17374.187 ±     334.140   us/op
BenchmarkQueries.query:·gc.alloc.rate.norm                             0     1500000                                                                                                                                                                                                                             SELECT SUM(RAW_INT_COL) FROM MyTable  avgt    5  603245.214 ± 1004990.185    B/op
BenchmarkQueries.query                                                 0     1500000                                                        SELECT SUM(INT_COL) FILTER(WHERE INT_COL > 123 AND INT_COL < 599999),MAX(INT_COL) FILTER(WHERE INT_COL > 123 AND INT_COL < 599999) FROM MyTable WHERE NO_INDEX_INT_COL > 5 AND NO_INDEX_INT_COL < 1499999  avgt    5   23116.649 ±     753.629   us/op
BenchmarkQueries.query:·gc.alloc.rate.norm                             0     1500000                                                        SELECT SUM(INT_COL) FILTER(WHERE INT_COL > 123 AND INT_COL < 599999),MAX(INT_COL) FILTER(WHERE INT_COL > 123 AND INT_COL < 599999) FROM MyTable WHERE NO_INDEX_INT_COL > 5 AND NO_INDEX_INT_COL < 1499999  avgt    5  422940.704 ±  411258.808    B/op
BenchmarkQueries.query                                                 0     1500000  SELECT SUM(CASE WHEN (INT_COL > 123 AND INT_COL < 599999) THEN INT_COL ELSE 0 END) AS total_sum,MAX(CASE WHEN (INT_COL > 123 AND INT_COL < 599999) THEN INT_COL ELSE 0 END) AS total_avg FROM MyTable WHERE NO_INDEX_INT_COL > 5 AND NO_INDEX_INT_COL < 1499999  avgt    5   52473.093 ±    3630.290   us/op
BenchmarkQueries.query:·gc.alloc.rate.norm                             0     1500000  SELECT SUM(CASE WHEN (INT_COL > 123 AND INT_COL < 599999) THEN INT_COL ELSE 0 END) AS total_sum,MAX(CASE WHEN (INT_COL > 123 AND INT_COL < 599999) THEN INT_COL ELSE 0 END) AS total_avg FROM MyTable WHERE NO_INDEX_INT_COL > 5 AND NO_INDEX_INT_COL < 1499999  avgt    5  973148.139 ± 1518881.578    B/op
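
As a rough illustration of the two-level lookup described above, the sketch below keys an `EnumMap` by data type and then a `HashMap` by column name, so a cache hit does not need a key object. It is not the code from this PR: `TwoLevelCacheSketch`, its nested `DataType` enum (a stand-in for `FieldSpec.DataType`), and the `get`/`put` methods are hypothetical names chosen for the example.

```java
import java.util.EnumMap;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not the actual DataBlockCache implementation.
class TwoLevelCacheSketch {
  // Stand-in for FieldSpec.DataType (assumption: a plain enum of value types).
  enum DataType { INT, LONG, FLOAT, DOUBLE, STRING }

  // First level: data type. Second level: column name -> cached values array.
  private final Map<DataType, Map<String, Object>> valuesByType =
      new EnumMap<>(DataType.class);

  // Lookup path: no composite key is constructed for the lookup.
  Object get(DataType type, String column) {
    Map<String, Object> columns = valuesByType.get(type);
    return columns == null ? null : columns.get(column);
  }

  // Miss path: creates the per-type map lazily, then stores the values.
  void put(DataType type, String column, Object values) {
    valuesByType.computeIfAbsent(type, t -> new HashMap<>()).put(column, values);
  }

  public static void main(String[] args) {
    TwoLevelCacheSketch cache = new TwoLevelCacheSketch();
    cache.put(DataType.INT, "colA", new int[] {1, 2, 3});
    int[] hit = (int[]) cache.get(DataType.INT, "colA");
    System.out.println(hit.length);                          // prints 3
    System.out.println(cache.get(DataType.DOUBLE, "colA"));  // prints null
  }
}
```

The same column name can be cached under several data types without collisions, because the type selects the inner map before the name is consulted.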

@codecov-commenter

codecov-commenter commented Feb 5, 2022

Codecov Report

Merging #8140 (1ae6e8a) into master (a47af49) will increase coverage by 39.49%.
The diff coverage is 95.12%.

@@              Coverage Diff              @@
##             master    #8140       +/-   ##
=============================================
+ Coverage     30.69%   70.19%   +39.49%     
- Complexity        0     4302     +4302     
=============================================
  Files          1613     1624       +11     
  Lines         83952    84292      +340     
  Branches      12597    12635       +38     
=============================================
+ Hits          25768    59165    +33397     
+ Misses        55889    21041    -34848     
- Partials       2295     4086     +1791     
Flag Coverage Δ
integration1 ?
integration2 27.66% <82.92%> (-0.05%) ⬇️
unittests1 67.90% <95.12%> (?)
unittests2 14.21% <0.00%> (?)

Impacted Files Coverage Δ
...a/org/apache/pinot/core/common/DataBlockCache.java 91.42% <95.12%> (+9.19%) ⬆️
...pinot/minion/exception/TaskCancelledException.java 0.00% <0.00%> (-100.00%) ⬇️
...nverttorawindex/ConvertToRawIndexTaskExecutor.java 0.00% <0.00%> (-100.00%) ⬇️
...e/pinot/common/minion/MergeRollupTaskMetadata.java 0.00% <0.00%> (-94.74%) ⬇️
...plugin/segmentuploader/SegmentUploaderDefault.java 0.00% <0.00%> (-87.10%) ⬇️
.../transform/function/MapValueTransformFunction.java 0.00% <0.00%> (-85.30%) ⬇️
...ot/common/messages/RoutingTableRebuildMessage.java 0.00% <0.00%> (-81.82%) ⬇️
...verttorawindex/ConvertToRawIndexTaskGenerator.java 5.45% <0.00%> (-80.00%) ⬇️
...ache/pinot/common/lineage/SegmentLineageUtils.java 22.22% <0.00%> (-77.78%) ⬇️
...ore/startree/executor/StarTreeGroupByExecutor.java 0.00% <0.00%> (-77.78%) ⬇️
... and 1164 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@@ -109,12 +111,11 @@ public int getNumDocs() {
   * @return Array of int values
   */
  public int[] getIntValuesForSVColumn(String column) {
    ColumnTypePair key = new ColumnTypePair(column, FieldSpec.DataType.INT);
Contributor

@siddharthteotia siddharthteotia Feb 5, 2022

So the potential problem being fixed here is the per-call creation of a ColumnTypePair object, which leads to some heap/perf overhead?

Member Author

Exactly, about 8MB of these were allocated in 1s in a benchmark which only allocated ~25MB/s. It's one of the main sources of allocation.
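
To make the source of those allocations concrete, here is a minimal sketch of the composite-key pattern being replaced. It assumes a key class with value-based `equals`/`hashCode`; `CompositeKeySketch`, `Key`, and the nested `DataType` enum are hypothetical stand-ins, not Pinot's `ColumnTypePair` or `FieldSpec.DataType`.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the composite-key pattern; not Pinot's ColumnTypePair.
class CompositeKeySketch {
  enum DataType { INT, DOUBLE }

  // Value-based key combining the column name and the data type.
  static final class Key {
    final String column;
    final DataType type;

    Key(String column, DataType type) {
      this.column = column;
      this.type = type;
    }

    @Override
    public boolean equals(Object o) {
      if (!(o instanceof Key)) {
        return false;
      }
      Key other = (Key) o;
      return column.equals(other.column) && type == other.type;
    }

    @Override
    public int hashCode() {
      return 31 * column.hashCode() + type.ordinal();
    }
  }

  private final Map<Key, Object> values = new HashMap<>();

  // A fresh Key is constructed on every call, even when the entry is cached.
  Object get(String column, DataType type) {
    return values.get(new Key(column, type));
  }

  void put(String column, DataType type, Object v) {
    values.put(new Key(column, type), v);
  }

  public static void main(String[] args) {
    CompositeKeySketch cache = new CompositeKeySketch();
    cache.put("colA", DataType.INT, new int[] {7});
    System.out.println(((int[]) cache.get("colA", DataType.INT))[0]);  // prints 7
  }
}
```

Each key is small and short-lived, but one is created per lookup unless the JIT manages to eliminate it, so the total grows with the number of lookups a query performs.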

@@ -109,12 +111,11 @@ public int getNumDocs() {
   * @return Array of int values
   */
  public int[] getIntValuesForSVColumn(String column) {
    ColumnTypePair key = new ColumnTypePair(column, FieldSpec.DataType.INT);
    int[] intValues = (int[]) _valuesMap.get(key);
Contributor

@siddharthteotia siddharthteotia Feb 5, 2022

Previously there was a single level of indirection to get the corresponding values[] for a given ColumnTypePair. Now it's a Map<Type,Map<String,Object>> plus another function call, getValues().

So while we avoid the creation of the ColumnTypePair object, is it possible that the new code will add some perf overhead that negates any benefit of this PR?

Member Author

This has all been measured; let me put together benchmark results (you can see the combined effect of the set of changes in #8134).

Member Author

Note that an EnumMap lookup is an array access by ordinal, and the array is very small, so the cost of indirection here is very low.
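
To illustrate the point about array access by ordinal, the sketch below hand-rolls an ordinal-indexed lookup and checks it against `java.util.EnumMap`. It is a conceptual illustration only, not `EnumMap`'s source; `OrdinalLookupSketch` and its `DataType` enum are hypothetical names.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical illustration of an ordinal-indexed lookup; not EnumMap's source.
class OrdinalLookupSketch {
  // Stand-in for FieldSpec.DataType.
  enum DataType { INT, LONG, FLOAT, DOUBLE, STRING }

  // One slot per enum constant, indexed by ordinal().
  private final Object[] slots = new Object[DataType.values().length];

  // The whole lookup is a single bounds-checked array read.
  Object get(DataType type) {
    return slots[type.ordinal()];
  }

  void put(DataType type, Object value) {
    slots[type.ordinal()] = value;
  }

  public static void main(String[] args) {
    OrdinalLookupSketch manual = new OrdinalLookupSketch();
    Map<DataType, Object> enumMap = new EnumMap<>(DataType.class);

    manual.put(DataType.DOUBLE, "doubles");
    enumMap.put(DataType.DOUBLE, "doubles");

    // Both structures answer the same queries in the same way.
    System.out.println(manual.get(DataType.DOUBLE).equals(enumMap.get(DataType.DOUBLE)));  // prints true
    System.out.println(manual.get(DataType.INT) == enumMap.get(DataType.INT));             // prints true (both null)
  }
}
```

With only a handful of enum constants the backing array stays tiny, so the first-level lookup adds little beyond one extra array read.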

Contributor

@Jackie-Jiang Jackie-Jiang left a comment

LGTM

@richardstartin
Member Author

I will apply Jackie's suggestions before merging; please wait until I've done that (tomorrow).

@richardstartin richardstartin force-pushed the allocation-free-datablock-cache branch from 6321642 to 8cfedf9 Compare February 7, 2022 23:21