Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix filtered aggregation when it is mixed with regular aggregation #8172

Conversation

Jackie-Jiang
Copy link
Contributor

When a query has the same mixed regular aggregation and filtered aggregation, the index of the aggregation function is not maintained correctly.
This PR contains the following changes:

  • Fix QueryContext to maintain the correct index
  • Fix PostAggregationHandler to always use the filtered aggregation index map
  • Add tests for QueryContext and queries

@Jackie-Jiang Jackie-Jiang requested a review from atris February 8, 2022 23:14
Copy link
Contributor

@atris atris left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fix.

IMO, the fix is pretty contained and can be done in 50ish LOC. The majority of the changes in the PR are around whitespace and renaming, which are a separate discussion and should be done in a separate PR.

} else if (function.getType() == FunctionContext.Type.TRANSFORM
&& function.getFunctionName().equalsIgnoreCase("filter")) {
return new ColumnValueExtractor(
_filteredAggregationsIndexMap.get(Pair.of(function, null)) + _numGroupByExpressions);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main change here is that regular aggregations also refer to the filtered aggregation functions index map, instead of the standard index map. The remaining changes seem whitespace/naming changes?

Assert.assertEquals(firstSetRow[j], secondSetRow[j]);
}
private void testQuery(String filterQuery, String nonFilterQuery) {
List<Object[]> filterQueryResults = getBrokerResponseForSqlQuery(filterQuery).getResultTable().getRows();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure if this method has the same intent as the original method -- the original method had extra checks around row lengths and number of rows, and the result schemas. Can we please revert this? This anyways does not seem related to the problem the PR is solving.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added the data schema check in #8184
The rows check remains the same, where assertEquals(Object[] a, Object[] b) will verify the array size and array content

@Jackie-Jiang
Copy link
Contributor Author

Jackie-Jiang commented Feb 9, 2022

@atris Most of the whitespace changes are actually done by the IDE auto-formatting. Let me separate them into a separate PR so that the change for the bug fix is more clear.

The main fix is in 2 parts:

  • Index calculation in QueryContext
  • Index lookup in PostAggregationHandler

@Jackie-Jiang Jackie-Jiang force-pushed the fix_filtered_aggregation_query_context branch 3 times, most recently from 6244553 to f7e5b0e Compare February 10, 2022 20:54
@Jackie-Jiang
Copy link
Contributor Author

@atris Rebased the re-factor PR, and addressed the comments. Can you please take another look?

@codecov-commenter
Copy link

codecov-commenter commented Feb 10, 2022

Codecov Report

Merging #8172 (d75e78e) into master (b404e4a) will decrease coverage by 43.73%.
The diff coverage is 77.77%.

❗ Current head d75e78e differs from pull request most recent head f7e5b0e. Consider uploading reports for the commit f7e5b0e to get more accurate results

Impacted file tree graph

@@              Coverage Diff              @@
##             master    #8172       +/-   ##
=============================================
- Coverage     71.33%   27.60%   -43.74%     
=============================================
  Files          1623     1612       -11     
  Lines         84314    83952      -362     
  Branches      12640    12600       -40     
=============================================
- Hits          60146    23174    -36972     
- Misses        20047    58620    +38573     
+ Partials       4121     2158     -1963     
Flag Coverage Δ
integration1 ?
integration2 27.60% <77.77%> (+0.13%) ⬆️
unittests1 ?
unittests2 ?

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...rg/apache/pinot/core/plan/AggregationPlanNode.java 51.00% <0.00%> (-39.91%) ⬇️
...inot/core/query/reduce/PostAggregationHandler.java 84.70% <22.22%> (-7.16%) ⬇️
...pinot/core/query/request/context/QueryContext.java 92.38% <93.02%> (-5.20%) ⬇️
.../java/org/apache/pinot/spi/utils/BooleanUtils.java 0.00% <0.00%> (-100.00%) ⬇️
...ava/org/apache/pinot/spi/config/table/FSTType.java 0.00% <0.00%> (-100.00%) ⬇️
...ava/org/apache/pinot/spi/data/MetricFieldSpec.java 0.00% <0.00%> (-100.00%) ⬇️
...va/org/apache/pinot/spi/utils/BigDecimalUtils.java 0.00% <0.00%> (-100.00%) ⬇️
...java/org/apache/pinot/common/tier/TierFactory.java 0.00% <0.00%> (-100.00%) ⬇️
...a/org/apache/pinot/spi/config/table/TableType.java 0.00% <0.00%> (-100.00%) ⬇️
.../org/apache/pinot/spi/data/DimensionFieldSpec.java 0.00% <0.00%> (-100.00%) ⬇️
... and 1176 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a899110...f7e5b0e. Read the comment docs.

Copy link
Contributor

@atris atris left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Please address the one comment remaining

}

@Test
public void testMixedAggregations() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets rename this to testMixedAggregationsOfSameType, since other tests in this class contain tests with mixed aggregations, just of different types

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@Jackie-Jiang Jackie-Jiang force-pushed the fix_filtered_aggregation_query_context branch from f7e5b0e to abb28f1 Compare February 15, 2022 17:34
@Jackie-Jiang Jackie-Jiang merged commit 8042408 into apache:master Feb 15, 2022
@Jackie-Jiang Jackie-Jiang deleted the fix_filtered_aggregation_query_context branch February 15, 2022 19:50
xiangfu0 pushed a commit to xiangfu0/pinot that referenced this pull request Feb 23, 2022
…pache#8172)

When a query has the same mixed regular aggregation and filtered aggregation, the index of the aggregation function is not maintained correctly.
This PR contains the following changes:
- Fix `QueryContext` to maintain the correct index
- Fix `PostAggregationHandler` to always use the filtered aggregation index map
- Add tests for `QueryContext` and queries
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants