Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[multistage] support HAVING clause #9274

Merged
merged 4 commits into from
Sep 1, 2022

Conversation

walterddr
Copy link
Contributor

@walterddr walterddr commented Aug 24, 2022

Add support for HAVING clause for basic operators (=. !=, <, <=, >, >=)

  • adding FilterNode for intermediate stage
  • adding FilterOperator and operands since predicates are not actually functions.

TODO:
since intermediate stage treats all operands with data type. it is not suitable to convert it back to literal strings like the FilterContext is doing in the pinot server leaf stage. Thus we need to unify

  1. the Predicate/FilterContext and the intermediate stage FilterOperator/Operands
  2. the TransformFunction vs the PostAggregate ScalarFunction/FunctionInvoker

@codecov-commenter
Copy link

codecov-commenter commented Aug 24, 2022

Codecov Report

Merging #9274 (eac41f8) into master (718f41f) will increase coverage by 0.98%.
The diff coverage is 81.06%.

@@             Coverage Diff              @@
##             master    #9274      +/-   ##
============================================
+ Coverage     68.75%   69.74%   +0.98%     
- Complexity     4755     5046     +291     
============================================
  Files          1859     1873      +14     
  Lines         99129    99721     +592     
  Branches      15077    15167      +90     
============================================
+ Hits          68161    69546    +1385     
+ Misses        26076    25237     -839     
- Partials       4892     4938      +46     
Flag Coverage Δ
integration1 26.08% <0.00%> (-0.34%) ⬇️
integration2 24.81% <0.00%> (?)
unittests1 67.10% <81.06%> (+0.01%) ⬆️
unittests2 15.34% <81.06%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...e/pinot/query/runtime/operator/FilterOperator.java 64.00% <64.00%> (ø)
...ry/runtime/operator/operands/TransformOperand.java 70.00% <70.00%> (ø)
...query/runtime/operator/operands/FilterOperand.java 76.78% <76.78%> (ø)
...ot/query/runtime/executor/WorkerQueryExecutor.java 94.73% <100.00%> (+3.82%) ⬆️
...inot/query/runtime/operator/TransformOperator.java 80.00% <100.00%> (-7.15%) ⬇️
...ery/runtime/operator/operands/FunctionOperand.java 100.00% <100.00%> (ø)
...uery/runtime/operator/operands/LiteralOperand.java 100.00% <100.00%> (ø)
...ry/runtime/operator/operands/ReferenceOperand.java 100.00% <100.00%> (ø)
...g/apache/pinot/server/api/resources/ErrorInfo.java 0.00% <0.00%> (-100.00%) ⬇️
...data/manager/realtime/DefaultSegmentCommitter.java 0.00% <0.00%> (-80.00%) ⬇️
... and 190 more

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@walterddr walterddr changed the title [multistage] support having [multistage] support HAVING clause Aug 24, 2022
@@ -57,6 +57,8 @@ protected Object[][] provideQueries() {
+ " GROUP BY a.col1, a.col2"},
new Object[]{"SELECT a.col1, AVG(b.col3) FROM a JOIN b ON a.col1 = b.col2 "
+ " WHERE a.col3 >= 0 AND a.col2 = 'a' AND b.col3 < 0 GROUP BY a.col1"},
new Object[]{"SELECT a.col1, COUNT(*) FROM a WHERE a.col3 >= 0 AND a.col2 = 'a' GROUP BY a.col1"
+ " HAVING COUNT(*) > 10"},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest adding one test each for each supported comparison operator and may be also include a different aggregation function



public class FilterOperator extends BaseOperator<TransferableBlock> {
private static final String EXPLAIN_NAME = "HAVING_FILTER";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is giving the impression that it is a hard-coded plan. Currently, the multi-stage engine plan does not support filter in the intermediary stages and wherever possible, filter is pushed down to the leaf stage.

With this PR, we are implementing a FilterOperator that can be used in any of the stages (above leaf), but we are treating it as specifically for executing HAVING filter which may not be the case in future for generic queries.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch. actually this should just be filter operator. since not only HAVING could generate a intermediary filter node. but also non-equality JOIN clauses like SELECT * FROM a JOIN b on a.key = b.key WHERE a.val > b.val.
it would be a rather inefficient plan to ship all equality over though

@siddharthteotia siddharthteotia merged commit 8e17e63 into apache:master Sep 1, 2022
@Jackie-Jiang Jackie-Jiang added the multi-stage Related to the multi-stage query engine label Sep 13, 2022
@walterddr walterddr deleted the pr_having branch December 6, 2023 16:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
multi-stage Related to the multi-stage query engine
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants