Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[multistage] Derive SUM return type to be PostgreSQL compatible #11151

Merged
merged 3 commits into from
Aug 10, 2023

Conversation

xiangfu0
Copy link
Contributor

@xiangfu0 xiangfu0 commented Jul 21, 2023

Derive sum return type for v2.
Based on https://www.postgresql.org/docs/current/functions-aggregate.html#FUNCTIONS-AGGREGATE-TABLE

for query:

SELECT SUM(DestAirportSeqID) FROM airlineStats

Before:
image

After:
image

@codecov-commenter
Copy link

codecov-commenter commented Jul 21, 2023

Codecov Report

Merging #11151 (cd72e87) into master (4849093) will increase coverage by 0.11%.
Report is 15 commits behind head on master.
The diff coverage is 0.00%.

@@            Coverage Diff             @@
##           master   #11151      +/-   ##
==========================================
+ Coverage    0.00%    0.11%   +0.11%     
==========================================
  Files        2214     2231      +17     
  Lines      119705   120181     +476     
  Branches    18163    18227      +64     
==========================================
+ Hits            0      137     +137     
- Misses     119705   120024     +319     
- Partials        0       20      +20     
Flag Coverage Δ
integration1temurin11 0.00% <0.00%> (ø)
integration1temurin17 0.00% <0.00%> (ø)
integration1temurin20 0.00% <0.00%> (ø)
integration2temurin11 0.00% <0.00%> (ø)
integration2temurin17 0.00% <0.00%> (ø)
integration2temurin20 0.00% <0.00%> (ø)
unittests1temurin11 0.00% <0.00%> (ø)
unittests1temurin17 0.00% <0.00%> (ø)
unittests1temurin20 0.00% <0.00%> (ø)
unittests2temurin11 0.11% <0.00%> (?)
unittests2temurin17 0.11% <0.00%> (?)
unittests2temurin20 0.11% <0.00%> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Changed Coverage Δ
.../java/org/apache/pinot/query/QueryEnvironment.java 0.00% <0.00%> (ø)
...n/java/org/apache/pinot/query/type/TypeSystem.java 0.00% <0.00%> (ø)

... and 31 files with indirect coverage changes

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@xiangfu0 xiangfu0 force-pushed the fix-sum-type-v2 branch 5 times, most recently from dc7b159 to 8d4c33e Compare July 23, 2023 07:56
@@ -21,7 +21,7 @@ where
)
and ps_availqty > (
select
0.5 * sum(l_quantity)
CAST(0.5 * SUM(l_quantity) AS DECIMAL(19, 1))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the problem with this is b/c the 0.5 has DECIMAL value and it is a Literal
this will cause Calcite to explicitly hoist the SUM results by a specific precision and scale.

no one will actually use this in practices, and i suggest we disable this test until further fix can be put in place?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue here is for join schema mismatch from both side, one is ps_availqty the type is decimal(19, 1), the other side is 0.5 * SUM(l_quantity), which is decimal(21,1).

@@ -313,22 +313,78 @@
},
{
"description": "Correlated join",
"sql": "EXPLAIN PLAN FOR SELECT a.col1 FROM a WHERE a.col4 > (SELECT 0.5 * SUM(b.col3) FROM b WHERE b.col2 = a.col2 AND b.col1 = a.col1)",
"sql": "EXPLAIN PLAN FOR SELECT a.col1 FROM a WHERE a.col4 > (SELECT CAST(0.5 * SUM(b.col3) AS DECIMAL(19, 1)) FROM b WHERE b.col2 = a.col2 AND b.col1 = a.col1)",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cast as double here and put a TODO?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This query failed due to a de-correlation issue for type mismatch of DECIMAL(19, 1) and ``DECIMAL(21, 1)`


java.lang.AssertionError: Cannot add expression of different type to set:
set type is RecordType(VARCHAR NOT NULL col1, VARCHAR NOT NULL col2, INTEGER NOT NULL col3, DECIMAL(1000, 0) NOT NULL col4, BOOLEAN NOT NULL col5, INTEGER NOT NULL col6, BIGINT NOT NULL ts, DECIMAL(21, 1) EXPR$0) NOT NULL
expression type is RecordType(VARCHAR NOT NULL col1, VARCHAR NOT NULL col2, INTEGER NOT NULL col3, DECIMAL(1000, 0) NOT NULL col4, BOOLEAN NOT NULL col5, INTEGER NOT NULL col6, BIGINT NOT NULL ts, DECIMAL(19, 1) EXPR$0) NOT NULL
set is rel#6543:LogicalCorrelate.(left=HepRelVertex#6533,right=HepRelVertex#6542,correlation=$cor0,joinType=left,requiredColumns={0, 1})
expression is LogicalProject(col1=[$0], col2=[$1], col3=[$2], col4=[$3], col5=[$4], col6=[$5], ts=[$6], EXPR$0=[*(0.5:DECIMAL(2, 1), $7)])
  LogicalCorrelate(correlation=[$cor0], joinType=[left], requiredColumns=[{0, 1}])
    LogicalTableScan(table=[[a]])
    LogicalAggregate(group=[{}], agg#0=[SUM($0)])
      LogicalProject(col3=[$2])
        LogicalFilter(condition=[AND(=($1, $cor0.col2), =($0, $cor0.col1))])
          LogicalTableScan(table=[[b]])


	at org.apache.calcite.plan.RelOptUtil.verifyTypeEquivalence(RelOptUtil.java:391)
	at org.apache.calcite.plan.hep.HepRuleCall.transformTo(HepRuleCall.java:60)
	at org.apache.calcite.plan.RelOptRuleCall.transformTo(RelOptRuleCall.java:269)
	at org.apache.calcite.plan.RelOptRuleCall.transformTo(RelOptRuleCall.java:284)
	at org.apache.calcite.sql2rel.RelDecorrelator$AdjustProjectForCountAggregateRule.onMatch2(RelDecorrelator.java:2703)
	at org.apache.calcite.sql2rel.RelDecorrelator$AdjustProjectForCountAggregateRule.onMatch(RelDecorrelator.java:2610)
	at org.apache.calcite.plan.AbstractRelOptPlanner.fireRule(AbstractRelOptPlanner.java:337)
	at org.apache.calcite.plan.hep.HepPlanner.applyRule(HepPlanner.java:556)
	at org.apache.calcite.plan.hep.HepPlanner.applyRules(HepPlanner.java:420)
	at org.apache.calcite.plan.hep.HepPlanner.executeRuleInstance(HepPlanner.java:243)
	at org.apache.calcite.plan.hep.HepInstruction$RuleInstance$State.execute(HepInstruction.java:178)
	at org.apache.calcite.plan.hep.HepPlanner.lambda$executeProgram$0(HepPlanner.java:211)
	at com.google.common.collect.ImmutableList.forEach(ImmutableList.java:422)
	at org.apache.calcite.plan.hep.HepPlanner.executeProgram(HepPlanner.java:210)
	at org.apache.calcite.plan.hep.HepProgram$State.execute(HepProgram.java:118)
	at org.apache.calcite.plan.hep.HepPlanner.executeProgram(HepPlanner.java:205)
	at org.apache.calcite.plan.hep.HepPlanner.findBestExp(HepPlanner.java:191)
	at org.apache.calcite.sql2rel.RelDecorrelator.decorrelate(RelDecorrelator.java:291)
	at org.apache.calcite.sql2rel.RelDecorrelator.decorrelateQuery(RelDecorrelator.java:230)
	at org.apache.pinot.query.QueryEnvironment.decorrelateIfNeeded(QueryEnvironment.java:282)
	at org.apache.pinot.query.QueryEnvironment.compileQuery(QueryEnvironment.java:275)
	at org.apache.pinot.query.QueryEnvironment.explainQuery(QueryEnvironment.java:202)
	at org.apache.pinot.query.QueryEnvironment.explainQuery(QueryEnvironment.java:220)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Internally the max precision and scale is 1000, per: https://github.com/apache/pinot/blob/master/pinot-query-planner/src/main/java/org/apache/pinot/query/type/TypeSystem.java#L32

So the intermediate decimal results type may go pretty wild.

@xiangfu0 xiangfu0 requested a review from walterddr July 24, 2023 07:36
@xiangfu0
Copy link
Contributor Author

We still need to find a way to solve this decimal type hoist problem.
And for now we need to make sure the error message is clear and explicit type casting can be applied correctly.

@walterddr
Copy link
Contributor

based on https://www.postgresql.org/docs/8.2/functions-aggregate.html and https://learn.microsoft.com/en-us/sql/t-sql/functions/sum-transact-sql?view=sql-server-ver16 it doesn't seem to have a consistent requirement for type hoisting in SQL standards.

@Jackie-Jiang Jackie-Jiang added bugfix multi-stage Related to the multi-stage query engine labels Jul 25, 2023
Copy link
Contributor

@vvivekiyer vvivekiyer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

// SUM result will overflow INTEGER
query = "SELECT SUM(CAST(ActualElapsedTime AS DOUBLE)) FROM mytable";

testQuery(query);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a runtime test that verifies SUM (and other aggregations) on BIG_DECIMAL column? I remember testing this long back and there were some issues. Not sure if they've been fixed now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not really. given the fact that we are currently working around the type-hoisting behavior to not do any automatic conversion. it should be same type in same type out

@Jackie-Jiang
Copy link
Contributor

For reference, here is the latest PostgreSQL document for result types: https://www.postgresql.org/docs/current/functions-aggregate.html#FUNCTIONS-AGGREGATE-TABLE
The difference between this and old one (changed since 9.0) is the result type for real (float)

@walterddr
Copy link
Contributor

i think what's missing is that

  • calcite supports type hoisting in the fine-grain precision/scale fashion, and any decimal literal used in SUM or other agg function can affect the behavior of type hoisting.
  • in Postgres the type conversion doesn't depend on the literals, only the operand types.

one solution I can think of is to register our own behavior by adding our own OperatorTable and our own return type inference

this way we don't need to worry about the TypeFactory/TypeSystem overrides which intricately went through Calcite internal where we don't have direct control.

@xiangfu0 xiangfu0 changed the title [multistage] [bugfix] Derive SUM return type [multistage] Derive SUM return type Aug 9, 2023
@xiangfu0 xiangfu0 force-pushed the fix-sum-type-v2 branch 3 times, most recently from 621b47b to b6c9f2f Compare August 10, 2023 08:44
@xiangfu0 xiangfu0 changed the title [multistage] Derive SUM return type [multistage] Derive SUM return type to be PostgreSQL compatible Aug 10, 2023
@xiangfu0 xiangfu0 force-pushed the fix-sum-type-v2 branch 2 times, most recently from bede511 to 292099d Compare August 10, 2023 10:14
Copy link
Contributor

@walterddr walterddr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm. thank you for fixing this issue

@xiangfu0 xiangfu0 merged commit 3bdc0e1 into apache:master Aug 10, 2023
@xiangfu0 xiangfu0 deleted the fix-sum-type-v2 branch August 10, 2023 20:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bugfix multi-stage Related to the multi-stage query engine
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants