prune unselected THEN statements in CaseTransformFunction #8138
size arrays to the block size
do not eagerly format exception messages
construct BigDecimal only once in LiteralTransformFunction
Codecov Report
@@ Coverage Diff @@
## master #8138 +/- ##
=============================================
- Coverage 71.39% 30.67% -40.73%
=============================================
Files 1624 1613 -11
Lines 84198 83886 -312
Branches 12602 12584 -18
=============================================
- Hits 60116 25734 -34382
- Misses 19970 55859 +35889
+ Partials 4112 2293 -1819
@@ -58,6 +57,8 @@
private List<TransformFunction> _whenStatements = new ArrayList<>();
private List<TransformFunction> _elseThenStatements = new ArrayList<>();
private boolean[] _selections;
(nit) Add a brief comment explaining what _selections is. I am guessing it tracks whether a statement is selected or not?
@@ -102,8 +104,9 @@ private TransformResultMetadata calculateResultMetadata() {
for (int i = 0; i < numThenStatements; i++) {
TransformFunction thenStatement = _elseThenStatements.get(i + 1);
TransformResultMetadata thenStatementResultMetadata = thenStatement.getResultMetadata();
Preconditions.checkState(thenStatementResultMetadata.isSingleValue(),
String.format("Unsupported multi-value expression in the THEN clause of index: %d", i));
if (!thenStatementResultMetadata.isSingleValue()) {
Why do this instead of Preconditions.checkState()?
Probably to avoid String.format being evaluated unnecessarily?
Otherwise you call String.format("Unsupported multi-value expression in the THEN clause of index: %d", i) every time a function is initialised, even when the check passes, which will jump out at you fairly quickly in an allocation profile.
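The difference between the two forms can be sketched as follows. This is a minimal illustration rather than the PR's code: `checkState` is a hypothetical stand-in for a `checkState(boolean, Object)` style helper, and `formatCalls` exists only to count how often the message gets built.

```java
final class LazyMessageSketch {
  // Counts how many times the error message is formatted (illustration only).
  static int formatCalls = 0;

  // Hypothetical stand-in for a checkState(boolean, Object) style helper.
  static void checkState(boolean expression, Object errorMessage) {
    if (!expression) {
      throw new IllegalStateException(String.valueOf(errorMessage));
    }
  }

  static String message(int index) {
    formatCalls++;
    return String.format("Unsupported multi-value expression in the THEN clause of index: %d", index);
  }

  // Eager: the message argument is evaluated before the call, even when the check passes.
  static void eagerCheck(boolean singleValue, int index) {
    checkState(singleValue, message(index));
  }

  // Lazy: the message is only built on the failure path.
  static void lazyCheck(boolean singleValue, int index) {
    if (!singleValue) {
      throw new IllegalStateException(message(index));
    }
  }

  public static void main(String[] args) {
    for (int i = 0; i < 1000; i++) {
      eagerCheck(true, i);
    }
    System.out.println("eager formats: " + formatCalls); // prints "eager formats: 1000"
    formatCalls = 0;
    for (int i = 0; i < 1000; i++) {
      lazyCheck(true, i);
    }
    System.out.println("lazy formats: " + formatCalls); // prints "lazy formats: 0"
  }
}
```

Guava's template overloads of `Preconditions.checkState`, which take a `%s` template plus arguments, are another way to defer formatting to the failure path.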
numSelections++;
}
}
_numSelections = numSelections;
Should we use a bitmap instead of a boolean array?
Assuming you have fewer than 64 cases (which would already be a large CASE statement), all updates to the bitmap would be to the same word, which creates a data dependency in the loop and slows it down.
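The two layouts being compared look roughly like this. It is a sketch under assumptions: the class, method, and parameter names are hypothetical, and it makes no measurement of its own; it only shows where the single-word dependency the reply describes comes from.

```java
final class SelectionFlagsSketch {
  // boolean[] variant: each store targets the array element of the selected case.
  static boolean[] markWithArray(int[] selectedCase, int numCases) {
    boolean[] selections = new boolean[numCases];
    for (int caseId : selectedCase) {
      selections[caseId] = true;
    }
    return selections;
  }

  // Bitmap variant: with fewer than 64 cases every update is a read-modify-write
  // of the same long, so each iteration depends on the result of the previous one.
  static long markWithBitmap(int[] selectedCase) {
    long bitmap = 0L;
    for (int caseId : selectedCase) {
      bitmap |= 1L << caseId;
    }
    return bitmap;
  }
}
```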
}
int numWhenStatements = _whenStatements.size();
for (int i = 0; i < numWhenStatements; i++) {
for (int i = numWhenStatements - 1; i >= 0; i--) {
Why does the loop need to be reversed?
Allows branch-free setting of the highest priority case below (note that the statement numbers increase)
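One way to picture why reverse iteration helps is the sketch below. It is an assumption-laden illustration, not the PR's loop: `conditions[i][j]` is assumed to be 1 when WHEN clause i holds for document j and 0 otherwise, and the arithmetic select is just one possible branch-free formulation. Because the clauses are visited from last to first, the final write for each document comes from the lowest-numbered matching clause, which is the one with the highest priority.

```java
final class ReverseSelectSketch {
  // conditions[i][j] is 1 when WHEN clause i holds for document j, else 0 (assumed encoding).
  // Returns, per document, 0 for ELSE or (i + 1) for the first matching WHEN clause.
  static int[] select(int[][] conditions, int numDocs) {
    int[] selected = new int[numDocs];
    for (int i = conditions.length - 1; i >= 0; i--) {
      int[] c = conditions[i];
      for (int j = 0; j < numDocs; j++) {
        // Arithmetic select: overwrite when c[j] == 1, keep the old value when c[j] == 0.
        selected[j] = c[j] * (i + 1) + (1 - c[j]) * selected[j];
      }
    }
    return selected;
  }
}
```

For example, with two WHEN clauses whose conditions are `{1, 0, 0}` and `{1, 1, 0}`, the first document matches both clauses and ends up with statement 1, the second gets statement 2, and the third falls through to ELSE (0).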
int[] intValues = transformFunction.transformToIntValuesSV(projectionBlock);
if (_numSelections == 1) {
System.arraycopy(intValues, 0, _intResults, 0, numDocs);
} else {
Is checking for _numSelections == 1 and copying really needed?
The alternative is the loop below, which handles the generic case and is a lot slower.
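The two paths can be contrasted in a small sketch. The names below (`copyResults`, `selected`, `statementIndex`) are hypothetical, and the generic loop is an assumed shape rather than the PR's actual code: when only one statement is selected for the whole block, every document takes that statement's value, so a bulk copy replaces the per-document check.

```java
final class SingleSelectionSketch {
  // Writes the values of one statement into results for the documents that selected it.
  static void copyResults(int[] values, int[] selected, int statementIndex,
      int numSelections, int[] results, int numDocs) {
    if (numSelections == 1) {
      // Only one statement is selected in this block, so all documents take its value.
      System.arraycopy(values, 0, results, 0, numDocs);
    } else {
      // Generic case: check each document's selection before writing.
      for (int j = 0; j < numDocs; j++) {
        if (selected[j] == statementIndex) {
          results[j] = values[j];
        }
      }
    }
  }
}
```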
This PR prunes evaluation of THEN statements that are not selected anywhere in a block. This can happen when, e.g., the conditions are applied to the values of a sorted column, or when a CASE condition rarely holds and the THEN statement is a fallback.
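The pruning idea can be sketched independently of the transform-function classes. In this hypothetical illustration each statement is modelled as a `Supplier<int[]>` producing one value per document, index 0 stands for ELSE and index i + 1 for THEN clause i, and `selected[j]` is the statement chosen for document j; none of these names come from the PR.

```java
import java.util.List;
import java.util.function.Supplier;

final class PruneThenSketch {
  // Marks which statements are selected by at least one document in the block.
  static boolean[] selections(int[] selected, int numStatements) {
    boolean[] selections = new boolean[numStatements];
    for (int s : selected) {
      selections[s] = true;
    }
    return selections;
  }

  // Evaluates only the statements that some document selected; the rest are pruned.
  static int[] evaluate(int[] selected, List<Supplier<int[]>> statements) {
    boolean[] selections = selections(selected, statements.size());
    int[] results = new int[selected.length];
    for (int s = 0; s < selections.length; s++) {
      if (!selections[s]) {
        continue; // pruned: this statement is never evaluated for the block
      }
      int[] values = statements.get(s).get();
      for (int j = 0; j < selected.length; j++) {
        if (selected[j] == s) {
          results[j] = values[j];
        }
      }
    }
    return results;
  }
}
```

If the statements are expensive expressions, a block in which every document selects the same branch pays for only that branch.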
This also includes some reductions in allocations in LiteralTransformFunction and LogicalOperatorTransformFunction, which are often used with CASE statements. This speeds up query evaluation even when there are no branches to prune, because selecting the cases is slightly more efficient and some unnecessary allocations are removed:
master
branch