-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add IGNORE NULLS option to FIRST_VALUE and LAST_VALUE window functions #14264
Changes from all commits
958c77a
52987ec
d2c24c2
0160dae
103051b
4867f2c
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,54 @@ | ||
/** | ||
* Licensed to the Apache Software Foundation (ASF) under one | ||
* or more contributor license agreements. See the NOTICE file | ||
* distributed with this work for additional information | ||
* regarding copyright ownership. The ASF licenses this file | ||
* to you under the Apache License, Version 2.0 (the | ||
* "License"); you may not use this file except in compliance | ||
* with the License. You may obtain a copy of the License at | ||
* | ||
* http://www.apache.org/licenses/LICENSE-2.0 | ||
* | ||
* Unless required by applicable law or agreed to in writing, | ||
* software distributed under the License is distributed on an | ||
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY | ||
* KIND, either express or implied. See the License for the | ||
* specific language governing permissions and limitations | ||
* under the License. | ||
*/ | ||
package org.apache.pinot.common.collections; | ||
|
||
import java.util.AbstractList; | ||
|
||
|
||
/** | ||
* An immutable list like the one returned by {@link java.util.Collections#nCopies(int, Object)}, but with two values | ||
* (that are not interleaved) instead of a single one. Useful for avoiding unnecessary allocations. | ||
*/ | ||
public class DualValueList<E> extends AbstractList<E> { | ||
|
||
private final E _firstValue; | ||
private final E _secondValue; | ||
private final int _firstCount; | ||
private final int _totalSize; | ||
|
||
public DualValueList(E firstValue, int firstCount, E secondValue, int secondCount) { | ||
_firstValue = firstValue; | ||
_firstCount = firstCount; | ||
_secondValue = secondValue; | ||
_totalSize = firstCount + secondCount; | ||
} | ||
|
||
@Override | ||
public E get(int index) { | ||
if (index < 0 || index >= _totalSize) { | ||
throw new IndexOutOfBoundsException(index); | ||
} | ||
return index < _firstCount ? _firstValue : _secondValue; | ||
} | ||
|
||
@Override | ||
public int size() { | ||
return _totalSize; | ||
} | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -326,6 +326,31 @@ public void testAggregateServerReturnFinalResult(boolean useMultiStageQueryEngin | |
assertTrue(response.get("resultTable").get("rows").get(0).get(0).isNull()); | ||
} | ||
|
||
@Test | ||
public void testWindowFunctionIgnoreNulls() | ||
throws Exception { | ||
// Window functions are only supported in the multi-stage query engine | ||
setUseMultiStageQueryEngine(true); | ||
String sqlQuery = | ||
"SELECT salary, LAST_VALUE(salary) IGNORE NULLS OVER (ORDER BY DaysSinceEpoch) AS gapfilledSalary from " | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Does it work with bounded preceding/following? I remember running into some exception when trying it out There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, it works with bounded preceding / following as well.
Was it something like this error during query planning:
Interestingly, it looks like Calcite throws this error if the window function's input column is not nullable in the table schema (i.e., if There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I guess we want to document it. Users might enable table level nullability (v1 engine nullability) and expect v2 engine to pick it up There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yeah, good point, I'll make sure to add a note about this to the documentation. I plan to raise one consolidated documentation PR with changes from #14273 and here. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't think we should delegate this into something we document. We should also report the issue in Calcite Jira/email list and/or create a PR to fix it ourselfs. Same happened with the reserved keyword PR. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, actually this does look like a legitimate bug in Calcite on taking a second look. The issue is that the inferred return type for the window function in the parsed Interestingly, the same error and issue also occurs when the I'll create a bug tracking Jira in the Calcite project and link it here. |
||
+ "mytable"; | ||
JsonNode response = postQuery(sqlQuery); | ||
assertNoError(response); | ||
|
||
// Check if the LAST_VALUE window function with IGNORE NULLS has effectively gap-filled the salary values | ||
Integer lastSalary = null; | ||
JsonNode rows = response.get("resultTable").get("rows"); | ||
for (int i = 0; i < rows.size(); i++) { | ||
JsonNode row = rows.get(i); | ||
if (!row.get(0).isNull()) { | ||
assertEquals(row.get(0).asInt(), row.get(1).asInt()); | ||
lastSalary = row.get(0).asInt(); | ||
} else { | ||
assertEquals(lastSalary, row.get(1).numberValue()); | ||
} | ||
} | ||
} | ||
|
||
@Override | ||
protected void overrideBrokerConf(PinotConfiguration brokerConf) { | ||
brokerConf.setProperty(CommonConstants.Broker.CONFIG_OF_BROKER_QUERY_ENABLE_NULL_HANDLING, "true"); | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -32,6 +32,7 @@ | |
import org.apache.calcite.sql.SqlOperator; | ||
import org.apache.calcite.sql.SqlOperatorTable; | ||
import org.apache.calcite.sql.SqlSyntax; | ||
import org.apache.calcite.sql.fun.SqlLeadLagAggFunction; | ||
import org.apache.calcite.sql.fun.SqlStdOperatorTable; | ||
import org.apache.calcite.sql.type.OperandTypes; | ||
import org.apache.calcite.sql.type.ReturnTypes; | ||
|
@@ -134,10 +135,6 @@ public static PinotOperatorTable instance() { | |
SqlStdOperatorTable.MODE, | ||
SqlStdOperatorTable.MIN, | ||
SqlStdOperatorTable.MAX, | ||
SqlStdOperatorTable.LAST_VALUE, | ||
SqlStdOperatorTable.FIRST_VALUE, | ||
SqlStdOperatorTable.LEAD, | ||
SqlStdOperatorTable.LAG, | ||
SqlStdOperatorTable.AVG, | ||
SqlStdOperatorTable.STDDEV_POP, | ||
SqlStdOperatorTable.COVAR_POP, | ||
|
@@ -152,7 +149,17 @@ public static PinotOperatorTable instance() { | |
SqlStdOperatorTable.RANK, | ||
SqlStdOperatorTable.ROW_NUMBER, | ||
|
||
// WINDOW Functions (non-aggregate) | ||
SqlStdOperatorTable.LAST_VALUE, | ||
SqlStdOperatorTable.FIRST_VALUE, | ||
// TODO: Replace these with SqlStdOperatorTable.LEAD and SqlStdOperatorTable.LAG when the function implementations | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Does this mean There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Nope, using I'd initially gone with simply throwing a runtime exception in Pinot's |
||
// are updated to support the IGNORE NULLS option. | ||
PinotLeadWindowFunction.INSTANCE, | ||
PinotLagWindowFunction.INSTANCE, | ||
|
||
// SPECIAL OPERATORS | ||
SqlStdOperatorTable.IGNORE_NULLS, | ||
SqlStdOperatorTable.RESPECT_NULLS, | ||
SqlStdOperatorTable.BETWEEN, | ||
SqlStdOperatorTable.SYMMETRIC_BETWEEN, | ||
SqlStdOperatorTable.NOT_BETWEEN, | ||
|
@@ -372,4 +379,30 @@ public void lookupOperatorOverloads(SqlIdentifier opName, @Nullable SqlFunctionC | |
public List<SqlOperator> getOperatorList() { | ||
return _operatorList; | ||
} | ||
|
||
private static class PinotLeadWindowFunction extends SqlLeadLagAggFunction { | ||
static final SqlOperator INSTANCE = new PinotLeadWindowFunction(); | ||
|
||
public PinotLeadWindowFunction() { | ||
super(SqlKind.LEAD); | ||
} | ||
|
||
@Override | ||
public boolean allowsNullTreatment() { | ||
return false; | ||
} | ||
} | ||
|
||
private static class PinotLagWindowFunction extends SqlLeadLagAggFunction { | ||
static final SqlOperator INSTANCE = new PinotLagWindowFunction(); | ||
|
||
public PinotLagWindowFunction() { | ||
super(SqlKind.LAG); | ||
} | ||
|
||
@Override | ||
public boolean allowsNullTreatment() { | ||
return false; | ||
} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we divide this test method into two different ones?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I follow your suggestion - there's a single query here.