Add IGNORE NULLS option to FIRST_VALUE and LAST_VALUE window functions #14264

yashmayya · 2024-10-21T16:45:43Z

The SQL standard defines a RESPECT NULLS or IGNORE NULLS option for the window functions LEAD, LAG, FIRST_VALUE, LAST_VALUE, and NTH_VALUE (although Pinot currently doesn't support this function). The default behavior is RESPECT NULLS.
This patch adds support for these options on the FIRST_VALUE and LAST_VALUE window functions (LEAD / LAG can be added in a subsequent patch). As the name suggests, the IGNORE NULLS option makes it so that the FIRST_VALUE and LAST_VALUE window functions compute the first and last non-null values respectively for each window frame.
If IGNORE NULLS is specified like LAST_VALUE(col1) IGNORE NULLS OVER (ORDER BY ts), it can effectively be used to gapfill data (see this article for example - https://learn.microsoft.com/en-us/azure/azure-sql-edge/imputing-missing-values).
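To make the gapfill behavior concrete, here is a minimal Java sketch that emulates what LAST_VALUE(col1) IGNORE NULLS OVER (ORDER BY ts) computes. It is not code from this patch. It assumes the rows are already sorted by a unique ts, so the default frame (from the first row to the current row) behaves the same for ROWS and RANGE. The class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration class; not part of Pinot.
final class GapFillSketch {

  /**
   * Emulates LAST_VALUE(col) IGNORE NULLS over a frame running from the first row to the current row.
   * Assumes the input is already ordered by the ORDER BY key and that the key is unique.
   */
  static <T> List<T> lastValueIgnoreNulls(List<T> orderedValues) {
    List<T> result = new ArrayList<>(orderedValues.size());
    T lastNonNull = null; // stays null until the first non-null value is seen
    for (T value : orderedValues) {
      if (value != null) {
        lastNonNull = value;
      }
      result.add(lastNonNull);
    }
    return result;
  }

  public static void main(String[] args) {
    List<Integer> salaries = Arrays.asList(null, 100, null, null, 250, null);
    // prints [null, 100, 100, 100, 250, 250]
    System.out.println(lastValueIgnoreNulls(salaries));
  }
}
```

Leading nulls stay null because no earlier non-null value exists to carry forward.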
Calcite has validation to ensure that the IGNORE NULLS / RESPECT NULLS operators are only used with window functions that they are applicable to as per standard SQL. This patch also updates the operators being registered in Pinot's operator table for LEAD / LAG since we don't currently support the null related options for those functions (this way, we fail during query planning rather than at runtime).
There are also some minor changes to the query plan serde here to hold the IGNORE NULLS option for a window function call.

codecov-commenter · 2024-10-21T17:21:54Z

Codecov Report

Attention: Patch coverage is 88.94231% with 23 lines in your changes missing coverage. Please review.

Project coverage is 63.82%. Comparing base (59551e4) to head (4867f2c).
Report is 1256 commits behind head on master.

Files with missing lines	Patch %	Lines
...operator/window/value/LastValueWindowFunction.java	89.15%	3 Missing and 6 partials ⚠️
...perator/window/value/FirstValueWindowFunction.java	90.80%	2 Missing and 6 partials ⚠️
...apache/pinot/common/collections/DualValueList.java	80.00%	1 Missing and 1 partial ⚠️
...ache/pinot/calcite/sql/fun/PinotOperatorTable.java	75.00%	2 Missing ⚠️
.../query/planner/logical/PlanNodeToRelConverter.java	0.00%	1 Missing ⚠️
...ime/operator/window/value/ValueWindowFunction.java	80.00%	0 Missing and 1 partial ⚠️

Additional details and impacted files

@@             Coverage Diff              @@
##             master   #14264      +/-   ##
============================================
+ Coverage     61.75%   63.82%   +2.07%     
- Complexity      207     1556    +1349     
============================================
  Files          2436     2660     +224     
  Lines        133233   145674   +12441     
  Branches      20636    22287    +1651     
============================================
+ Hits          82274    92981   +10707     
- Misses        44911    45822     +911     
- Partials       6048     6871     +823

Flag	Coverage Δ
custom-integration1	`100.00% <ø> (+99.99%)`	⬆️
integration	`100.00% <ø> (+99.99%)`	⬆️
integration1	`100.00% <ø> (+99.99%)`	⬆️
integration2	`0.00% <ø> (ø)`
java-11	`63.80% <88.94%> (+2.09%)`	⬆️
java-21	`63.65% <88.94%> (+2.02%)`	⬆️
skip-bytebuffers-false	`63.82% <88.94%> (+2.07%)`	⬆️
skip-bytebuffers-true	`63.63% <88.94%> (+35.90%)`	⬆️
temurin	`63.82% <88.94%> (+2.07%)`	⬆️
unittests	`63.82% <88.94%> (+2.07%)`	⬆️
unittests1	`55.44% <88.94%> (+8.55%)`	⬆️
unittests2	`34.25% <2.88%> (+6.52%)`	⬆️


Jackie-Jiang

Suggest adding some end-to-end tests into ResourceBasedQueriesTest. See WindowFunctions.json for references

Jackie-Jiang · 2024-10-24T18:38:37Z

...tion-tests/src/test/java/org/apache/pinot/integration/tests/NullHandlingIntegrationTest.java

+    // Window functions are only supported in the multi-stage query engine
+    setUseMultiStageQueryEngine(true);
+    String sqlQuery =
+        "SELECT salary, LAST_VALUE(salary) IGNORE NULLS OVER (ORDER BY DaysSinceEpoch) AS gapfilledSalary from "


Does it work with bounded preceding/following? I remember running into some exception when trying it out

Yes, it works with bounded preceding / following as well.

I remember running into some exception when trying it out

Was it something like this error during query planning:

Caused by: java.lang.RuntimeException: Failed to convert query to relational expression: ... Caused by: java.lang.AssertionError: Conversion to relational algebra failed to preserve datatypes:

Interestingly, it looks like Calcite throws this error if the window function's input column is not nullable in the table schema (i.e., if enableColumnBasedNullHandling is false or the column has "notNull": true) and IGNORE NULLS or RESPECT NULLS option is used. While the error message is not the most clear, I don't think this is ~~an actual bug~~ a major issue because it doesn't really make sense to use those null handling related options if the window function's input column is not nullable.

I guess we want to document it. Users might enable table level nullability (v1 engine nullability) and expect v2 engine to pick it up

Yeah, good point, I'll make sure to add a note about this to the documentation. I plan to raise one consolidated documentation PR with changes from #14273 and here.

I don't think we should delegate this into something we document. We should also report the issue in the Calcite Jira / mailing list and/or create a PR to fix it ourselves. The same happened with the reserved keyword PR.

Yes, on taking a second look, this does look like a legitimate bug in Calcite. When the IGNORE NULLS option is used, the inferred return type for the window function in the parsed SqlNode is INTEGER NOT NULL (assuming the column input to FIRST_VALUE / LAST_VALUE is INTEGER), whereas the converted return type is INTEGER (nullable). The converted type is nullable because offset based window frame bounds mean that the result can be null when the window frame is out of bounds, which can't be the case when using RANGE window frames or ROWS window frames with UNBOUNDED PRECEDING / UNBOUNDED FOLLOWING / CURRENT ROW. When the IGNORE NULLS option is not provided, the inferred return type for the window function in the parsed SqlNode is also INTEGER (nullable), which is why the issue doesn't occur there. The same holds when the IGNORE NULLS option is used but the input column is nullable.

Interestingly, the same error also occurs when the RESPECT NULLS option is explicitly provided.

I'll create a bug tracking Jira in the Calcite project and link it here.

Edit: https://issues.apache.org/jira/browse/CALCITE-6648

Jackie-Jiang · 2024-10-24T18:38:46Z

pinot-query-planner/src/main/java/org/apache/pinot/calcite/sql/fun/PinotOperatorTable.java

+      // WINDOW Functions (non-aggregate)
+      SqlStdOperatorTable.LAST_VALUE,
+      SqlStdOperatorTable.FIRST_VALUE,
+      // TODO: Replace these with SqlStdOperatorTable.LEAD and SqlStdOperatorTable.LAG when the function implementations


Does this mean IGNORE NULLS are simply ignored?
I'd suggest using the standard operator, and throw exception when IGNORE NULLS is specified but cannot be supported to make the behavior more explicit

Does this mean IGNORE NULLS are simply ignored?

Nope, using IGNORE NULLS with LAG / LEAD will lead to a clear error like this during query planning - From line 1, column 43 to line 1, column 60: Cannot specify IGNORE NULLS or RESPECT NULLS following 'LAG'. This is because the custom operators we defined return false for allowsNullTreatment which means this Calcite validation will fail - https://github.com/apache/calcite/blob/ef1a83f659e8771c65c2541b92d2ef9cc2a05bea/core/src/main/java/org/apache/calcite/sql/SqlNullTreatmentOperator.java#L69-L74.

I'd initially gone with simply throwing a runtime exception in Pinot's LagValueWindowFunction / LeadValueWindowFunction runtime operators, but the alternative chosen here suggested by @gortiz (defining our own custom SqlAggFunctions) is much better because we fail fast during query planning instead of query execution and the error is also clear.
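The fail-fast idea can be sketched in plain Java as follows. This is only a model of the check: WindowOperator and validate are hypothetical stand-ins and are not Calcite's or Pinot's actual classes. The only things taken from the discussion above are the allowsNullTreatment flag and the wording of the error message.

```java
// Hypothetical illustration class; it models the planning-time check and does not use Calcite.
final class NullTreatmentValidationSketch {

  /** Stand-in for an operator definition registered in an operator table. */
  static final class WindowOperator {
    final String name;
    final boolean allowsNullTreatment;

    WindowOperator(String name, boolean allowsNullTreatment) {
      this.name = name;
      this.allowsNullTreatment = allowsNullTreatment;
    }
  }

  /** Rejects IGNORE NULLS / RESPECT NULLS at planning time for operators that do not allow it. */
  static void validate(WindowOperator operator, boolean nullTreatmentSpecified) {
    if (nullTreatmentSpecified && !operator.allowsNullTreatment) {
      throw new IllegalArgumentException(
          "Cannot specify IGNORE NULLS or RESPECT NULLS following '" + operator.name + "'");
    }
  }

  public static void main(String[] args) {
    validate(new WindowOperator("LAST_VALUE", true), true); // accepted
    try {
      validate(new WindowOperator("LAG", false), true);
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

Because the check depends only on the operator definition, the query is rejected before any runtime operator is constructed.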

...main/java/org/apache/pinot/query/runtime/operator/window/value/FirstValueWindowFunction.java

Jackie-Jiang · 2024-10-27T23:49:33Z

...main/java/org/apache/pinot/query/runtime/operator/window/value/FirstValueWindowFunction.java

+      }
+      lowerBound++;
+
+      if (upperBound < numRows - 1) {


Some javadoc would help explain the logic here

Jackie-Jiang · 2024-10-27T23:50:07Z

...main/java/org/apache/pinot/query/runtime/operator/window/value/FirstValueWindowFunction.java

+      }
+
+      // Slide the window forward by one row
+      if (indexOfFirstNonNullValue == lowerBound) {


Should we do this check only if indexOfFirstNonNullValue != -1?

Good catch, we'd be doing a pointless iteration from 0 to upper bound when we're at lowerBound = -1 👍

Jackie-Jiang · 2024-10-27T23:51:36Z

...main/java/org/apache/pinot/query/runtime/operator/window/value/FirstValueWindowFunction.java

+    return result;
+  }
+
+  private List<Object> processRowsWindowIgnoreNulls(List<Object[]> rows) {


Should we short circuit the unbounded case?

I don't think there's much benefit to that here? In the unbounded case for FIRST_VALUE / LAST_VALUE with IGNORE NULLS, we'll find the first / last non-null value in the first window (which will encompass all rows) at the beginning and then in each iteration there's only some simple boolean checks which will all be false in every iteration of the loop and we'll simply keep adding the same value to the result list.

We can replace the for loop with nCopies, and also short-circuit all the if checks. The total cost of the query is then just finding the first/last non-null value, so the savings could potentially be relatively significant.

At a high level, for UNBOUNDED PRECEDING, CURRENT ROW, and UNBOUNDED FOLLOWING, the behavior of ROWS and RANGE should be the same. So we could potentially split the handling into 3 cases:

Without bounded preceding/following

ROWS with bounded preceding/following

RANGE with bounded preceding/following - not supported

We can discuss and address this in a separate PR

At a high level, for UNBOUNDED PRECEDING, CURRENT ROW, and UNBOUNDED FOLLOWING, the behavior of ROWS and RANGE should be the same

The behavior will only be the same for UNBOUNDED PRECEDING TO UNBOUNDED FOLLOWING right? With CURRENT ROW as either lower or upper bound or both, the behavior and logic are both significantly different due to the need to consider peer groups (rows before and after with the same order key) along with current row with RANGE type window frames.

We can replace the for loop with nCopies, and also short-circuit all the if checks. The total cost of the query is then just finding the first/last non-null value, so the savings could potentially be relatively significant.

This makes sense for the truly unbounded case (UNBOUNDED PRECEDING TO UNBOUNDED FOLLOWING; CURRENT ROW is not really unbounded). I've raised a small follow-up PR - #14324.

I see. Good point on the peer group difference
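For the truly unbounded case discussed above, the short-circuit could look roughly like the sketch below. It is not the code from #14324; the class and method names are hypothetical. The only library call it relies on is java.util.Collections.nCopies, which returns an immutable list view backed by a single element.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration class; not part of Pinot.
final class UnboundedFirstValueSketch {

  /**
   * FIRST_VALUE ... IGNORE NULLS with a frame of UNBOUNDED PRECEDING TO UNBOUNDED FOLLOWING:
   * every row sees the whole partition, so every row gets the same answer.
   */
  static <T> List<T> firstValueIgnoreNullsUnbounded(List<T> values) {
    T firstNonNull = null;
    for (T value : values) {
      if (value != null) {
        firstNonNull = value;
        break; // the only real work is this single scan
      }
    }
    // No per-row loop and no per-row allocation: one element repeated values.size() times.
    return Collections.nCopies(values.size(), firstNonNull);
  }

  public static void main(String[] args) {
    // prints [7, 7, 7, 7]
    System.out.println(firstValueIgnoreNullsUnbounded(Arrays.asList(null, 7, null, 9)));
  }
}
```

The returned list is immutable, which is acceptable only if downstream code never modifies the result; that is an assumption of this sketch.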

gortiz · 2024-10-28T10:45:10Z

...tion-tests/src/test/java/org/apache/pinot/integration/tests/NullHandlingIntegrationTest.java

@@ -326,6 +326,31 @@ public void testAggregateServerReturnFinalResult(boolean useMultiStageQueryEngin
    assertTrue(response.get("resultTable").get("rows").get(0).get(0).isNull());
  }

+  @Test
+  public void testWindowFunctionIgnoreNulls()


Can we divide this test method into two different ones?

I'm not sure I follow your suggestion - there's a single query here.

...anner/src/main/java/org/apache/pinot/query/planner/serde/RexExpressionToProtoExpression.java

gortiz · 2024-10-28T11:03:10Z

...main/java/org/apache/pinot/query/runtime/operator/window/value/FirstValueWindowFunction.java

+          }
+        }
+      }
+      lowerBound++;


This is assuming lowerBound is Integer.MIN_VALUE if unbounded, right? Although it can never turn into 0 due to the fact that maxRows is bound to Integer.MAX_VALUE... don't you think it is a bit difficult to understand?

Yep, this convention was added in #14273, and was chosen since it was the least invasive given the existing framework for window functions. I've documented this in a couple of places -

pinot/pinot-query-planner/src/main/java/org/apache/pinot/query/planner/plannode/WindowNode.java

Lines 33 to 37 in 0f984e8

// Both these bounds are relative to current row; 0 means current row, -1 means previous row, 1 means next row, etc.
// Integer.MIN_VALUE represents UNBOUNDED PRECEDING which is only allowed for the lower bound (ensured by Calcite).
// Integer.MAX_VALUE represents UNBOUNDED FOLLOWING which is only allowed for the upper bound (ensured by Calcite).
private final int _lowerBound;
private final int _upperBound;

pinot/pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/operator/window/WindowFrame.java

Lines 31 to 35 in 0f984e8

// Both these bounds are relative to current row; 0 means current row, -1 means previous row, 1 means next row, etc.
// Integer.MIN_VALUE represents UNBOUNDED PRECEDING which is only allowed for the lower bound (ensured by Calcite).
// Integer.MAX_VALUE represents UNBOUNDED FOLLOWING which is only allowed for the upper bound (ensured by Calcite).
private final int _lowerBound;
private final int _upperBound;

Although it can never turn into 0 due to the fact that maxRows is bound to Integer.MAX_VALUE... don't you think it is a bit difficult to understand?

The logic here doesn't take that assumption into account though, we're handling all cases of lowerBound / upperBound (-ve / +ve). This way is also convenient because we don't need to worry about handling overflows everywhere.
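As a rough illustration of the sentinel convention, the sketch below turns relative bounds into an absolute, inclusive row range for a given row. It is not how the patch does it: the patch keeps running lowerBound / upperBound counters, whereas this sketch recomputes the range per row and uses long arithmetic to sidestep overflow. The class, constant, and method names are hypothetical.

```java
// Hypothetical illustration class; not part of Pinot.
final class FrameBoundsSketch {
  static final int UNBOUNDED_PRECEDING = Integer.MIN_VALUE; // sentinel for the lower bound
  static final int UNBOUNDED_FOLLOWING = Integer.MAX_VALUE; // sentinel for the upper bound

  /**
   * Returns {start, end} as inclusive absolute row indexes for the frame of the given row,
   * or null if the frame contains no rows. Bounds are relative to the current row.
   */
  static int[] absoluteFrame(int rowIndex, int numRows, int lowerBound, int upperBound) {
    // Widen to long so that rowIndex + Integer.MAX_VALUE cannot overflow.
    long start = Math.max(0L, (long) rowIndex + lowerBound);
    long end = Math.min(numRows - 1L, (long) rowIndex + upperBound);
    return start > end ? null : new int[] {(int) start, (int) end};
  }

  public static void main(String[] args) {
    int[] whole = absoluteFrame(2, 5, UNBOUNDED_PRECEDING, UNBOUNDED_FOLLOWING);
    System.out.println(whole[0] + ".." + whole[1]); // prints 0..4
    int[] sliding = absoluteFrame(2, 5, -1, 1);
    System.out.println(sliding[0] + ".." + sliding[1]); // prints 1..3
  }
}
```

Clamping to the partition means the sentinels need no special-casing: an unbounded lower bound simply clamps to row 0 and an unbounded upper bound to the last row.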

gortiz · 2024-10-28T11:05:32Z

...main/java/org/apache/pinot/query/runtime/operator/window/value/FirstValueWindowFunction.java

+    for (int i = Math.max(lowerBound, 0); i <= upperBound; i++) {
+      Object value = extractValueFromRow(rows.get(i));
+      if (value != null) {
+        indexOfFirstNonNullValue = i;
+        break;
+      }


This code is also repeated when the lower bound is moved. I think it is worth moving the code to its own function, something like findFirstNotNullInWindow(rows, lowerBound, upperBound). This not only makes the code easier to read but also makes the job easier for the JIT when it tries to optimize the loop.
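A simplified sketch of how such a helper could fit into the sliding logic for FIRST_VALUE with IGNORE NULLS and a ROWS frame is shown below. It is not the patch's implementation: it works on a plain list of values instead of rows, it assumes lowerOffset <= upperOffset, and it clamps the unbounded sentinels up front so the running bounds cannot overflow. Apart from findFirstNotNullInWindow, which borrows the name suggested above, all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration class; not part of Pinot.
final class FirstValueIgnoreNullsSketch {

  /** Returns the index of the first non-null value within [lowerBound, upperBound], or -1 if there is none. */
  static int findFirstNotNullInWindow(List<?> values, int lowerBound, int upperBound) {
    int end = Math.min(upperBound, values.size() - 1);
    for (int i = Math.max(lowerBound, 0); i <= end; i++) {
      if (values.get(i) != null) {
        return i;
      }
    }
    return -1;
  }

  /** Offsets are relative to the current row (0 = current row); assumes lowerOffset <= upperOffset. */
  static <T> List<T> firstValueIgnoreNulls(List<T> values, int lowerOffset, int upperOffset) {
    int numRows = values.size();
    List<T> result = new ArrayList<>(numRows);
    // Clamp so that Integer.MIN_VALUE / Integer.MAX_VALUE sentinels cannot overflow when incremented.
    int lowerBound = Math.max(lowerOffset, -numRows);
    int upperBound = Math.min(upperOffset, numRows);
    int indexOfFirstNonNullValue = findFirstNotNullInWindow(values, lowerBound, upperBound);

    for (int row = 0; row < numRows; row++) {
      result.add(indexOfFirstNonNullValue == -1 ? null : values.get(indexOfFirstNonNullValue));

      // Slide the window forward by one row.
      lowerBound++;
      upperBound++;
      if (indexOfFirstNonNullValue != -1) {
        if (indexOfFirstNonNullValue < lowerBound) {
          // The tracked value fell out of the window; rescan the new window.
          indexOfFirstNonNullValue = findFirstNotNullInWindow(values, lowerBound, upperBound);
        }
      } else if (upperBound >= 0 && upperBound < numRows && values.get(upperBound) != null) {
        // The window had no non-null value, so only the newly entered row can supply one.
        indexOfFirstNonNullValue = upperBound;
      }
    }
    return result;
  }

  public static void main(String[] args) {
    // ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING; prints [1, 1, 1, 2, 2]
    System.out.println(firstValueIgnoreNulls(Arrays.asList(null, 1, null, 2, null), -1, 1));
  }
}
```

The guard on indexOfFirstNonNullValue != -1 reflects the review point above: when no non-null value is being tracked, a full rescan is unnecessary and only the row entering the window has to be checked.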

gortiz · 2024-10-28T11:16:26Z

...main/java/org/apache/pinot/query/runtime/operator/window/value/FirstValueWindowFunction.java

+      List<Object> result = new ArrayList<>(numRows);
+      // Find the start of the peer group of the row with the first non-null value
+      int i;
+      for (i = 0; i < numRows; i++) {
+        Object[] row = rows.get(i);
+        Key orderKey = AggregationUtils.extractRowKey(row, _orderKeys);
+        if (orderKey.equals(firstNonNullValueKey)) {
+          break;
+        } else {
+          result.add(null);
+        }
+      }
+
+      Object firstNonNullValue = extractValueFromRow(rows.get(firstNonNullValueIndex));
+      for (; i < numRows; i++) {
+        result.add(firstNonNullValue);
+      }
+
+      return result;


nit: Here we could reduce allocation cost by using a custom list that combines two lists. I'm shocked that Guava lacks a list like that, but it shouldn't be difficult to create one. The idea would be to create two lists like Collections.nCopies. The first would contain just nulls and the second just the first non-null value. Finally, we wrap these two instances in a view whose get delegates to either the first or the second list depending on whether the index is greater or smaller than firstNonNullValueIndex.

If this pattern is seen in more window functions, creating this combined list view would be worth the effort.

Nice suggestion! I couldn't find any such off the shelf implementation either so I've created a small new class called DualValueList and added it to pinot-common in case we find use cases elsewhere in the codebase.
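For readers who want a feel for the idea, here is a minimal sketch of such a two-value list view. It is not the DualValueList class added to pinot-common, whose actual fields and constructor are not shown in this conversation. The class name and constructor parameters below are assumptions made for illustration. It stores the two values and their counts directly instead of wrapping two Collections.nCopies lists.

```java
import java.util.AbstractList;
import java.util.List;

// Hypothetical illustration class; not the DualValueList from pinot-common.
final class TwoValueListSketch<T> extends AbstractList<T> {
  private final T firstValue;
  private final int firstCount;
  private final T secondValue;
  private final int secondCount;

  /** An immutable list of firstCount copies of firstValue followed by secondCount copies of secondValue. */
  TwoValueListSketch(T firstValue, int firstCount, T secondValue, int secondCount) {
    if (firstCount < 0 || secondCount < 0) {
      throw new IllegalArgumentException("Counts must be non-negative");
    }
    this.firstValue = firstValue;
    this.firstCount = firstCount;
    this.secondValue = secondValue;
    this.secondCount = secondCount;
  }

  @Override
  public T get(int index) {
    if (index < 0 || index >= size()) {
      throw new IndexOutOfBoundsException("Index: " + index + ", Size: " + size());
    }
    return index < firstCount ? firstValue : secondValue;
  }

  @Override
  public int size() {
    return firstCount + secondCount;
  }

  public static void main(String[] args) {
    List<Integer> result = new TwoValueListSketch<>(null, 2, 42, 3);
    System.out.println(result); // prints [null, null, 42, 42, 42]
  }
}
```

Memory use stays constant regardless of the number of rows, at the cost of the list being read-only, since AbstractList rejects mutation unless set / add are overridden.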

gortiz

Remember you can remove several //@Formatter:off/on


yashmayya changed the title ~~Window function ignore nulls~~ Add IGNORE NULLS option to FIRST_VALUE and LAST_VALUE window functions Oct 21, 2024

yashmayya added the feature, release-notes, and multi-stage labels Oct 21, 2024

yashmayya force-pushed the window-function-ignore-nulls branch 5 times, most recently from 6a01d4b to f766ec0 on October 24, 2024 18:29

Jackie-Jiang reviewed Oct 24, 2024


yashmayya marked this pull request as ready for review October 25, 2024 10:23

yashmayya force-pushed the window-function-ignore-nulls branch from 4057a4d to 95ecd6f on October 25, 2024 10:26

Jackie-Jiang reviewed Oct 27, 2024


yashmayya force-pushed the window-function-ignore-nulls branch from 05462ad to 69dc900 on October 28, 2024 06:51

gortiz reviewed Oct 28, 2024


gortiz reviewed Oct 28, 2024


gortiz approved these changes Oct 28, 2024


yashmayya added 5 commits October 28, 2024 17:36

Add IGNORE NULLS support to FIRST_VALUE and LAST_VALUE window functions

958c77a

Minor refactor; add some test cases to ResourceBasedQueriesTest

52987ec

Optimize logic for computing last non-null value in first window; add…

d2c24c2

… more verbose comments explaining logic in FIRST_VALUE / LAST_VALUE with IGNORE NULLS and ROWS window frame; initialize ArrayList with known final size

Minor refactors based on review suggestions

0160dae

Reduce allocations in cases where returned result from window functio…

103051b

…n has only 1 or 2 repeated values

yashmayya force-pushed the window-function-ignore-nulls branch from 69dc900 to 103051b on October 28, 2024 13:24

Remove newly redundant //@Formatter:off and //@formatter.on

4867f2c

Jackie-Jiang merged commit 6fd21f2 into apache:master Oct 28, 2024
20 of 21 checks passed

yashmayya mentioned this pull request Oct 29, 2024

Add minor optimization for FIRST_VALUE / LAST_VALUE with IGNORE NULLS and unbounded ROWS window frame #14324

Merged

yashmayya mentioned this pull request Nov 14, 2024

Update window function test plans to enable tests for supported functionality #14449

Merged

yashmayya added the window-functions label Nov 18, 2024
