Add support for linear range calculation in WINDOW functions #4989

mustafasrepo · 2023-01-19T15:43:07Z

Which issue does this PR close?

Closes #4979. Makes progress on #4904.

Rationale for this change

During range calculation for window frames, we can use linear search instead of bisect, since we know that the table is already sorted and the search progresses in only one direction. Linear search is amortized constant time per row (when window frame boundaries are static). Hence, this version is more efficient than the current bisect implementation (where each search is O(log(n))). This change decreases overall window range calculation complexity from O(n*log(n)) to O(n).
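The amortization argument can be written out as a short worked bound. This is a sketch under stated assumptions, not something taken from the PR itself: the symbols s_i and e_i (start and end index of the frame for row i, with s_0 = e_0 = 0) and c_i (comparisons spent on row i) are introduced here only for illustration. With static frame boundaries on sorted input, both s_i and e_i are non-decreasing in i, and each linear search resumes from the previous boundary, paying one comparison per step forward plus one terminating comparison per boundary:

```math
\sum_{i=1}^{n} c_i \;\le\; \sum_{i=1}^{n} \bigl[(s_i - s_{i-1}) + (e_i - e_{i-1}) + 2\bigr]
  \;=\; (s_n - s_0) + (e_n - e_0) + 2n \;\le\; 4n \;=\; O(n)
```

The sums telescope because the boundaries never move backwards, and s_n, e_n ≤ n. A bisect-based version instead performs 2n independent searches at O(log(n)) each, which gives the O(n*log(n)) total mentioned above.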

What changes are included in this PR?

Are these changes tested?

Existing tests verify window range calculation correctness. Unit tests for linear_search are also added. We also compared the time elapsed during window result calculation for the query

SELECT SUM(a) OVER(ORDER BY a RANGE BETWEEN 10 PRECEDING AND 10 FOLLOWING) FROM t1;

The table below compares the linear and bisect versions under different conditions.

n_row	distinct	linear(mean)	linear(median)	bisect(mean)	bisect(median)
100	10	1.199524ms	772.125µs	2.065253ms	1.501125ms
100	100_000_000	1.614941ms	1.599208ms	2.30337ms	2.346416ms
1000	10	4.521795ms	4.417125ms	10.986941ms	10.994333ms
1000	100_000_000	11.82327ms	11.8265ms	23.83402ms	23.641541ms
100_000	10	428.01742ms	428.022791ms	1.661792712s	1.666531666s
100_000	100_000_000	1.190898003s	1.18136925s	3.446534628s	3.446675916s

Are there any user-facing changes?

No.

ozankabak · 2023-01-19T17:26:01Z

@alamb, I think you will like this. As I was reading the segment tree paper from #4904, one of the remarks therein that stood out to me was that in RANGE frames a simple linear search was preferred to bisections due to amortization. We wanted to check if this theoretical gain shows up in practice -- and it does bigly! This PR cuts down on bisect usage in appropriate places and uses a linear search to exploit this amortization.

Also, there seems to be an intermittent CI failure; if you re-run the failed workflows, it should pass.

avantgardnerio · 2023-01-19T19:40:15Z

there seems to be an intermittent CI failure

GitHub is experiencing some serious issues today, so it's probably related to their status. I'll keep an eye on it and kick CI once in a while...

avantgardnerio

I'm not deeply familiar with this part of the code, so I'd like a backup review. But the tests pass and the numbers are better so I'm inclined to

avantgardnerio · 2023-01-19T20:15:40Z

datafusion/common/src/utils.rs

+pub fn get_row_at_idx(columns: &[ArrayRef], idx: usize) -> Result<Vec<ScalarValue>> {
+    columns
+        .iter()
+        .map(|arr| ScalarValue::try_from_array(arr, idx))


What dark magic converts this from a Vec<Result<ScalarValue>> to a Result<Vec<ScalarValue>>?

🙂 collect automagically leverages the info in the return type annotation, as if there was a let binding or if we had used collect::<...>.
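For readers unfamiliar with this behavior, here is a minimal standalone sketch of the same mechanism using only the standard library. The function name `parse_all` and the use of string parsing are assumptions made for illustration; they are not part of this PR. The point is that `Result<V, E>` implements `FromIterator<Result<A, E>>`, so `collect` can build a `Result<Vec<_>, _>` directly and stop at the first `Err`:

```rust
use std::num::ParseIntError;

/// Hypothetical helper (not from the PR): parse every input, failing fast.
/// The return type alone tells `collect` to produce `Result<Vec<i32>, _>`
/// rather than `Vec<Result<i32, _>>`.
fn parse_all(inputs: &[&str]) -> Result<Vec<i32>, ParseIntError> {
    inputs
        .iter()
        // Each element maps to a `Result<i32, ParseIntError>`.
        .map(|s| s.parse::<i32>())
        // Collecting into a `Result` short-circuits on the first `Err`.
        .collect()
}
```

Calling `parse_all(&["1", "2", "3"])` yields `Ok` with all three values, while any unparsable element turns the whole result into an `Err`.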

Longer term, I think there is quite a bit more performance to be gained by avoiding the use of ScalarValue here and instead using something more optimized like the RowFormat. This is not something for this PR; I am just trying to plant a seed: if we need more performance, there is a path

avantgardnerio · 2023-01-19T20:29:00Z

datafusion/physical-expr/src/window/window_frame_state.rs

@@ -85,6 +88,7 @@ impl<'a> WindowFrameContext<'a> {
                sort_options,
                length,
                idx,
+                last_range,


The last_range being added everywhere is a performance optimization to not throw away the index which was previously calculated?

Yes, as it goes through batches, the search picks up from where it left off before. This works for fixed-boundary RANGE frames (which is what DataFusion supports).
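The idea of resuming from the previous range can be sketched as follows. This is a simplified illustration, not the code in window_frame_state.rs: it assumes a single ascending `i64` ORDER BY column, and the names `RangeFrameState` and `next_range` are hypothetical. Only the `last_range` field mirrors the name used in the diff.

```rust
use std::ops::Range;

/// Hypothetical state holder: remembers the frame computed for the previous row.
struct RangeFrameState {
    last_range: Range<usize>,
}

impl RangeFrameState {
    fn new() -> Self {
        Self { last_range: 0..0 }
    }

    /// Frame for row `idx` under `RANGE BETWEEN preceding PRECEDING AND
    /// following FOLLOWING`. Assumes `values` is sorted ascending and that
    /// rows are visited in increasing `idx` order.
    fn next_range(
        &mut self,
        values: &[i64],
        idx: usize,
        preceding: i64,
        following: i64,
    ) -> Range<usize> {
        let lower = values[idx].saturating_sub(preceding);
        let upper = values[idx].saturating_add(following);
        // Resume from the previous start instead of searching from scratch.
        let mut start = self.last_range.start;
        while start < values.len() && values[start] < lower {
            start += 1;
        }
        // Resume from the previous end; it can never be before the new start.
        let mut end = self.last_range.end.max(start);
        while end < values.len() && values[end] <= upper {
            end += 1;
        }
        self.last_range = start..end;
        start..end
    }
}
```

Because neither boundary ever moves backwards, the total work over all rows stays linear in the number of rows, which is the effect the benchmark table in the PR description measures.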

alamb · 2023-01-19T22:06:11Z

I will plan to review this carefully tomorrow

alamb

Looks very nice 👍 thank you @mustafasrepo and @ozankabak -- the only thing I think that needs to be addressed prior to merging is my question about linear_search not being used (I may be confused)

Longer term I think there are additional optimization opportunities here but this PR is a great step forward

alamb · 2023-01-20T13:39:44Z

datafusion/common/src/utils.rs

+pub fn get_row_at_idx(columns: &[ArrayRef], idx: usize) -> Result<Vec<ScalarValue>> {
+    columns
+        .iter()
+        .map(|arr| ScalarValue::try_from_array(arr, idx))


Longer term, I think there is quite a bit more performance to be gained by avoiding the use of ScalarValue here and instead using something more optimized like the RowFormat. This is not something for this PR; I am just trying to plant a seed: if we need more performance, there is a path

alamb · 2023-01-20T13:49:02Z

datafusion/common/src/utils.rs

+            DataFusionError::Internal("Column array shouldn't be empty".to_string())
+        })?
+        .len();
+    let compare_fn = |current: &[ScalarValue], target: &[ScalarValue]| {


FWIW the arrow kernels have https://docs.rs/arrow/31.0.0/arrow/compute/struct.LexicographicalComparator.html which you might be able to use and avoid having to construct ScalarValues to compare.

The LexicographicalComparator API compares the values at two indices and returns their ordering. This is useful for change detection or for finding partition boundaries. However, in our case we need to search for a specific value inside the array (one that possibly does not exist in the array). Still, with some kind of tweak, we may be able to use LexicographicalComparator for our use case. I will think about it in detail.

alamb · 2023-01-20T13:49:40Z

datafusion/physical-expr/src/window/partition_evaluator.rs

@@ -83,7 +83,7 @@ pub trait PartitionEvaluator: Debug + Send {
    fn evaluate_inside_range(
        &self,
        _values: &[ArrayRef],
-        _range: Range<usize>,
+        _range: &Range<usize>,


alamb · 2023-01-20T13:50:22Z

datafusion/physical-expr/src/window/window_frame_state.rs

+        } else {
+            last_range.end
+        };
+        let compare_fn = |current: &[ScalarValue], target: &[ScalarValue]| {


See the above comment about https://docs.rs/arrow/31.0.0/arrow/compute/struct.LexicographicalComparator.html possibly being another way to improve performance

alamb · 2023-01-20T13:51:09Z

datafusion/common/src/utils.rs

+where
+    F: Fn(&[ScalarValue], &[ScalarValue]) -> Result<bool>,
+{
+    while low < high {


I don't know if this would be faster or not, but I wonder if you might be able to use .iter() and find() to find the relevant index and possibly avoid the bounds checks 🤔

Like if you could do something like item_columns.iter().enumerate().find(compare_fn).map(|(i, _)| i).unwrap()

I think you mean something like this:

Ok((low..high).find(|&idx| { let val = get_row_at_idx(item_columns, idx)?; !compare_fn(&val, target)? }).unwrap_or(high))

The problem is with the ? operators, we would need to change them to unwrap calls for this to work. The code would look nicer, but we would be incurring the downside of panicking in case something goes wrong. In general, I prefer to err on the side of being a little more verbose than necessary but retain control over errors, but I don't have a strong opinion on this specific case. What do you think?

I guess I was wondering if we could get an iter over item_columns somehow (and thus avoid all the bounds checks) -- I realize this is not really easy w/ multiple arrays. 🤔

Yes, exactly. But let's still keep this in our minds in the background, and improve this section in the future if anyone finds a neat way.
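One possible direction for the error-handling half of this discussion, offered only as a sketch: `find_map` lets the closure return the error as a value, so neither `?` nor `unwrap` is needed inside it. The function name `first_failing_idx` and the closure parameter `keep_going` are hypothetical, and this does not address the bounds-check question raised above.

```rust
/// Hypothetical helper: returns the first index in `low..high` for which
/// `keep_going` returns `Ok(false)`, or `high` if there is none. Any error
/// from `keep_going` is propagated instead of panicking.
fn first_failing_idx<E, F>(low: usize, high: usize, keep_going: F) -> Result<usize, E>
where
    F: Fn(usize) -> Result<bool, E>,
{
    (low..high)
        .find_map(|idx| match keep_going(idx) {
            // Predicate still holds: keep scanning.
            Ok(true) => None,
            // Predicate failed: this is the index we are looking for.
            Ok(false) => Some(Ok(idx)),
            // Stop scanning and surface the error to the caller.
            Err(e) => Some(Err(e)),
        })
        // No index failed the predicate, so the answer is `high`.
        .unwrap_or(Ok(high))
}
```

Whether this reads better than the explicit `while low < high` loop in the diff is a matter of taste; the loop keeps the control flow more obvious.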

alamb · 2023-01-20T13:53:07Z

datafusion/common/src/utils.rs

+/// rows (`item_columns`) via a linear scan. It assumes that `item_columns` is sorted
+/// according to `sort_options` and returns the insertion index of `target`.
+/// Template argument `SIDE` being `true`/`false` means left/right insertion.
+pub fn linear_search<const SIDE: bool>(


I didn't see linear_search used anywhere in this PR other than tests -- I wonder if the intention was to call it in datafusion/physical-expr/src/window/window_frame_state.rs ? It looks like that currently has an inlined version of this function 🤔

To make it compatible with the bisect API, we added two versions: one takes sort_options: &[SortOptions], the other takes compare_fn: F. We wanted to add both versions in case anyone needs them. In terms of functionality, the first version is not strictly necessary. However, since constructing a comparator function from SortOptions is a bit cumbersome, we wanted to have that version as well.

Yes, so the same logic has two drivers: one with a comparison function, one with SortOptions. We currently use the former, but also anticipate using the latter in the near future (we plan a follow-up of this PR for GROUPS mode). As @mustafasrepo mentions, it also brings both search APIs in line, which is good too.

My hesitation was that the tests are for linear_search -- the fact that you plan to use it in the future makes sense to me
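To make the left/right insertion semantics of the `SIDE` parameter concrete, here is a heavily simplified sketch. It is not the function added in this PR: it assumes a single ascending `i64` slice instead of multiple `ArrayRef` columns with `SortOptions`, it cannot fail, and the name `linear_search_sketch` and its `low`/`high` parameters are chosen here for illustration.

```rust
/// Simplified illustration of insertion-index search via a linear scan.
/// `SIDE == true` gives left insertion (before any equal elements);
/// `SIDE == false` gives right insertion (after any equal elements).
/// Assumes `items[low..high]` is sorted ascending and `high <= items.len()`.
fn linear_search_sketch<const SIDE: bool>(
    items: &[i64],
    target: i64,
    low: usize,
    high: usize,
) -> usize {
    let mut idx = low;
    while idx < high {
        // Left insertion skips strictly smaller items;
        // right insertion also skips items equal to the target.
        let keep_going = if SIDE {
            items[idx] < target
        } else {
            items[idx] <= target
        };
        if !keep_going {
            break;
        }
        idx += 1;
    }
    idx
}
```

For the slice `[1, 3, 3, 5]` and target `3`, the left variant stops in front of the first `3` and the right variant stops just past the last one; for a target that is absent, both variants agree.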

ozankabak · 2023-01-20T16:37:13Z

Thank you for carefully reviewing @alamb. We will consider further optimizing by leveraging RowFormat in a follow-on PR. As @mustafasrepo mentions, it is not obvious to us at this time how we can utilize LexicographicalComparator directly, but if we figure out a way, we will make another follow-on PR for that too.

ursabot · 2023-01-20T22:01:50Z

Benchmark runs are scheduled for baseline = 92d0a05 and contender = b71cae8. b71cae8 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ec2-t3-xlarge-us-east-2] ec2-t3-xlarge-us-east-2
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on test-mac-arm] test-mac-arm
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ursa-i9-9960x] ursa-i9-9960x
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ursa-thinkcentre-m75q] ursa-thinkcentre-m75q
Buildkite builds:
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

mustafasrepo and others added 6 commits January 17, 2023 17:28

add naive linear search

edae175

Add last range to decrease search size

e738e9f

minor changes

970edf3

add low, high arguments

a92a7e2

Go back to old API, improve comments, refactors

0861739

use util function

849c773

github-actions bot added the physical-expr Physical Expressions label Jan 19, 2023

mustafasrepo changed the title ~~Feature/range linear~~ Add support for linear range calculation Jan 19, 2023

avantgardnerio approved these changes Jan 19, 2023

avantgardnerio requested a review from alamb January 19, 2023 20:31

alamb changed the title ~~Add support for linear range calculation~~ Add support for linear range calculation in WINDOW functions Jan 20, 2023

alamb reviewed Jan 20, 2023

alamb approved these changes Jan 20, 2023

alamb merged commit b71cae8 into apache:master Jan 20, 2023

mustafasrepo deleted the feature/range_linear branch February 10, 2023 06:54

mustafasrepo mentioned this pull request Feb 15, 2023

Linear search support for Window Group queries #5286

Merged
