Optimize `gcd` for array and scalar case by avoiding `make_scalar_function` where has unnecessary conversion between scalar and array #14834

jayzhan211 · 2025-02-23T06:21:46Z

Which issue does this PR close?

Closes #.

Rationale for this change

`make_scalar_function` converts scalar arguments to arrays and then calls the function kernel. However, we don't need that conversion and can compute on the scalar directly. This improves `gcd` performance in the array vs scalar case.
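To illustrate the rationale, here is a minimal std-only sketch of dispatching on the argument shape instead of expanding scalars into arrays. The `Columnar` enum and the `gcd`, `gcd_opt`, and `gcd_columnar` helpers are hypothetical stand-ins for illustration, not the DataFusion types or the code in this PR, and overflow handling is left out.

```rust
/// Hypothetical stand-in for a columnar value, restricted to nullable Int64.
#[derive(Debug, Clone, PartialEq)]
enum Columnar {
    Array(Vec<Option<i64>>),
    Scalar(Option<i64>),
}

/// Euclidean gcd on magnitudes. Overflow for i64::MIN is ignored in this sketch.
fn gcd(a: i64, b: i64) -> i64 {
    let (mut a, mut b) = (a.unsigned_abs(), b.unsigned_abs());
    while b != 0 {
        let r = a % b;
        a = b;
        b = r;
    }
    a as i64
}

/// Null-propagating gcd of two optional values.
fn gcd_opt(a: Option<i64>, b: Option<i64>) -> Option<i64> {
    match (a, b) {
        (Some(a), Some(b)) => Some(gcd(a, b)),
        _ => None,
    }
}

/// Dispatch on the shape of the inputs so a scalar is never expanded to an array.
fn gcd_columnar(lhs: &Columnar, rhs: &Columnar) -> Result<Columnar, String> {
    match (lhs, rhs) {
        // Scalar and scalar: a single gcd call, no array is built.
        (Columnar::Scalar(a), Columnar::Scalar(b)) => Ok(Columnar::Scalar(gcd_opt(*a, *b))),
        // Array and scalar (either order, gcd is commutative): reuse the scalar per row.
        (Columnar::Array(arr), Columnar::Scalar(s))
        | (Columnar::Scalar(s), Columnar::Array(arr)) => {
            Ok(Columnar::Array(arr.iter().map(|v| gcd_opt(*v, *s)).collect()))
        }
        // Array and array: row-wise, after checking that the lengths agree.
        (Columnar::Array(a), Columnar::Array(b)) => {
            if a.len() != b.len() {
                return Err(format!("length mismatch: {} vs {}", a.len(), b.len()));
            }
            Ok(Columnar::Array(
                a.iter().zip(b.iter()).map(|(x, y)| gcd_opt(*x, *y)).collect(),
            ))
        }
    }
}
```

In this model the scalar and scalar case does no allocation at all, which is consistent with that case showing the largest change in the benchmark below.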

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

jayzhan211 · 2025-02-23T06:22:00Z

Gnuplot not found, using plotters backend
Benchmarking gcd both array: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 7.5s, enable flat sampling, or reduce sample count to 50.
gcd both array          time:   [1.4799 ms 1.4826 ms 1.4853 ms]
                        change: [+1.1198% +1.3952% +1.6734%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

Benchmarking gcd array and scalar: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 8.9s, enable flat sampling, or reduce sample count to 50.
gcd array and scalar    time:   [1.7456 ms 1.7486 ms 1.7519 ms]
                        change: [-4.8528% -4.6518% -4.4540%] (p = 0.00 < 0.05)
                        Performance has improved.

gcd both scalar         time:   [23.373 ns 23.431 ns 23.487 ns]
                        change: [-93.342% -93.306% -93.272%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 18 outliers among 100 measurements (18.00%)
  5 (5.00%) low severe
  5 (5.00%) low mild
  5 (5.00%) high mild
  3 (3.00%) high severe

alamb

Thanks @jayzhan211 -- looks like a nice improvement to me

alamb · 2025-02-25T11:32:33Z

datafusion/functions/src/math/gcd.rs

+fn compute_gcd_for_arrays(a: &ArrayRef, b: &ArrayRef) -> Result<ColumnarValue> {
+    let result: Result<Int64Array> = a
+        .as_primitive::<Int64Type>()
+        .iter()
+        .zip(b.as_primitive::<Int64Type>().iter())
+        .map(|(a, b)| match (a, b) {
+            (Some(a), Some(b)) => Ok(Some(compute_gcd(a, b)?)),
+            _ => Ok(None),
+        })
+        .collect();
+
+    result.map(|arr| ColumnarValue::Array(Arc::new(arr) as ArrayRef))
+}
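The diff above calls `compute_gcd`, whose body is not part of this excerpt. As a rough sketch of what a fallible per-element gcd and the row-wise loop could look like without Arrow, assuming the only error case is a result that does not fit in `i64` (the error message and the `gcd_rows` helper are hypothetical):

```rust
/// Hypothetical sketch of a fallible gcd over i64.
/// Works on magnitudes, so the only failure is a result of 2^63,
/// which happens for inputs such as (i64::MIN, 0).
fn compute_gcd(x: i64, y: i64) -> Result<i64, String> {
    let mut a = x.unsigned_abs();
    let mut b = y.unsigned_abs();
    while b != 0 {
        let r = a % b;
        a = b;
        b = r;
    }
    i64::try_from(a).map_err(|_| format!("Signed integer overflow in GCD({x}, {y})"))
}

/// Row-wise gcd over two nullable columns, mirroring the iterator shape of the diff:
/// a null on either side yields a null, and the first error aborts the collect.
fn gcd_rows(a: &[Option<i64>], b: &[Option<i64>]) -> Result<Vec<Option<i64>>, String> {
    a.iter()
        .zip(b.iter())
        .map(|(a, b)| match (a, b) {
            (Some(a), Some(b)) => Ok(Some(compute_gcd(*a, *b)?)),
            _ => Ok(None),
        })
        .collect()
}
```

Note that `zip` stops at the shorter input, so this shape silently truncates when the lengths differ.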


I think you can use try_binary and make it even faster:

Suggested change

fn compute_gcd_for_arrays(a: &ArrayRef, b: &ArrayRef) -> Result<ColumnarValue> {
    let result: Result<Int64Array> = a
        .as_primitive::<Int64Type>()
        .iter()
        .zip(b.as_primitive::<Int64Type>().iter())
        .map(|(a, b)| match (a, b) {
            (Some(a), Some(b)) => Ok(Some(compute_gcd(a, b)?)),
            _ => Ok(None),
        })
        .collect();

    result.map(|arr| ColumnarValue::Array(Arc::new(arr) as ArrayRef))
}


I'll run some quick numbers with your new benchmark.

With my proposal, the two-array version seems to run about 25% faster than this PR. Here is a PR with that improvement:

Use try_binary to make gcd even faster jayzhan211/datafusion#5

Gnuplot not found, using plotters backend
Benchmarking gcd both array: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 5.9s, enable flat sampling, or reduce sample count to 60.
gcd both array          time:   [1.1940 ms 1.2007 ms 1.2081 ms]
                        change: [-27.574% -27.080% -26.470%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 9 outliers among 100 measurements (9.00%)
  6 (6.00%) high mild
  3 (3.00%) high severe

gcd array and scalar    time:   [1.9644 ms 1.9705 ms 1.9781 ms]
                        change: [-1.3293% -0.9077% -0.4670%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 6 outliers among 100 measurements (6.00%)
  3 (3.00%) high mild
  3 (3.00%) high severe

gcd both scalar         time:   [26.094 ns 26.253 ns 26.467 ns]
                        change: [-1.1099% -0.6987% -0.2362%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 8 outliers among 100 measurements (8.00%)
  4 (4.00%) high mild
  4 (4.00%) high severe
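For readers unfamiliar with the kernel, the idea behind a `try_binary`-style helper can be modelled without Arrow as below. This is a simplified sketch, not Arrow's implementation: `Int64Column`, `try_binary_model`, and `gcd_checked` are hypothetical names, and the validity mask is a plain `Vec<bool>` rather than a packed bitmap. The speedup presumably comes from working on the raw value buffers and combining the null masks once, instead of wrapping every element in an `Option` and collecting through an iterator.

```rust
/// Hypothetical, simplified model of a nullable Int64 column:
/// a dense value buffer plus a validity mask (true = not null).
#[derive(Debug, Clone, PartialEq)]
struct Int64Column {
    values: Vec<i64>,
    validity: Vec<bool>,
}

/// Fallible gcd used as the element-wise operation in this sketch.
fn gcd_checked(x: i64, y: i64) -> Result<i64, String> {
    let (mut a, mut b) = (x.unsigned_abs(), y.unsigned_abs());
    while b != 0 {
        let r = a % b;
        a = b;
        b = r;
    }
    i64::try_from(a).map_err(|_| format!("overflow in gcd({x}, {y})"))
}

/// Apply a fallible binary operation to two columns of equal length.
fn try_binary_model<F>(a: &Int64Column, b: &Int64Column, op: F) -> Result<Int64Column, String>
where
    F: Fn(i64, i64) -> Result<i64, String>,
{
    // The helper itself rejects mismatched lengths, so callers need no extra check.
    if a.values.len() != b.values.len() {
        return Err(format!(
            "cannot apply a binary operation to columns of different lengths ({} vs {})",
            a.values.len(),
            b.values.len()
        ));
    }
    // A row is valid only if both inputs are valid.
    let validity: Vec<bool> = a
        .validity
        .iter()
        .zip(b.validity.iter())
        .map(|(x, y)| *x && *y)
        .collect();
    // Null rows keep a placeholder 0 and never invoke `op`.
    let mut values = vec![0i64; a.values.len()];
    for (i, valid) in validity.iter().enumerate() {
        if *valid {
            values[i] = op(a.values[i], b.values[i])?;
        }
    }
    Ok(Int64Column { values, validity })
}
```

Because the length check lives inside the helper, a caller such as `try_binary_model(&a, &b, gcd_checked)` does not need to repeat it.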

Use try_binary to make gcd even faster

jayzhan211 · 2025-02-25T12:08:05Z

datafusion/functions/src/math/gcd.rs

+            b.len()
+        );
+    }
+    try_binary(a, b, compute_gcd)


`try_binary` already does the length check, so the explicit check here is redundant.

jayzhan211 · 2025-02-25T14:39:53Z

Thanks @alamb

optimize gcd

ed1d0cd

github-actions bot added the sqllogictest (SQL Logic Tests (.slt)) and functions (Changes to functions implementation) labels Feb 23, 2025

jayzhan211 marked this pull request as draft February 23, 2025 06:24

jayzhan211 added 2 commits February 23, 2025 14:27

fmt

78d4753

add feature

1394240

jayzhan211 mentioned this pull request Feb 23, 2025

Review the need of make_scalar_function for functions #14835

Open

jayzhan211 marked this pull request as ready for review February 23, 2025 07:14

alamb approved these changes Feb 25, 2025

Use try_binary to make gcd even faster

e8d36eb

alamb mentioned this pull request Feb 25, 2025

Use try_binary to make gcd even faster jayzhan211/datafusion#5

Merged

Merge pull request #5 from alamb/alamb/opt-gcd-moar

1caec80

Use try_binary to make gcd even faster

jayzhan211 commented Feb 25, 2025

rm length check

7e28c67

jayzhan211 merged commit f1f6e5e into apache:main Feb 25, 2025
26 checks passed

jayzhan211 deleted the opt-gcd branch February 25, 2025 14:39
