PERF: axis=1 reductions with EA dtypes #54341
Conversation
Wow - this is quite clever! On my
Also, for wide inputs, there appears to be no slowdown
Edit: I also ran ASVs for
isn't that showing a huge change, seconds -> milliseconds? Edit: nevermind, I see the "no change" comment was for
Right - I wasn't sure where axis=1 ops might show up (haven't thoroughly looked through the benchmarks); these seemed to be the most likely places.
if how in ["any", "all"] and isinstance(values, SparseArray):
    pass
It looks like this is a general issue with SparseArray any/all so really independent of this PR, is that right? I'm thinking this should be fixed in SparseArray itself rather than in groupby code. Would it be okay to xfail any relevant tests?
Ah, I see the issue with my request above; this would make axis=1 fail for SparseArray whereas it didn't before. I would be okay opening up an issue.
sure - by opening an issue, do you mean xfail for now as part of this PR? or open an issue and address that first before this?
Leave this PR as-is; open an issue to clean up after this is merged (before is okay too).
ah - got it. thanks
pandas/core/frame.py
Outdated
nrows, ncols = df.shape
row_index = np.tile(np.arange(nrows), ncols)
col_index = np.repeat(np.arange(ncols), nrows)
ser = Series(arr, index=col_index)
Can you add comments on what is going on here? I think I get it, but it could be non-obvious to someone who doesn't know e.g. groupby.agg.
updated with comments
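For readers following along, here is a minimal standalone sketch of the trick those comments describe (my own toy example with made-up names like `flat` and `row_labels`, not the PR's code): flatten the EA-backed frame column by column into one 1D Series, label every element with the row it came from, and let a groupby reduction over those labels produce the axis=1 result.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6], "c": [7, 8, 9]}, dtype="Int64")
nrows, ncols = df.shape

# Column-major concatenation keeps the data in a single extension array, so the
# EA's native reduction kernels are reused instead of falling back to object dtype.
flat = pd.concat([df[col] for col in df.columns], ignore_index=True)

# Element i of `flat` came from row i % nrows, so tiling the row positions gives
# one group label per element; grouping by them reduces across columns.
row_labels = np.tile(np.arange(nrows), ncols)
result = flat.groupby(row_labels).sum()
result.index = df.index

assert list(result) == list(df.sum(axis=1))  # [12, 15, 18]
```

The real implementation works on the frame's internal arrays rather than per-column selections, but the reshape-then-groupby idea is the same.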
How would this compare to using the reduce_axis_1 approach? I think that would avoid copies/allocations. Agreed this is clever, but if there's a way to avoid depending on the groupby code I'd prefer it.
I might be missing something, but my read of
@lukemanley - the rough idea of @jbrockmendel - I think looking into
The additional copy required to construct the 1D array would be nice to avoid. I took a quick look at using

Somewhat surprisingly, with this current PR,
median comes to mind. Are there others? At some point for #53261 I'm hoping to implement chunked-friendly implementations of the groupby reductions. I think that, combined with the trick in this PR, could get the best of both worlds, avoiding the copy.
@@ -11083,6 +11083,25 @@ def _get_data() -> DataFrame:
).iloc[:0]
result.index = df.index
return result

if df.shape[1] and name != "kurt":
IIUC kurt is excluded here because GroupBy doesn't support it. Can you comment to that effect?
added a comment here
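For context on the guard being discussed, a small hedged illustration (toy data; the behavior noted in the comments is version-dependent):

```python
import pandas as pd

ser = pd.Series([1.0, 2.0, 3.0, 4.0], dtype="Float64")
labels = [0, 0, 1, 1]

# Reductions like sum/mean/min/max have GroupBy counterparts the fastpath
# can dispatch to:
print(ser.groupby(labels).sum())

# kurt did not (at the time of this PR), which is why the guard
# `name != "kurt"` keeps kurt on the pre-existing axis=1 path.
print(hasattr(ser.groupby(labels), "kurt"))  # False at the time of this PR; newer pandas may add it
```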
It looks like just nunique and quantile.
It appears as if axis=0 has more overhead; I don't believe the results hold up for a size of e.g. (10000, 10000).
That seems reasonable. I'd be interested in seeing profiling results though!
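For anyone wanting to poke at this locally, a profiling sketch along these lines would do (my own suggestion; the shape and dtype are assumptions, not what was actually measured):

```python
import cProfile
import pstats

import numpy as np
import pandas as pd

# Assumed setup: a tall masked-float frame; adjust the shape to taste.
df = pd.DataFrame(np.random.randn(100_000, 30)).astype("Float64")

profiler = cProfile.Profile()
profiler.enable()
df.sum(axis=1)
profiler.disable()

# Show where the time goes in the axis=1 reduction.
pstats.Stats(profiler).sort_stats("cumulative").print_stats(15)
```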
On my slower laptop now:
lgtm; cc @jbrockmendel
I would be okay with saying this closes the last two of the three issues mentioned in the OP, leaving the
pandas/core/frame.py
Outdated
nrows, ncols = df.shape
row_index = np.tile(np.arange(nrows), ncols)
col_index = np.repeat(np.arange(ncols), nrows)
ser = Series(arr, index=col_index)
Can you do copy=False here for CoW?
updated. thanks
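For reference, the suggested change is just threading `copy=False` through the Series constructor so the already-materialized 1D array is wrapped rather than defensively copied under Copy-on-Write; a minimal sketch with toy stand-ins for `arr` and `col_index`:

```python
import numpy as np
import pandas as pd

arr = pd.array([1, 2, 3, 4, 5, 6], dtype="Int64")  # stand-in for the concatenated columns
col_index = np.repeat(np.arange(2), 3)              # stand-in for the column labels

# copy=False lets the Series reference `arr` directly instead of making a
# defensive copy when Copy-on-Write is enabled.
ser = pd.Series(arr, index=col_index, copy=False)
```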
@lukemanley - there's a whatsnew conflict here
Thanks @lukemanley - very nice!
…types) (#54528) Backport PR #54341: PERF: axis=1 reductions with EA dtypes. Co-authored-by: Luke Manley <[email protected]>
Yeah, very nice indeed 👍.
DataFrame.all(axis="columns") orders of magnitude slower for bool[pyarrow] #54389
Added an entry in the latest doc/source/whatsnew/v2.1.0.rst file if fixing a bug or adding a new feature.

cc @rhshadrach - this includes the test you wrote in #51923 (with a few edits) as it looks like axis=1 reductions with EA dtypes are not well tested.
This PR avoids special-casing the internal EA types (although special-casing might allow for further optimization).
xref: #51474
Timings on a slow laptop:
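A hypothetical script for producing this kind of measurement (the shape, dtype, and reduction are my assumptions, not necessarily what was benchmarked):

```python
import timeit

import numpy as np
import pandas as pd

# Assumed benchmark frame: EA-backed floats, moderately wide.
df = pd.DataFrame(np.random.randn(10_000, 100)).astype("Float64")

n = 10
elapsed = timeit.timeit(lambda: df.sum(axis=1), number=n)
print(f"sum(axis=1): {elapsed / n * 1000:.2f} ms per call")
```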