PERF: axis=1 reductions with EA dtypes #54341
Conversation
Wow - this is quite clever! On my
Also, for wide inputs, there appears to be no slowdown
Edit: I also ran ASVs for
isn't that showing a huge change, seconds -> milliseconds? Edit: nevermind, I see the "no change" comment was for
Right - I wasn't sure where axis=1 ops might show up (haven't thoroughly looked through the benchmarks); these seemed to be the most likely places.
if how in ["any", "all"] and isinstance(values, SparseArray):
    pass
It looks like this is a general issue with SparseArray any/all so really independent of this PR, is that right? I'm thinking this should be fixed in SparseArray itself rather than in groupby code. Would it be okay to xfail any relevant tests?
Ah, I see the issue with my request above; this would make axis=1 fail for SparseArray whereas it didn't before. I would be okay opening up an issue.
sure - by opening an issue, do you mean xfail for now as part of this PR? or open an issue and address that first before this?
Leave this PR as-is; open an issue to clean up after this is merged (before is okay too).
ah - got it. thanks
pandas/core/frame.py
Outdated
nrows, ncols = df.shape
row_index = np.tile(np.arange(nrows), ncols)
col_index = np.repeat(np.arange(ncols), nrows)
ser = Series(arr, index=col_index)
Can you add comments on what is going on here? I think I get it, but it could be non-obvious to someone who doesn't know e.g. groupby.agg.
updated with comments
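For readers following along, here is a minimal standalone sketch of the trick those comments describe (my own toy example with made-up names like `flat` and `row_labels`, not the PR's code): flatten the EA-backed frame column by column into one 1D Series, label every element with the row it came from, and let a groupby reduction over those labels produce the axis=1 result.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6], "c": [7, 8, 9]}, dtype="Int64")
nrows, ncols = df.shape

# Column-major concatenation keeps the data in a single extension array, so the
# EA's native reduction kernels are reused instead of falling back to object dtype.
flat = pd.concat([df[col] for col in df.columns], ignore_index=True)

# Element i of `flat` came from row i % nrows, so tiling the row positions gives
# one group label per element; grouping by them reduces across columns.
row_labels = np.tile(np.arange(nrows), ncols)
result = flat.groupby(row_labels).sum()
result.index = df.index

assert list(result) == list(df.sum(axis=1))  # [12, 15, 18]
```

The real implementation works on the frame's internal arrays rather than per-column selections, but the reshape-then-groupby idea is the same.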
How would this compare to using the reduce_axis_1 approach? I think that would avoid copies/allocations. Agreed this is clever, but if there's a way to avoid depending on the groupby code I'd prefer it.
I might be missing something, but my read of
@lukemanley - the rough idea of @jbrockmendel - I think looking into
The additional copy required to construct the 1D array would be nice to avoid. I took a quick look at using

Somewhat surprisingly, with this current PR,
median comes to mind. Are there others? At some point for #53261 I'm hoping to implement chunked-friendly implementations of the groupby reductions. I think that, combined with the trick in this PR, could get the best of both worlds, avoiding the copy.
@@ -11083,6 +11083,25 @@ def _get_data() -> DataFrame:
).iloc[:0]
result.index = df.index
return result

if df.shape[1] and name != "kurt":
IIUC kurt is excluded here because GroupBy doesn't support it. Can you comment to that effect?
added a comment here
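For context on the guard being discussed, a small hedged illustration (toy data; the behavior noted in the comments is version-dependent):

```python
import pandas as pd

ser = pd.Series([1.0, 2.0, 3.0, 4.0], dtype="Float64")
labels = [0, 0, 1, 1]

# Reductions like sum/mean/min/max have GroupBy counterparts the fastpath
# can dispatch to:
print(ser.groupby(labels).sum())

# kurt did not (at the time of this PR), which is why the guard
# `name != "kurt"` keeps kurt on the pre-existing axis=1 path.
print(hasattr(ser.groupby(labels), "kurt"))  # False at the time of this PR; newer pandas may add it
```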
It looks like just nunique and quantile.
It appears as if axis=0 has more overhead; I don't believe the results hold up for a size of e.g. (10000, 10000).
That seems reasonable. I'd be interested in seeing profiling results though!
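For anyone wanting to poke at this locally, a profiling sketch along these lines would do (my own suggestion; the shape and dtype are assumptions, not what was actually measured):

```python
import cProfile
import pstats

import numpy as np
import pandas as pd

# Assumed setup: a tall masked-float frame; adjust the shape to taste.
df = pd.DataFrame(np.random.randn(100_000, 30)).astype("Float64")

profiler = cProfile.Profile()
profiler.enable()
df.sum(axis=1)
profiler.disable()

# Show where the time goes in the axis=1 reduction.
pstats.Stats(profiler).sort_stats("cumulative").print_stats(15)
```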
On my slower laptop now:
lgtm; cc @jbrockmendel
I would be okay with saying this closes the last two of the three issues mentioned in the OP, leaving the
pandas/core/frame.py
Outdated
nrows, ncols = df.shape
row_index = np.tile(np.arange(nrows), ncols)
col_index = np.repeat(np.arange(ncols), nrows)
ser = Series(arr, index=col_index)
Can you do copy=False here for CoW?
updated. thanks
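For reference, the suggested change is just threading `copy=False` through the Series constructor so the already-materialized 1D array is wrapped rather than defensively copied under Copy-on-Write; a minimal sketch with toy stand-ins for `arr` and `col_index`:

```python
import numpy as np
import pandas as pd

arr = pd.array([1, 2, 3, 4, 5, 6], dtype="Int64")  # stand-in for the concatenated columns
col_index = np.repeat(np.arange(2), 3)              # stand-in for the column labels

# copy=False lets the Series reference `arr` directly instead of making a
# defensive copy when Copy-on-Write is enabled.
ser = pd.Series(arr, index=col_index, copy=False)
```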
@lukemanley - there's a whatsnew conflict here
Thanks @lukemanley - very nice!
…types) (#54528) Backport PR #54341: PERF: axis=1 reductions with EA dtypes. Co-authored-by: Luke Manley <[email protected]>
Yeah, very nice indeed 👍.
DataFrame.all(axis="columns") orders of magnitude slower for bool[pyarrow] #54389
Added an entry in the latest doc/source/whatsnew/v2.1.0.rst file if fixing a bug or adding a new feature.

cc @rhshadrach - this includes the test you wrote in #51923 (with a few edits) as it looks like axis=1 reductions with EA dtypes are not well tested.
This PR avoids special-casing the internal EA types (although special-casing might allow for further optimization).
xref: #51474
Timings on a slow laptop:
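A hypothetical script for producing this kind of measurement (the shape, dtype, and reduction are my assumptions, not necessarily what was benchmarked):

```python
import timeit

import numpy as np
import pandas as pd

# Assumed benchmark frame: EA-backed floats, moderately wide.
df = pd.DataFrame(np.random.randn(10_000, 100)).astype("Float64")

n = 10
elapsed = timeit.timeit(lambda: df.sum(axis=1), number=n)
print(f"sum(axis=1): {elapsed / n * 1000:.2f} ms per call")
```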