We have reductions implemented in nanops, _libs.groupby, and _libs.window.aggregations. We should refactor these with the following goals in mind:
1) Have one/fewer distinct implementations
2) Avoid copies, particularly in the nanops versions where we do something like values[notna(values)]
3) Chunked-friendliness, so that we can re-write ArrowExtensionArray._groupby_op to operate chunk-by-chunk, avoiding a copy in multi-chunk cases. (This could also be useful for hypothetical distributed EAs)
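To illustrate the copy concern in the second goal, here is a minimal sketch in plain NumPy rather than the actual nanops code. It assumes a float64 array and uses np.isnan as a stand-in for notna; the function names are hypothetical. Boolean indexing materializes a new array of the non-missing values, while passing the mask through the `where` argument of np.sum only allocates the boolean mask.

```python
import numpy as np


def nansum_with_copy(values: np.ndarray) -> float:
    # Boolean indexing allocates a new float array holding the kept values.
    return float(values[~np.isnan(values)].sum())


def nansum_masked(values: np.ndarray) -> float:
    # Only the boolean mask is allocated; the reduction skips masked-out
    # positions instead of building a filtered copy of the data.
    mask = ~np.isnan(values)
    return float(np.sum(values, where=mask))


values = np.array([1.0, np.nan, 2.5, np.nan, 4.0])
copied = nansum_with_copy(values)
masked = nansum_masked(values)
```

Both functions return the same result; the difference is only in the temporary memory they need. A Cython kernel could go further and skip missing values inside the loop without allocating a mask at all.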
The implementation of group_skew is derived from https://www.johndcook.com/blog/skewness_kurtosis/ which includes a method for "adding" multiple RunningStats instances. Something like that could be adapted for 3).
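As a rough sketch of what "adding" partial results could look like for the chunked case, the following follows the combination formulas from the linked post, truncated at the third moment. The `Moments` class and its method names are hypothetical and not part of pandas, the per-chunk pass uses NumPy for brevity (including a filtering copy that a real kernel would avoid), and `skew` returns the biased estimate, without the bias correction a library may apply.

```python
import math
from dataclasses import dataclass

import numpy as np


@dataclass
class Moments:
    """Hypothetical mergeable state: count, mean, and 2nd/3rd central sums."""

    n: int = 0
    mean: float = 0.0
    m2: float = 0.0  # sum of (x - mean) ** 2
    m3: float = 0.0  # sum of (x - mean) ** 3

    @classmethod
    def from_chunk(cls, values) -> "Moments":
        arr = np.asarray(values, dtype="float64")
        arr = arr[~np.isnan(arr)]  # per-chunk filter, kept simple for the sketch
        if arr.size == 0:
            return cls()
        mean = float(arr.mean())
        dev = arr - mean
        return cls(int(arr.size), mean, float((dev**2).sum()), float((dev**3).sum()))

    def merge(self, other: "Moments") -> "Moments":
        # Combine two partial states without revisiting the underlying data.
        if self.n == 0:
            return other
        if other.n == 0:
            return self
        n = self.n + other.n
        delta = other.mean - self.mean
        mean = (self.n * self.mean + other.n * other.mean) / n
        m2 = self.m2 + other.m2 + delta**2 * self.n * other.n / n
        m3 = (
            self.m3
            + other.m3
            + delta**3 * self.n * other.n * (self.n - other.n) / n**2
            + 3.0 * delta * (self.n * other.m2 - other.n * self.m2) / n
        )
        return Moments(n, mean, m2, m3)

    def skew(self) -> float:
        # Biased skewness estimate: sqrt(n) * M3 / M2 ** 1.5.
        if self.n < 3 or self.m2 == 0:
            return math.nan
        return math.sqrt(self.n) * self.m3 / self.m2**1.5


chunks = [[1.0, 2.0, np.nan, 4.0], [10.0, 3.0], [np.nan], [7.5, 0.5]]
merged = Moments()
for chunk in chunks:
    merged = merged.merge(Moments.from_chunk(chunk))

single_pass = Moments.from_chunk(np.concatenate(chunks))
```

Merging the per-chunk states agrees with a single pass over the concatenated data up to floating-point error, which is the property a chunk-by-chunk _groupby_op would rely on (one such state per group).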
Just noting that _libs.window.aggregations should ideally keep its implementation, since the sliding window aggregation is performance-sensitive. I think the other reductions could be implemented in terms of the sliding window aggregation, i.e. as non-overlapping windows.
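A minimal sketch of that idea, under stated assumptions: `window_sum` below is a hypothetical pure-Python stand-in for a window kernel that takes per-window start/end bounds, not the actual _libs.window.aggregations function, and the group codes are assumed to be already sorted. A groupby reduction then becomes one window per group, and a full reduction becomes a single window covering the whole array.

```python
import numpy as np


def window_sum(values, start, end):
    """Hypothetical kernel: NaN-skipping sum over values[start[i]:end[i]]."""
    out = np.empty(len(start), dtype="float64")
    for i, (s, e) in enumerate(zip(start, end)):
        total = 0.0
        seen = 0
        for j in range(s, e):
            v = values[j]
            if v == v:  # False only for NaN
                total += v
                seen += 1
        # Assumption: a window with no valid observations yields NaN.
        out[i] = total if seen else np.nan
    return out


def group_bounds(sorted_codes, ngroups):
    # For sorted codes, each group is a contiguous, non-overlapping window.
    groups = np.arange(ngroups)
    start = np.searchsorted(sorted_codes, groups, side="left")
    end = np.searchsorted(sorted_codes, groups, side="right")
    return start, end


values = np.array([1.0, np.nan, 2.0, 5.0, 7.0, np.nan])
codes = np.array([0, 0, 0, 1, 1, 2])

start, end = group_bounds(codes, ngroups=3)
group_sums = window_sum(values, start, end)

# A plain reduction is the special case of one window over everything.
full_sum = window_sum(values, np.array([0]), np.array([len(values)]))
```

Unsorted group codes would need either a sort first or a different way of describing the windows, which is one of the design questions this approach would have to answer.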