ENH: extension ops #21191
Codecov Report
@@ Coverage Diff @@
## master #21191 +/- ##
==========================================
- Coverage 91.84% 91.77% -0.08%
==========================================
Files 153 153
Lines 49543 49618 +75
==========================================
+ Hits 45504 45538 +34
- Misses 4039 4080 +41
pandas/conftest.py
Outdated
"""
Fixture for dunder names for common compare operations
"""
indentation
cls.__div__ = cls._make_arithmetic_op(operator.div)
cls.__rdiv__ = cls._make_arithmetic_op(ops.rdiv)

cls.__divmod__ = cls._make_arithmetic_op(divmod)
Should __rdivmod__ be defined here as well?
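For context on the question above: Python's built-in divmod(a, b) first tries a.__divmod__(b) and only falls back to b.__rdivmod__(a) when that returns NotImplemented, so a class that defines only __divmod__ cannot sit on the right-hand side of divmod with a plain scalar on the left. A minimal sketch, assuming a hypothetical list-backed Boxed class (not pandas code):

```python
class Boxed:
    """Hypothetical list-backed container, used only to illustrate the protocol."""

    def __init__(self, values):
        self.values = list(values)

    def __divmod__(self, other):
        # Called for divmod(Boxed, scalar)
        if not isinstance(other, int):
            return NotImplemented
        pairs = [divmod(v, other) for v in self.values]
        return Boxed(q for q, _ in pairs), Boxed(r for _, r in pairs)

    def __rdivmod__(self, other):
        # Called for divmod(scalar, Boxed), after int.__divmod__
        # has returned NotImplemented for the Boxed operand
        if not isinstance(other, int):
            return NotImplemented
        pairs = [divmod(other, v) for v in self.values]
        return Boxed(q for q, _ in pairs), Boxed(r for _, r in pairs)


quot, rem = divmod(Boxed([7, 8, 9]), 2)   # uses __divmod__
rquot, rrem = divmod(10, Boxed([3, 4]))   # uses __rdivmod__
```

Without __rdivmod__, the second call would raise a TypeError, which is presumably the gap the question points at.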
pandas/core/ops.py
Outdated
@@ -1025,6 +1035,7 @@ def na_op(x, y):
return result

def safe_na_op(lvalues, rvalues):
# all others
Can you flesh out this comment?
is_extension_array_dtype(y) and not
is_scalar(y)):
y = x.__class__._from_sequence(y)
return op(x, y)
Is it really necessary to put this in the closure? It seems like this case could be handled directly on line 1061.
The way the reversed (r) ops are defined is not compatible with some types. Essentially, rop(et, ndarray) will fail for some types: e.g. if the ndarray is datetime64[ns] and et is integer-like, the precedence is wrong (IOW we need to defer to the op here).
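One possible reading of the precedence concern, sketched in plain Python rather than with pandas or NumPy: a reversed helper written as "right - left" makes the other operand the left-hand side, so that operand's own method gets first say and the extension type's reflected method is never consulted. Every name below (rsub, Greedy, Ext) is a hypothetical stand-in and not taken from the PR.

```python
def rsub(left, right):
    # Hypothetical reversed helper in the style of `lambda x, y: y - x`
    return right - left


class Greedy:
    """Stand-in for an array type that tries to handle every operand itself."""

    def __init__(self, values):
        self.values = list(values)

    def __sub__(self, other):
        # Raises instead of returning NotImplemented, so Python never
        # falls back to other.__rsub__
        raise TypeError("Greedy cannot subtract " + type(other).__name__)


class Ext:
    """Stand-in for an extension type that knows how to handle Greedy."""

    def __init__(self, values):
        self.values = list(values)

    def __rsub__(self, other):
        # Computes other - self elementwise
        return Ext(o - s for o, s in zip(other.values, self.values))


greedy = Greedy([10, 20])
ext = Ext([1, 2])

try:
    rsub(ext, greedy)          # evaluates greedy - ext; Greedy.__sub__ raises
    reversed_failed = False
except TypeError:
    reversed_failed = True

direct = ext.__rsub__(greedy)  # deferring to the extension type's own method works
```

In this toy setup the generic reversed helper fails while the extension type's own method succeeds, which is the shape of the failure described in the comment.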
I think I understand what you're saying, but that isn't what I'm getting at. Let me try to rephrase:
In wrapper below, there is currently (in the PR)
elif (is_extension_array_dtype(left) or
        is_extension_array_dtype(right)):
    pass
I'm suggesting that instead of pass there, the code above (currently 1006-1015) could go there. The main reason why this might not be appropriate is if the try/except in safe_na_op is relevant for this case.
# very well defined
elif (is_extension_array_dtype(x) or
        is_extension_array_dtype(y)):
    return op(x, y)
If there's a viable option to move this out of the closure to ~ line 1218, that would be more in line with the goals we've been pursuing in this module.
Note that categorical is already handled here; this is just another case. I'm not sure how we could dispatch on this.
Yah, categorical has a few characteristics that make it a difficult case in this function. I think you're right that (for now at least) part of this is in the right place. The thing to do here is to:
- take the is_extension_array_dtype(x) part of this condition and put it down on L1231 (right after the is_categorical_dtype(self) block) and mimic the categorical_dtype case there
- keep the is_extension_array_dtype(y) case as is here (should the op be reversed?)
- check if this needs a "and not is_scalar(y)" like the categorical above for some corner case
- consider merging the categorical_dtype blocks into this more general case
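The first bullet above can be sketched roughly as follows. This is an illustration under stated assumptions, not the pandas implementation: ToyExtensionArray, is_toy_extension, generic_path, and this wrapper are hypothetical stand-ins for the extension array, the dtype check, the closure-based path, and the real wrapper.

```python
import operator


class ToyExtensionArray:
    """Hypothetical stand-in for an extension array; not the pandas class."""

    def __init__(self, values):
        self.values = list(values)

    def __add__(self, other):
        if isinstance(other, ToyExtensionArray):
            other_vals = other.values
        else:
            # Broadcast a scalar across all elements
            other_vals = [other] * len(self.values)
        return ToyExtensionArray(a + b for a, b in zip(self.values, other_vals))


def is_toy_extension(obj):
    # Stand-in for a dtype check such as is_extension_array_dtype
    return isinstance(obj, ToyExtensionArray)


def generic_path(op, left, right):
    # Stand-in for the closure-based path used for ordinary values
    return [op(a, right) for a in left]


def wrapper(op, left, right):
    # The left-hand extension check happens up front, mirroring how the
    # categorical case is handled, instead of inside the closure
    if is_toy_extension(left):
        return op(left, right)
    return generic_path(op, left, right)


ext_result = wrapper(operator.add, ToyExtensionArray([1, 2]), 3)
plain_result = wrapper(operator.add, [1, 2], 3)
```

The design point is only about where the check lives: once the left operand is known to be an extension type, the operation is handed to that type directly and the generic path never sees it.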
Looks reasonable, with some comments on organization in
Force-pushed from 338ea4d to f6d5386
if isinstance(right, ABCDataFrame):
    return NotImplemented

left, right = _align_method_SERIES(left, right)
res_name = get_op_result_name(left, right)

if is_datetime64_dtype(left) or is_datetime64tz_dtype(left):
if is_categorical_dtype(left):
This will be made cleaner if/when we implement these ops in Categorical itself (they'll just raise the TypeError below)
@jbrockmendel note I haven't actually changed this much (IOW for your comments) yet.
yes
Going to include this directly in #21160, as it is much smaller now.
builds on #21185 for extension array ops
precursor to #21160