ENH: Support ExtensionArray operators via a mixin #21261
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,2 +1,2 @@ | ||
from .base import ExtensionArray # noqa | ||
from .base import ExtensionArray, ExtensionOpsMixin # noqa | ||
from .categorical import Categorical # noqa |
Original file line number | Diff line number | Diff line change | ||
---|---|---|---|---|
|
@@ -9,6 +9,11 @@ | |||
|
||||
from pandas.errors import AbstractMethodError | ||||
from pandas.compat.numpy import function as nv | ||||
from pandas.compat import set_function_name, PY3 | ||||
import pandas.core.common as com | ||||
from pandas.core.dtypes.common import ( | ||||
is_extension_array_dtype, | ||||
is_list_like) | ||||
|
||||
_not_implemented_message = "{} does not implement {}." | ||||
|
||||
|
@@ -610,3 +615,91 @@ def _ndarray_values(self): | |||
used for interacting with our indexers. | ||||
""" | ||||
return np.array(self) | ||||
|
||||
|
||||
def ExtensionOpsMixin(include_arith_ops, include_logic_ops): | ||||
[Review comment] Is there a compelling reason to make a factory for this instead of just making two mixin classes?
[Review comment] @jbrockmendel I am sharing the code of the operator between arithmetic and comparison operators. Alternatively, I could just create a base mixin class, and have If you guys want me to make that change, let me know.
[Review comment] @jbrockmendel I implemented your suggestion. It does look cleaner. It's in the latest commit.
||||
"""A mixin factory for creating default arithmetic and logical operators, | ||||
which are based on the underlying dtype backing the ExtensionArray | ||||
|
||||
Parameters | ||||
---------- | ||||
include_arith_ops : boolean indicating whether arithmetic ops should be | ||||
created | ||||
include_logic_ops : boolean indicating whether logical ops should be | ||||
created | ||||
|
||||
Returns | ||||
------- | ||||
A mixin class that has the associated operators defined. | ||||
|
||||
Usage | ||||
------ | ||||
If you have defined a subclass MyClass(ExtensionArray), then | ||||
use MyClass(ExtensionArray, ExtensionOpsMixin(True, True)) to | ||||
get both the arithmetic and logical operators | ||||
""" | ||||
class _ExtensionOpsMixin(object): | ||||
pass | ||||
|
||||
def create_method(op_name): | ||||
def _binop(self, other): | ||||
def convert_values(parm): | ||||
[Review comment] Minor detail, but I find 'values' (or 'param') is clearer than 'parm'.
[Review comment] Changed
||||
if isinstance(parm, ExtensionArray): | ||||
ovalues = list(parm) | ||||
[Review comment] Why is it needed to convert to a list? ExtensionArrays support iterating through them (which returns the scalars).
[Review comment] Fixed
||||
elif is_extension_array_dtype(parm): | ||||
[Review comment] A Series is also 'list-like', so I think this
[Review comment] Done
||||
ovalues = parm.values | ||||
elif is_list_like(parm): | ||||
ovalues = parm | ||||
else: # Assume its an object | ||||
ovalues = [parm] * len(self) | ||||
return ovalues | ||||
lvalues = convert_values(self) | ||||
[Review comment] This should not be needed, as the calling object will always already be an ExtensionArray?
[Review comment] Yes. Fixed
||||
rvalues = convert_values(other) | ||||
[Review comment] This should probably do alignment as well?
[Review comment] It doesn't need to, for a few reasons:
[Review comment] Ah, OK, this is also tested?
[Review comment] @jorisvandenbossche Yes, this is tested because I am using the same tests that are used for general operators on Series, which tests things that are misaligned, etc. See As an aside, doing it this way uncovered the issues with
||||
|
||||
# Get the method for each object. | ||||
def callfunc(a, b): | ||||
f = getattr(a, op_name, None) | ||||
if f is not None: | ||||
return f(b) | ||||
else: | ||||
return NotImplemented | ||||
res = [callfunc(a, b) for (a, b) in zip(lvalues, rvalues)] | ||||
[Review comment] If we would do
[Review comment] similar as pandas/pandas/core/indexes/category.py Line 793 in c85ab08
[Review comment] The check for Not sure what you are referring to when you wrote "similar as" the code in `pandas/core/indexes/category.py`. That implementation is not doing things element by element, which is what I'm doing here.
[Review comment] Why are you doing it this way, dynamically generating this? There is a limited set of ops, just write them out.
[Review comment] Yes, but if you do
I simply referenced it because it does something very similar (generating the comparison methods), and does this by using the So what I mean is the following. You basically do
which translates to doing
which translates to
[Review comment] @jreback are you speaking about this
[Review comment] is what I mean.
[Review comment] @jreback This is a bit of the "what comes first" discussion held at #20889 (comment) with @jorisvandenbossche. My goal with the mixin is that if you already have the In 21d76b3 you are creating the basis for a more general operator support. I could make the mixin use the more general framework you've proposed, but I guess that should be merged in before I do the mixin. (That's the "what comes first" issue.)
[Review comment] @jorisvandenbossche My latest commit does as you suggest. This removes the tests for @jreback My latest commit uses a similar pattern to what you had in the referenced commit, just creating an operator one by one instead of using a loop.
[Review comment] Orthogonal to @jorisvandenbossche's comment, instead of
[Review comment] That doesn't seem to work properly?
[Review comment] Ah, that's because it does a
[Review comment] Ahh, I guess that explains the "bin" part of "vec_binop". So it looks like a) the original suggestion is incorrect and b) it would probably be easy to implement and may improve perf.
[Review comment] I thought "binary" meant that it is an op between 2 values, not that the result is boolean. So I think the naming is just confusing.
[Review comment] Agreed on both counts.
||||
|
||||
# We can't use (NotImplemented in res) because the | ||||
# results might be objects that have overridden __eq__ | ||||
if any(isinstance(r, type(NotImplemented)) for r in res): | ||||
[Review comment] I don't understand the comment above. Is there a scenario in which
[Review comment] My comment was about using But now I have removed that in my working version based on @jorisvandenbossche's suggestion to use the op rather than the op_name.
||||
msg = "invalid operation {opn} between {one} and {two}" | ||||
raise TypeError(msg.format(opn=op_name, | ||||
[Review comment] Personally I would not bother catching the error.
which is already a nice error (and IMO even clearer, because it does not contain references to "Extension", which normal users are not necessarily familiar with).
[Review comment] Good point. Fixed.
||||
one=type(lvalues), | ||||
two=type(rvalues))) | ||||
|
||||
res_values = com._values_from_object(res) | ||||
[Review comment] I don't think this does anything for a list (it just passes it through), so this can be removed.
[Review comment] OK, I removed it and things still work. Had copied this from elsewhere.
||||
|
||||
try: | ||||
res_values = self._from_sequence(res_values) | ||||
except TypeError: | ||||
[Review comment] Would it make sense to have a separate
[Review comment] I see your point. But if someone else had the same use case as me, where the comparison ops need to return an object rather than a boolean, then they'd have to know the workaround you suggest, and so then the question is whether we document it. I'm leaving this as is for now, but will accept if you really want me to change it.
[Review comment] One reason to split it up would be to avoid the extra overhead of trying to convert it to an ExtensionArray in the case of the boolean ops (although for a good implementation this probably should not be too much overhead).
[Review comment] OK, I've done the split for the next commit. I added a parameter to
||||
pass | ||||
|
||||
return res_values | ||||
|
||||
name = '__{name}__'.format(name=op_name) | ||||
[Review comment] Why are you defining create_method as anything other than NotImplementedError? This is way too opinionated. If you want to define a function to do this and an author can use this, ok. Maybe if you want to create an ExtensionOpsMixinForScalars (long name though).
[Review comment] Please take a closer look at the diff if you give such an opinionated (pun intended :-)) comment
[Review comment] @jorisvandenbossche maybe I wasn't clear.
I think the ops implementation is so important that this must be an explicit choice. The point of the mixin is to avoid the boilerplate of writing
[Review comment] Again, this is what this PR is doing.
[Review comment] @jreback @jorisvandenbossche Maybe the naming is misleading. So let me clarify the intent of the PR. The goal here is that if someone extends An example is The use of these 3 classes ( __add__ = ExtensionOpsBase.create_method(operator.add) IMHO, it seems that @jorisvandenbossche understands what I'm trying to do, but @jreback may not fully understand it. If you think there is a better name for
[Review comment] @jorisvandenbossche I've tried a variety of implementations without the base class, and couldn't get it to work. I'll rename as you suggest.
[Review comment] Renaming looks ok to me.
[Review comment] Next commit will use the renaming.
[Review comment] Agreed on the naming conventions. Maybe the NotImplemented versions of the methods belong on the base
[Review comment] @jbrockmendel The NotImplemented versions are there by default for any class, so I don't think we need to do anything. In other words, if you write a class
||||
return set_function_name(_binop, name, _ExtensionOpsMixin) | ||||
|
||||
if include_arith_ops: | ||||
arithops = ['__add__', '__radd__', '__sub__', '__rsub__', '__mul__', | ||||
'__rmul__', '__pow__', '__rpow__', '__mod__', '__rmod__', | ||||
'__floordiv__', '__rfloordiv__', '__truediv__', | ||||
'__rtruediv__', '__divmod__', '__rdivmod__'] | ||||
if not PY3: | ||||
arithops.extend(['__div__', '__rdiv__']) | ||||
|
||||
for op_name in arithops: | ||||
setattr(_ExtensionOpsMixin, op_name, create_method(op_name)) | ||||
|
||||
if include_logic_ops: | ||||
[Review comment] Can these be called
[Review comment] Done
||||
logicops = ['__eq__', '__ne__', '__lt__', '__gt__', | ||||
'__le__', '__ge__'] | ||||
for op_name in logicops: | ||||
setattr(_ExtensionOpsMixin, op_name, create_method(op_name)) | ||||
|
||||
return _ExtensionOpsMixin |
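
The review thread on this file settles on a shared base class plus separate arithmetic and comparison mixins instead of a factory, with operators written out one by one. The following is a minimal sketch of that shape, not the code that was merged: the names `OpsBaseSketch`, `ArithmeticOpsSketch`, `ComparisonOpsSketch`, `ToyArray` and the `wrap` parameter are all hypothetical, and a plain list-backed class stands in for an ExtensionArray.

```python
import operator


class OpsBaseSketch:
    """Hypothetical base class holding the shared method builder."""

    @classmethod
    def _create_method(cls, op, wrap=True):
        def _binop(self, other):
            # Pair element-wise with list-likes; repeat a scalar otherwise.
            if isinstance(other, (list, tuple, OpsBaseSketch)):
                rvalues = list(other)
            else:
                rvalues = [other] * len(self)
            res = [op(a, b) for a, b in zip(self, rvalues)]
            # Arithmetic results are wrapped back into the array type;
            # comparison results are returned as a plain list of booleans.
            return type(self)(res) if wrap else res

        _binop.__name__ = "__{}__".format(op.__name__.strip("_"))
        return _binop


class ArithmeticOpsSketch(OpsBaseSketch):
    pass


class ComparisonOpsSketch(OpsBaseSketch):
    pass


# Operators are assigned one by one rather than generated in a loop.
ArithmeticOpsSketch.__add__ = ArithmeticOpsSketch._create_method(operator.add)
ArithmeticOpsSketch.__sub__ = ArithmeticOpsSketch._create_method(operator.sub)
ArithmeticOpsSketch.__mul__ = ArithmeticOpsSketch._create_method(operator.mul)
ComparisonOpsSketch.__eq__ = ComparisonOpsSketch._create_method(
    operator.eq, wrap=False)
ComparisonOpsSketch.__lt__ = ComparisonOpsSketch._create_method(
    operator.lt, wrap=False)


class ToyArray(ArithmeticOpsSketch, ComparisonOpsSketch):
    """List-backed stand-in for an ExtensionArray (assumption)."""

    def __init__(self, values):
        self._data = list(values)

    def __len__(self):
        return len(self._data)

    def __iter__(self):
        return iter(self._data)
```

An array author would opt in by inheriting from one or both mixins, which keeps the choice of operator support explicit, as requested in the thread.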
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -2972,16 +2972,20 @@ def get_value(self, series, key): | |
# use this, e.g. DatetimeIndex | ||
s = getattr(series, '_values', None) | ||
if isinstance(s, (ExtensionArray, Index)) and is_scalar(key): | ||
# GH 20825 | ||
# GH 20882, 21257 | ||
# Unify Index and ExtensionArray treatment | ||
# First try to convert the key to a location | ||
# If that fails, see if key is an integer, and | ||
# If that fails, raise a KeyError if an integer | ||
# index, otherwise, see if key is an integer, and | ||
# try that | ||
try: | ||
iloc = self.get_loc(key) | ||
return s[iloc] | ||
except KeyError: | ||
if is_integer(key): | ||
if (len(self) > 0 and | ||
self.inferred_type in ['integer', 'boolean']): | ||
[Review comment] What makes this necessary? I'm not clear on how it relates to the Mixin classes.
[Review comment] For this you can comment on the relevant other PR (#21260).
[Review comment] @jbrockmendel the other PR (#21260) was needed to make the tests here work right.
||
raise | ||
[Review comment] pls move all non-ops code to another PR
[Review comment] The same answer as above: this is needed for the tests to pass until the other PRs are merged (that is to say, it has already been moved to separate PRs).
||
elif is_integer(key): | ||
return s[key] | ||
|
||
s = com._values_from_object(series) | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -30,6 +30,7 @@ | |
is_bool_dtype, | ||
is_list_like, | ||
is_scalar, | ||
is_extension_array_dtype, | ||
_ensure_object) | ||
from pandas.core.dtypes.cast import ( | ||
maybe_upcast_putmask, find_common_type, | ||
|
@@ -990,6 +991,20 @@ def _construct_divmod_result(left, result, index, name, dtype): | |
) | ||
|
||
|
||
def dispatch_to_extension_op(left, right, op_name): | ||
[Review comment] Could this be accomplished with:
?
[Review comment] Unless we require that the ExtensionArray constructor handles objects of itself (and does not copy them), we cannot use your example code (and currently we don't require this from extension array implementations, I think).
[Review comment] Fair enough. Eventually dispatch_to_index_op is intended to give way to an EA-based dispatch.
||
""" | ||
Assume that left is a Series backed by an ExtensionArray, | ||
apply the operator defined by op_name. | ||
""" | ||
|
||
method = getattr(left.values, op_name, None) | ||
res_values = method(right) | ||
|
||
res_name = get_op_result_name(left, right) | ||
return left._constructor(res_values, index=left.index, | ||
name=res_name) | ||
|
||
|
||
def _arith_method_SERIES(cls, op, special): | ||
""" | ||
Wrapper function for Series arithmetic operations, to avoid | ||
|
@@ -1058,6 +1073,9 @@ def wrapper(left, right): | |
raise TypeError("{typ} cannot perform the operation " | ||
"{op}".format(typ=type(left).__name__, op=str_rep)) | ||
|
||
elif is_extension_array_dtype(left): | ||
return dispatch_to_extension_op(left, right, op_name) | ||
|
||
lvalues = left.values | ||
rvalues = right | ||
if isinstance(rvalues, ABCSeries): | ||
|
@@ -1208,6 +1226,9 @@ def wrapper(self, other, axis=None): | |
return self._constructor(res_values, index=self.index, | ||
name=res_name) | ||
|
||
elif is_extension_array_dtype(self): | ||
return dispatch_to_extension_op(self, other, op_name) | ||
|
||
elif isinstance(other, ABCSeries): | ||
# By this point we have checked that self._indexed_same(other) | ||
res_values = na_op(self.values, other.values) | ||
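
A small sketch, using built-in `int` and `float` rather than an ExtensionArray, of how looking a method up by name (as `dispatch_to_extension_op` and the original `callfunc` do) differs from applying the operator itself. Calling the dunder method directly only consults the left operand, while `operator.add` also falls back to the right operand's reflected method. The helper name `by_name` is hypothetical.

```python
import operator


def by_name(a, b, op_name):
    """Call the named dunder method on the left operand only."""
    method = getattr(a, op_name, None)
    if method is None:
        return NotImplemented
    return method(b)


left = [1, 2]
right = [0.5, 1.5]

# int.__add__ does not handle float, so each call returns NotImplemented.
named = [by_name(a, b, "__add__") for a, b in zip(left, right)]

# operator.add goes through the full protocol and reaches float.__radd__.
applied = [operator.add(a, b) for a, b in zip(left, right)]
```

This is general Python behaviour; whether it matters for a given ExtensionArray depends on the scalar type stored in it.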
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -2196,23 +2196,22 @@ def _binop(self, other, func, level=None, fill_value=None): | |
result.name = None | ||
return result | ||
|
||
def combine(self, other, func, fill_value=np.nan): | ||
def combine(self, other, func, fill_value=None): | ||
""" | ||
Perform elementwise binary operation on two Series using given function | ||
with optional fill value when an index is missing from one Series or | ||
the other | ||
|
||
Parameters | ||
---------- | ||
other : Series or scalar value | ||
func : function | ||
Function that takes two scalars as inputs and return a scalar | ||
fill_value : scalar value | ||
|
||
The default specifies to use the appropriate NaN value for | ||
the underlying dtype of the Series | ||
Returns | ||
------- | ||
result : Series | ||
|
||
Examples | ||
-------- | ||
>>> s1 = Series([1, 2]) | ||
|
@@ -2221,26 +2220,36 @@ def combine(self, other, func, fill_value=np.nan): | |
0 0 | ||
1 2 | ||
dtype: int64 | ||
|
||
See Also | ||
-------- | ||
Series.combine_first : Combine Series values, choosing the calling | ||
Series's values first | ||
""" | ||
self_is_ext = is_extension_array_dtype(self.values) | ||
[Review comment] I would move all of this (combine stuff) to another PR on top of this one.
||
if fill_value is None: | ||
fill_value = na_value_for_dtype(self.dtype, False) | ||
|
||
if isinstance(other, Series): | ||
new_index = self.index.union(other.index) | ||
new_name = ops.get_op_result_name(self, other) | ||
new_values = np.empty(len(new_index), dtype=self.dtype) | ||
for i, idx in enumerate(new_index): | ||
new_values = [] | ||
for idx in new_index: | ||
lv = self.get(idx, fill_value) | ||
rv = other.get(idx, fill_value) | ||
with np.errstate(all='ignore'): | ||
new_values[i] = func(lv, rv) | ||
new_values.append(func(lv, rv)) | ||
else: | ||
new_index = self.index | ||
with np.errstate(all='ignore'): | ||
new_values = func(self._values, other) | ||
new_values = [func(lv, other) for lv in self._values] | ||
new_name = self.name | ||
|
||
if self_is_ext and not is_categorical_dtype(self.values): | ||
[Review comment] Don't use shorthands like this. Make this simpler, something like
much more readable
[Review comment] Comment that in #21183
||
try: | ||
new_values = self._values._from_sequence(new_values) | ||
except TypeError: | ||
[Review comment] Why are you catching a TypeError?
[Review comment] This is also for the other PR, but: because the result might not necessarily be an ExtensionArray. This is of course a bit of an unclear area of
||
pass | ||
|
||
return self._constructor(new_values, index=new_index, name=new_name) | ||
|
||
def combine_first(self, other): | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -6,7 +6,7 @@ | |
import numpy as np | ||
|
||
import pandas as pd | ||
from pandas.core.arrays import ExtensionArray | ||
from pandas.core.arrays import ExtensionArray, ExtensionOpsMixin | ||
from pandas.core.dtypes.base import ExtensionDtype | ||
|
||
|
||
|
@@ -24,11 +24,13 @@ def construct_from_string(cls, string): | |
"'{}'".format(cls, string)) | ||
|
||
|
||
class DecimalArray(ExtensionArray): | ||
class DecimalArray(ExtensionArray, ExtensionOpsMixin(True, True)): | ||
dtype = DecimalDtype() | ||
|
||
def __init__(self, values): | ||
assert all(isinstance(v, decimal.Decimal) for v in values) | ||
for val in values: | ||
if not isinstance(val, self.dtype.type): | ||
raise TypeError | ||
[Review comment] error message
[Review comment] Fixed (and also in pandas/tests/extension/json/array.py)
||
values = np.asarray(values, dtype=object) | ||
|
||
self._data = values | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -7,6 +7,9 @@ | |
|
||
from pandas.tests.extension import base | ||
|
||
from pandas.tests.series.test_operators import TestSeriesOperators | ||
from pandas.util._decorators import cache_readonly | ||
|
||
from .array import DecimalDtype, DecimalArray, make_data | ||
|
||
|
||
|
@@ -183,3 +186,36 @@ def test_dataframe_constructor_with_different_dtype_raises(): | |
xpr = "Cannot coerce extension array to dtype 'int64'. " | ||
with tm.assert_raises_regex(ValueError, xpr): | ||
pd.DataFrame({"A": arr}, dtype='int64') | ||
|
||
|
||
_ts = pd.Series(DecimalArray(make_data())) | ||
|
||
|
||
class TestOperator(BaseDecimal, TestSeriesOperators): | ||
@cache_readonly | ||
[Review comment] Never do this. Use a fixture.
[Review comment] This is unavoidable, because I am subclassing the
[Review comment] Never mind, I redid the tests as you suggested elsewhere.
||
def ts(self): | ||
ts = _ts.copy() | ||
ts.name = 'ts' | ||
return ts | ||
|
||
def test_operators(self): | ||
def absfunc(v): | ||
if isinstance(v, pd.Series): | ||
vals = v.values | ||
return pd.Series(vals._from_sequence([abs(i) for i in vals])) | ||
else: | ||
return abs(v) | ||
context = decimal.getcontext() | ||
[Review comment] What the heck is all this?
[Review comment] I am reusing the tests that are in the class There is a related issue with respect to
||
divbyzerotrap = context.traps[decimal.DivisionByZero] | ||
invalidoptrap = context.traps[decimal.InvalidOperation] | ||
context.traps[decimal.DivisionByZero] = 0 | ||
context.traps[decimal.InvalidOperation] = 0 | ||
super(TestOperator, self).test_operators(absfunc) | ||
context.traps[decimal.DivisionByZero] = divbyzerotrap | ||
context.traps[decimal.InvalidOperation] = invalidoptrap | ||
|
||
def test_operators_corner(self): | ||
pytest.skip("Cannot add empty Series of float64 to DecimalArray") | ||
|
||
def test_divmod(self): | ||
pytest.skip("divmod not appropriate for Decimal type") |
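
`test_operators` above saves and restores the decimal trap flags by hand, so an exception raised inside the inherited test would leave the traps disabled. One possible alternative, shown only as a sketch and not taken from the PR, is `decimal.localcontext()`, which restores the previous context on exit even when an error occurs. The function name `divide_without_traps` is hypothetical.

```python
import decimal


def divide_without_traps(a, b):
    """Divide two Decimals with the two traps disabled temporarily."""
    with decimal.localcontext() as ctx:
        # Changes apply to a copy of the current context and are
        # discarded when the with-block exits.
        ctx.traps[decimal.DivisionByZero] = False
        ctx.traps[decimal.InvalidOperation] = False
        return a / b
```

With the traps off, dividing a nonzero Decimal by zero yields an infinity and `0 / 0` yields a NaN instead of raising.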
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -1216,11 +1216,11 @@ def test_neg(self): | |
def test_invert(self): | ||
assert_series_equal(-(self.series < 0), ~(self.series < 0)) | ||
|
||
def test_operators(self): | ||
def test_operators(self, absfunc=np.abs): | ||
[Review comment] Is this change related to the PR?
[Review comment] This was left over from the previous method of testing, so I will revert it.
||
def _check_op(series, other, op, pos_only=False, | ||
check_dtype=True): | ||
left = np.abs(series) if pos_only else series | ||
right = np.abs(other) if pos_only else other | ||
left = absfunc(series) if pos_only else series | ||
right = absfunc(other) if pos_only else other | ||
|
||
cython_or_numpy = op(left, right) | ||
python = left.combine(right, op) | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -30,7 +30,8 @@ | |
is_categorical_dtype, | ||
is_interval_dtype, | ||
is_sequence, | ||
is_list_like) | ||
is_list_like, | ||
is_extension_array_dtype) | ||
from pandas.io.formats.printing import pprint_thing | ||
from pandas.core.algorithms import take_1d | ||
import pandas.core.common as com | ||
|
@@ -1225,6 +1226,10 @@ def assert_series_equal(left, right, check_dtype=True, | |
right = pd.IntervalIndex(right) | ||
assert_index_equal(left, right, obj='{obj}.index'.format(obj=obj)) | ||
|
||
elif (is_extension_array_dtype(left) and not is_categorical_dtype(left) and | ||
is_extension_array_dtype(right) and not is_categorical_dtype(right)): | ||
return assert_extension_array_equal(left.values, right.values) | ||
[Review comment] I think we should only do this if
[Review comment] I'm not sure I follow your logic for this. Before, we had no code that really compared two EAs for testing. However, that code doesn't work correctly if they are categorical dtypes. The
||
|
||
else: | ||
_testing.assert_almost_equal(left.get_values(), right.get_values(), | ||
check_less_precise=check_less_precise, | ||
|
[Review comment] Make a separate sub-section. Pls expand this a bit; I know what you mean, but I doubt the average reader does.
[Review comment] Is the note here still accurate or has the factory been changed to just mixins?
[Review comment] I've written a section for whatsnew.