-
-
Notifications
You must be signed in to change notification settings - Fork 18.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
separate numeric tests so we can isolate division by zero #19336
Conversation
@@ -595,77 +595,103 @@ def test_divide_decimal(self): | |||
|
|||
assert_series_equal(expected, s) | |||
|
|||
def test_div(self): | |||
def test_i8_ser_div_i8_ser(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could parametrize on the .astype(np.float32 & np.float64), it doesn't make a real difference, but could check i guess
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your comments are all correct about parametrization and redundancy. This PR got split off of a branch that tried to address all of it at once and got huge because many cases currently fail. The idea is to split it first in a no-actual-changes-so-easy-to-review way, then apply the fixes one at a time.
result = p['first'] / p['second'] | ||
expected = Series(p['first'].values / p['second'].values) | ||
assert_series_equal(result, expected) | ||
def test_f8_ser_div_f8_ser(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as above, maybe make a fixture for float32/float64 (could put at in pandas/tests/conftest.py) and for ints/uints as well. I think there is an issue about doing this.
# GH 7785 | ||
p = DataFrame({'first': (1, 0), 'second': (-0.01, -0.02)}) | ||
expected = Series([-0.01, -np.inf]) | ||
def test_ser_div_ser_name_propagation(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is this a separate test and not in the div test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not really sure what the original motivation was here. TBH I was just guessing that the point was name propagation, since it wasn't clear what else it could be for.
|
||
result = p['second'] / p['first'] | ||
assert_series_equal(result, expected) | ||
def test_int_div_pyint_zero(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
parametrize over 0 and numpy int 0 scalars (and maybe uints for that matter)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, I ended up putting a func in pd.util.testing to generate variants of zeros. Holding off on it for now because many variants fail ATM.
expected = pd.Series([0.] * 5) | ||
result = zero_array / pd.Series(data) | ||
assert_series_equal(result, expected) | ||
def test_rdiv_zero_compat(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
parametrie over 0 with various int/uint dtypes
result = pd.Series(zero_array) / pd.Series(data) | ||
assert_series_equal(result, expected) | ||
|
||
def test_div_zero(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
duplicate of test above
result = ser / 0 | ||
assert_series_equal(result, expected) | ||
|
||
def test_rdiv_zero(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
parametrize
result = 0 / ser | ||
assert_series_equal(result, expected) | ||
|
||
def test_floordiv_div(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same. see the pattern :>
need to create a zero fixture
@@ -1574,28 +1600,28 @@ def test_dt64_series_add_intlike(self, tz): | |||
|
|||
|
|||
class TestSeriesOperators(TestData): | |||
def test_op_method(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add a comment to describe what tests this class does
result = op(series, other) | ||
expected = alt(series, other) | ||
if op == 'div': | ||
alt = operator.truediv |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
instead of making check an inner function, just add another layer of paramaterization (for self.ts * 2 and self.ts[::2])
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How do you put self.ts
in the parametrization? Unless you mean just replacing it with the evaluated series itself which I'd prefer anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is incorrect anyway since op == 'div'
should never be True; this is probably supposed to be if opname == div
; and in fact that check doesn't matter because we're excluding "div" on PY3 anyway...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you would put callables, e.g.
```"ts", [lambda x: x * 2, lambda x: x[::2]]and just inject
self.ts``
e.g.
def test_.....(ts):
ts = ts(self.ts)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This sounds unnecessarily convoluted. Dummy lambdas and parametrization obfuscate what is being tested (which is already not-immediately-obvious, as you mentioned earlier).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is standard pytest usage. and much much more clear than the current.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i'd like this changed
Just implemented most of the requests. The parametrization of zeros is being put off for now because specifying all of the existing failure modes is a task unto itself. |
@@ -595,77 +595,85 @@ def test_divide_decimal(self): | |||
|
|||
assert_series_equal(expected, s) | |||
|
|||
def test_div(self): | |||
@pytest.mark.parametrize('dtype2', [np.int64, np.int32, np.int16, np.int8, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wouldn't this be better as a fixture
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe eventually. ATM it is only used once so there's no upside.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as a style note, generally I would write this like
@pytest.mark.parameterize(
'dtype2',
[
np.int64, ......])
as it makes it shorter and more readable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i'd like this changed
Codecov Report
@@ Coverage Diff @@
## master #19336 +/- ##
==========================================
+ Coverage 91.67% 91.69% +0.02%
==========================================
Files 148 148
Lines 48553 48553
==========================================
+ Hits 44513 44523 +10
+ Misses 4040 4030 -10
Continue to review full report at Codecov.
|
result = op(series, other) | ||
expected = alt(series, other) | ||
if op == 'div': | ||
alt = operator.truediv |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is standard pytest usage. and much much more clear than the current.
@@ -595,77 +595,85 @@ def test_divide_decimal(self): | |||
|
|||
assert_series_equal(expected, s) | |||
|
|||
def test_div(self): | |||
@pytest.mark.parametrize('dtype2', [np.int64, np.int32, np.int16, np.int8, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as a style note, generally I would write this like
@pytest.mark.parameterize(
'dtype2',
[
np.int64, ......])
as it makes it shorter and more readable
Noted for next time around. |
result = op(series, other) | ||
expected = alt(series, other) | ||
if op == 'div': | ||
alt = operator.truediv |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i'd like this changed
@@ -595,77 +595,85 @@ def test_divide_decimal(self): | |||
|
|||
assert_series_equal(expected, s) | |||
|
|||
def test_div(self): | |||
@pytest.mark.parametrize('dtype2', [np.int64, np.int32, np.int16, np.int8, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i'd like this changed
result = p['first'] / p['second'] | ||
expected = Series(p['first'].values / p['second'].values) | ||
assert_series_equal(result, expected) | ||
def test_ser_div_ser_name_propagation(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a dupe of the test right above
np.float64, np.float32, np.float16, | ||
np.uint64, np.uint32, | ||
np.uint16, np.uint8]) | ||
@pytest.mark.parametrize('dtype1', [np.int64, np.float64, np.uint64]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pls clean the parameterize up here as I suggested
result = rop(series, other) | ||
expected = alt(other, series) | ||
assert_almost_equal(result, expected) | ||
@pytest.mark.parametrize('ts', [lambda x: (x, x * 2, False), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
much nicer. if you'd format as I suggest this become very readable.
I think this should be good to go. Will make the test-duplication discussed elsewhere easier. |
This is now a blocker for the ongoing centralizing-of-arithmetic tests. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
small change, other lgtm. ping on green.
@pytest.mark.parametrize( | ||
'ts', | ||
[ | ||
lambda x: (x, x * 2, False), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't write it like this rather split out parameters
@pytest.mark.parametrize(
'ts'
[
(lamda x: x, lamda x: x[::2], False),
....
])
its slightly longer but much more obvious
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sometimes I look at the results of these format changes and say "yah, he's right, that is more readable." This isn't one of those times. Doing it anyway.
ping |
Related: #19336 Author: Brock Mendel <[email protected]> Closes #19347 from jbrockmendel/div_zero2 and squashes the following commits: be1e2e1 [Brock Mendel] move fixture to conftest 64b0c08 [Brock Mendel] Merge branch 'master' of https://github.com/pandas-dev/pandas into div_zero2 aa969f8 [Brock Mendel] Merge branch 'master' of https://github.com/pandas-dev/pandas into div_zero2 000aefd [Brock Mendel] fix long again 9de356a [Brock Mendel] revert fixture to fix test_range failures b8cf21d [Brock Mendel] flake8 remove unused import afedba9 [Brock Mendel] whatsnew clarification b51c2e1 [Brock Mendel] fixturize 37efd51 [Brock Mendel] make zero a fixture 965f721 [Brock Mendel] Merge branch 'master' of https://github.com/pandas-dev/pandas into div_zero2 d648ef6 [Brock Mendel] requested edits 1ef3a6c [Brock Mendel] Merge branch 'master' of https://github.com/pandas-dev/pandas into div_zero2 78de1a4 [Brock Mendel] Merge branch 'master' of https://github.com/pandas-dev/pandas into div_zero2 0277d9f [Brock Mendel] add ipython output to whatsnew 5d7e3ea [Brock Mendel] Merge branch 'master' of https://github.com/pandas-dev/pandas into div_zero2 ea75c3c [Brock Mendel] ipython block 6fc61bd [Brock Mendel] elaborate docstring ca3bf42 [Brock Mendel] Whatsnew section cd54349 [Brock Mendel] move dispatch_missing to core.missing 06df02a [Brock Mendel] py3 fix 84c74c5 [Brock Mendel] remove operator.div for py3 6acc2f7 [Brock Mendel] fix missing import e0e89b9 [Brock Mendel] fix and and tests for divmod 969f342 [Brock Mendel] fix and test index division by zero
This is now a blocker for fixing series division by zero. |
thanks |
Related: pandas-dev#19336 Author: Brock Mendel <[email protected]> Closes pandas-dev#19347 from jbrockmendel/div_zero2 and squashes the following commits: be1e2e1 [Brock Mendel] move fixture to conftest 64b0c08 [Brock Mendel] Merge branch 'master' of https://github.com/pandas-dev/pandas into div_zero2 aa969f8 [Brock Mendel] Merge branch 'master' of https://github.com/pandas-dev/pandas into div_zero2 000aefd [Brock Mendel] fix long again 9de356a [Brock Mendel] revert fixture to fix test_range failures b8cf21d [Brock Mendel] flake8 remove unused import afedba9 [Brock Mendel] whatsnew clarification b51c2e1 [Brock Mendel] fixturize 37efd51 [Brock Mendel] make zero a fixture 965f721 [Brock Mendel] Merge branch 'master' of https://github.com/pandas-dev/pandas into div_zero2 d648ef6 [Brock Mendel] requested edits 1ef3a6c [Brock Mendel] Merge branch 'master' of https://github.com/pandas-dev/pandas into div_zero2 78de1a4 [Brock Mendel] Merge branch 'master' of https://github.com/pandas-dev/pandas into div_zero2 0277d9f [Brock Mendel] add ipython output to whatsnew 5d7e3ea [Brock Mendel] Merge branch 'master' of https://github.com/pandas-dev/pandas into div_zero2 ea75c3c [Brock Mendel] ipython block 6fc61bd [Brock Mendel] elaborate docstring ca3bf42 [Brock Mendel] Whatsnew section cd54349 [Brock Mendel] move dispatch_missing to core.missing 06df02a [Brock Mendel] py3 fix 84c74c5 [Brock Mendel] remove operator.div for py3 6acc2f7 [Brock Mendel] fix missing import e0e89b9 [Brock Mendel] fix and and tests for divmod 969f342 [Brock Mendel] fix and test index division by zero
The upcoming fix(es) for #19322 are going to involve parametrizing a bunch of variants of division by zero. This separates out the existing test cases into method-specific tests, some of which will be parametrized in upcoming PRs.
This PR does not change the aggregate contents of the tests.