separate numeric tests so we can isolate division by zero #19336

Merged
merged 9 commits into from
Feb 8, 2018

jbrockmendel
Member

The upcoming fix(es) for #19322 are going to involve parametrizing a bunch of variants of division by zero. This separates out the existing test cases into method-specific tests, some of which will be parametrized in upcoming PRs.

This PR does not change the aggregate contents of the tests.

jreback added the Testing and Numeric Operations labels Jan 22, 2018
@@ -595,77 +595,103 @@ def test_divide_decimal(self):

assert_series_equal(expected, s)

def test_div(self):
def test_i8_ser_div_i8_ser(self):
Contributor

could parametrize on the .astype(np.float32 & np.float64), it doesn't make a real difference, but could check i guess

Member Author

Your comments are all correct about parametrization and redundancy. This PR got split off of a branch that tried to address all of it at once and got huge because many cases currently fail. The idea is to split it first in a no-actual-changes-so-easy-to-review way, then apply the fixes one at a time.

result = p['first'] / p['second']
expected = Series(p['first'].values / p['second'].values)
assert_series_equal(result, expected)
def test_f8_ser_div_f8_ser(self):
Contributor

same as above, maybe make a fixture for float32/float64 (could put it in pandas/tests/conftest.py) and for ints/uints as well. I think there is an issue about doing this.
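For illustration, a dtype fixture along these lines might look like the minimal sketch below. The names `FLOAT_DTYPES`, `float_dtype`, and the test itself are hypothetical, not taken from the pandas test suite, and the sketch assumes a recent pandas release.

```python
import numpy as np
import pandas as pd
import pytest

# Hypothetical list of dtypes; the thread only mentions float32 and float64.
FLOAT_DTYPES = [np.float32, np.float64]


@pytest.fixture(params=FLOAT_DTYPES)
def float_dtype(request):
    """Hand each float dtype in turn to any test that requests this fixture."""
    return request.param


def test_float_ser_div_float_ser(float_dtype):
    # Values are chosen so every quotient is exactly representable in float32.
    first = pd.Series([1.0, 3.0, 6.0], dtype=float_dtype)
    second = pd.Series([2.0, 4.0, 16.0], dtype=float_dtype)

    result = first / second
    expected = pd.Series([0.5, 0.75, 0.375], dtype=float_dtype)
    pd.testing.assert_series_equal(result, expected)
```

Placing such a fixture in a `conftest.py` would make it available to every test module below that directory without an explicit import.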

# GH 7785
p = DataFrame({'first': (1, 0), 'second': (-0.01, -0.02)})
expected = Series([-0.01, -np.inf])
def test_ser_div_ser_name_propagation(self):
Contributor

why is this a separate test and not in the div test?

Member Author

Not really sure what the original motivation was here. TBH I was just guessing that the point was name propagation, since it wasn't clear what else it could be for.
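For readers unfamiliar with the term, "name propagation" here presumably refers to how the result of a binary operation picks up its `name`. A minimal sketch, assuming a recent pandas release (the helper `div_result_name` is hypothetical):

```python
import pandas as pd


def div_result_name(left_name, right_name):
    """Return the name of `left / right` for two one-element Series."""
    left = pd.Series([1.0], name=left_name)
    right = pd.Series([2.0], name=right_name)
    return (left / right).name


# Matching names are kept; mismatched names are dropped.
assert div_result_name("first", "first") == "first"
assert div_result_name("first", "second") is None
```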


result = p['second'] / p['first']
assert_series_equal(result, expected)
def test_int_div_pyint_zero(self):
Contributor

parametrize over 0 and numpy int 0 scalars (and maybe uints for that matter)

Member Author

Yep, I ended up putting a func in pd.util.testing to generate variants of zeros. Holding off on it for now because many variants fail ATM.

expected = pd.Series([0.] * 5)
result = zero_array / pd.Series(data)
assert_series_equal(result, expected)
def test_rdiv_zero_compat(self):
Contributor

parametrize over 0 with various int/uint dtypes

result = pd.Series(zero_array) / pd.Series(data)
assert_series_equal(result, expected)

def test_div_zero(self):
Contributor

duplicate of test above

result = ser / 0
assert_series_equal(result, expected)

def test_rdiv_zero(self):
Contributor

parametrize

result = 0 / ser
assert_series_equal(result, expected)

def test_floordiv_div(self):
Contributor

same. see the pattern :>

need to create a zero fixture
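A zero fixture of the kind requested here could be sketched as follows. The names `ZEROS` and `zero` and the choice of variants are assumptions for illustration. The expected values reflect the behavior of recent pandas releases, not necessarily the behavior at the time of this PR, when (per the discussion above) many variants still failed.

```python
import numpy as np
import pandas as pd
import pytest

# Hypothetical set of zero variants: Python and NumPy scalars.
ZEROS = [0, 0.0, np.int64(0), np.float64(0.0)]


@pytest.fixture(params=ZEROS)
def zero(request):
    """Supply each flavor of zero to any test that requests this fixture."""
    return request.param


def test_int_series_div_zero(zero):
    ser = pd.Series([1, 0, -1], dtype="int64")

    result = ser / zero

    # positive / 0 -> inf, 0 / 0 -> NaN, negative / 0 -> -inf
    expected = pd.Series([np.inf, np.nan, -np.inf])
    pd.testing.assert_series_equal(result, expected)
```

With the fixture in place, each division-by-zero test is written once and pytest runs it for every variant, so a failure is reported against the specific kind of zero that triggered it.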

@@ -1574,28 +1600,28 @@ def test_dt64_series_add_intlike(self, tz):


class TestSeriesOperators(TestData):
def test_op_method(self):
Contributor

can you add a comment to describe what this class tests

result = op(series, other)
expected = alt(series, other)
if op == 'div':
alt = operator.truediv
Contributor

instead of making check an inner function, just add another layer of parametrization (for self.ts * 2 and self.ts[::2])

Member Author

How do you put self.ts in the parametrization? Unless you mean just replacing it with the evaluated series itself which I'd prefer anyway.

Member Author

I think this is incorrect anyway since op == 'div' should never be True; this is probably supposed to be if opname == 'div'; and in fact that check doesn't matter because we're excluding "div" on PY3 anyway...
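To make the point concrete, here is a small standalone sketch, assuming `op` is a function object looked up by name (the variable names mirror the snippet under review, but the lookup itself is an assumption):

```python
import operator

opname = "truediv"
op = getattr(operator, opname)  # a function object, not a string

# A function never compares equal to a string, so a branch guarded by
# `op == 'div'` is dead code.
assert (op == "div") is False

# Comparing the *name* is what the check presumably intended.
assert opname == "truediv"

# On Python 3 there is no operator.div at all.
assert not hasattr(operator, "div")
```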

Contributor

you would put callables, e.g. `"ts", [lambda x: x * 2, lambda x: x[::2]]` and just inject `self.ts`

e.g.

def test_.....(ts):
    ts = ts(self.ts)
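A runnable sketch of this suggestion, under some assumptions: the fixture `tser` and the helper `make_tser` stand in for `self.ts`, and the names `OTHERS`, `OPNAMES`, and `make_other` are hypothetical. Only operations that exist both as `Series` methods and in the `operator` module are included.

```python
import operator

import numpy as np
import pandas as pd
import pytest

# Callables that derive the second operand from the base series.
OTHERS = [lambda x: x * 2, lambda x: x[::2]]
OPNAMES = ["add", "sub", "mul", "truediv"]


def make_tser():
    """Hypothetical stand-in for the shared time series (self.ts)."""
    index = pd.date_range("2000-01-01", periods=6)
    return pd.Series(np.arange(1.0, 7.0), index=index, name="ts")


@pytest.fixture
def tser():
    return make_tser()


@pytest.mark.parametrize("make_other", OTHERS)
@pytest.mark.parametrize("opname", OPNAMES)
def test_op_method(tser, opname, make_other):
    # Build the second operand from the injected series.
    other = make_other(tser)

    # Series.<opname>(other) should match the corresponding operator.
    result = getattr(tser, opname)(other)
    expected = getattr(operator, opname)(tser, other)
    pd.testing.assert_series_equal(result, expected)
```

Stacking the two `parametrize` decorators yields one test per (operation, operand) pair, which replaces the inner `check` function with separately reported cases.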

Member Author

This sounds unnecessarily convoluted. Dummy lambdas and parametrization obfuscate what is being tested (which is already not-immediately-obvious, as you mentioned earlier).

Contributor

this is standard pytest usage. and much much more clear than the current.

Contributor

i'd like this changed

@jbrockmendel
Member Author

Just implemented most of the requests. The parametrization of zeros is being put off for now because specifying all of the existing failure modes is a task unto itself.

@@ -595,77 +595,85 @@ def test_divide_decimal(self):

assert_series_equal(expected, s)

def test_div(self):
@pytest.mark.parametrize('dtype2', [np.int64, np.int32, np.int16, np.int8,
Contributor

wouldn't this be better as a fixture

Member Author

Maybe eventually. ATM it is only used once so there's no upside.

Contributor

as a style note, generally I would write this like

@pytest.mark.parametrize(
    'dtype2',
    [
        np.int64, ......])

as it makes it shorter and more readable

Contributor

i'd like this changed

@codecov

codecov bot commented Jan 23, 2018

Codecov Report

Merging #19336 into master will increase coverage by 0.02%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master   #19336      +/-   ##
==========================================
+ Coverage   91.67%   91.69%   +0.02%     
==========================================
  Files         148      148              
  Lines       48553    48553              
==========================================
+ Hits        44513    44523      +10     
+ Misses       4040     4030      -10
Flag Coverage Δ
#multiple 90.06% <ø> (+0.02%) ⬆️
#single 41.72% <ø> (ø) ⬆️
Impacted Files Coverage Δ
pandas/util/testing.py 83.64% <0%> (-0.21%) ⬇️
pandas/plotting/_converter.py 66.95% <0%> (+1.73%) ⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 35812ea...4b78542. Read the comment docs.

@jbrockmendel
Member Author

> as a style note, generally I would write this like

Noted for next time around.

result = p['first'] / p['second']
expected = Series(p['first'].values / p['second'].values)
assert_series_equal(result, expected)
def test_ser_div_ser_name_propagation(self):
Contributor

this is a dupe of the test right above

np.float64, np.float32, np.float16,
np.uint64, np.uint32,
np.uint16, np.uint8])
@pytest.mark.parametrize('dtype1', [np.int64, np.float64, np.uint64])
Contributor

pls clean the parametrize up here as I suggested

result = rop(series, other)
expected = alt(other, series)
assert_almost_equal(result, expected)
@pytest.mark.parametrize('ts', [lambda x: (x, x * 2, False),
Contributor

much nicer. if you'd format as I suggest this becomes very readable.

@jbrockmendel
Member Author

I think this should be good to go. Will make the test-duplication discussed elsewhere easier.

@jbrockmendel
Member Author

This is now a blocker for the ongoing centralizing-of-arithmetic tests.

Contributor

jreback left a comment

small change, other lgtm. ping on green.

@pytest.mark.parametrize(
'ts',
[
lambda x: (x, x * 2, False),
Contributor

don't write it like this; rather, split out the parameters

@pytest.mark.parametrize(
    'ts',
    [
        (lambda x: x, lambda x: x[::2], False),
        ....
    ])

it's slightly longer but much more obvious

Member Author

Sometimes I look at the results of these format changes and say "yah, he's right, that is more readable." This isn't one of those times. Doing it anyway.

@jbrockmendel
Member Author

ping

jreback pushed a commit that referenced this pull request Feb 6, 2018
Related: #19336

Author: Brock Mendel <[email protected]>

Closes #19347 from jbrockmendel/div_zero2 and squashes the following commits:

be1e2e1 [Brock Mendel] move fixture to conftest
64b0c08 [Brock Mendel] Merge branch 'master' of https://github.com/pandas-dev/pandas into div_zero2
aa969f8 [Brock Mendel] Merge branch 'master' of https://github.com/pandas-dev/pandas into div_zero2
000aefd [Brock Mendel] fix long again
9de356a [Brock Mendel] revert fixture to fix test_range failures
b8cf21d [Brock Mendel] flake8 remove unused import
afedba9 [Brock Mendel] whatsnew clarification
b51c2e1 [Brock Mendel] fixturize
37efd51 [Brock Mendel] make zero a fixture
965f721 [Brock Mendel] Merge branch 'master' of https://github.com/pandas-dev/pandas into div_zero2
d648ef6 [Brock Mendel] requested edits
1ef3a6c [Brock Mendel] Merge branch 'master' of https://github.com/pandas-dev/pandas into div_zero2
78de1a4 [Brock Mendel] Merge branch 'master' of https://github.com/pandas-dev/pandas into div_zero2
0277d9f [Brock Mendel] add ipython output to whatsnew
5d7e3ea [Brock Mendel] Merge branch 'master' of https://github.com/pandas-dev/pandas into div_zero2
ea75c3c [Brock Mendel] ipython block
6fc61bd [Brock Mendel] elaborate docstring
ca3bf42 [Brock Mendel] Whatsnew section
cd54349 [Brock Mendel] move dispatch_missing to core.missing
06df02a [Brock Mendel] py3 fix
84c74c5 [Brock Mendel] remove operator.div for py3
6acc2f7 [Brock Mendel] fix missing import
e0e89b9 [Brock Mendel] fix and and tests for divmod
969f342 [Brock Mendel] fix and test index division by zero
@jbrockmendel
Member Author

This is now a blocker for fixing series division by zero.

jreback added this to the 0.23.0 milestone Feb 8, 2018
jreback merged commit 34b86fd into pandas-dev:master Feb 8, 2018
@jreback
Contributor

jreback commented Feb 8, 2018

thanks

jbrockmendel deleted the split_operators branch February 8, 2018 14:14
harisbal pushed a commit to harisbal/pandas that referenced this pull request Feb 28, 2018