fix and test index division by zero #19347

jbrockmendel · 2018-01-22T17:42:33Z

Related: #19336, this implements the most important parts of the parametrization requested there.

closes #xxxx
tests added / passed
passes git diff upstream/master -u -- "*.py" | flake8 --diff
whatsnew entry

jbrockmendel · 2018-01-22T17:43:41Z

pandas/core/indexes/range.py


                try:
                    # apply if we have an override
                    if step:
                        with np.errstate(all='ignore'):
-                            rstep = step(self._step, other)
+                            rstep = step(left._step, right)


Side-note: This is still not great; left may no longer be self, so this may raise an AttributeError. The existing method specifically catches AttributeError so I guess this is intentional.

use getattr

getattr(left, '_step', left) and remove the AttributeError catching

getattr(left, '_stop', left)

and remove the AttributeError catching

can you do this

Out of scope. Until I look at this closely, I'm assuming that the author had a reason for doing it this way.

ok fair enough
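
The `getattr` suggestion in this thread can be illustrated with a minimal sketch. `FakeRange` and `step_of` are hypothetical names, not pandas code; the point is only that `getattr(obj, '_step', obj)` falls back to the object itself when the attribute is missing, so a separate `AttributeError` handler is not needed for that lookup.

```python
class FakeRange(object):
    """Hypothetical stand-in for an object that carries a ``_step``."""

    def __init__(self, step):
        self._step = step


def step_of(obj):
    # Use the private attribute when present; otherwise treat the
    # operand itself (e.g. a plain scalar) as the value to operate on.
    return getattr(obj, '_step', obj)


with_attr = step_of(FakeRange(2))  # attribute found
without_attr = step_of(5)          # falls back to the operand
```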

jbrockmendel · 2018-01-22T17:45:28Z

pandas/tests/indexes/test_numeric.py

+        result = op(idx, zero)
+        tm.assert_index_equal(result, expected)
+        ser_compat = op(Series(idx).astype('i8'), np.array(zero).astype('i8'))
+        tm.assert_series_equal(ser_compat, Series(result))


This is a "check that this matches the Series behavior" check. Note that because the Series implementation does not yet get this right for all dtypes, we are casting the Series version to int64 and checking that it matches the one version that does consistently work. Once the Series methods are fixed, this casting can be removed.

jreback · 2018-01-24T01:09:07Z

pandas/core/indexes/base.py

+    Parameters
+    ----------
+    op : function (operator.add, operator.div, ...)
+    left : object, usually Index


usually Index means?

That the input left is going to be an Index object most of the time, but could be an arbitrary object.

I know what it means. I am asking you to elaborate in the note on these cases, IOW why would it be an arbitrary object, e.g. when the op is reversed?
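
One way `left` can be something other than an Index is a reflected operation. A sketch under that assumption, with a hypothetical `Box` class standing in for Index: for `2 / Box(...)`, Python calls `__rtruediv__`, and the shared helper then receives the scalar as `left`.

```python
import operator


class Box(object):
    """Hypothetical container standing in for an Index."""

    def __init__(self, values):
        self.values = list(values)

    def _arith(self, op, left, right):
        # ``left`` is ``self`` for a normal op, but the *other* operand
        # when the op is reflected.
        if left is self:
            return [op(v, right) for v in self.values]
        return [op(left, v) for v in self.values]

    def __truediv__(self, other):
        return self._arith(operator.truediv, self, other)

    def __rtruediv__(self, other):
        return self._arith(operator.truediv, other, self)


box = Box([1, 2, 4])
forward = box / 2    # left is the Box
reflected = 2 / box  # left is the int 2
```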

jreback · 2018-01-24T01:09:51Z

pandas/core/indexes/base.py

+
+    Returns
+    -------
+    result : ndarray


I think this function is better in missing

Sounds good.

jreback · 2018-01-24T01:12:42Z

pandas/core/missing.py

@@ -645,6 +645,44 @@ def fill_zeros(result, x, y, name, fill):
    return result


+def mask_zero_div_zero(x, y, result):


this is not dissimilar from fill_zeros which is not used very much at all. maybe can consolidate.

First attempt was to edit fill_zeros to make it work, but I ended up changing it more than I thought appropriate. I'll take another look.

jreback · 2018-01-24T01:14:11Z

pandas/core/missing.py

+
+    Returns
+    -------
+    filled_result : ndarray


the semantics you have here are not explained well. you are mutating result in-place, then returning it. is that intentional?

I'll flesh it out. The semantics were meant to follow fill_zeros as closely as possible.

ok some more expl would be helpful. filling result is ok, you just have to make this really clear.
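
The distinction being asked about, mutating the input versus returning a new array, can be sketched with plain NumPy. Both helper names are hypothetical; neither is the pandas implementation.

```python
import numpy as np


def fill_inplace(result, mask, value):
    # Mutates ``result`` and returns the same object.
    result[mask] = value
    return result


def fill_copy(result, mask, value):
    # Leaves ``result`` untouched and returns a new array.
    out = result.copy()
    out[mask] = value
    return out


arr = np.array([1.0, 2.0, 3.0])
mask = np.array([True, False, False])

same = fill_inplace(arr, mask, np.nan)   # ``same is arr``
original = np.array([1.0, 2.0, 3.0])
fresh = fill_copy(original, mask, np.nan)  # ``original`` is unchanged
```

If the in-place form is kept, the docstring needs to say so explicitly, since a caller holding a reference to `result` will see it change.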

jreback · 2018-01-24T01:14:51Z

pandas/tests/indexes/test_numeric.py

@@ -17,6 +17,11 @@

 from pandas.tests.indexes.common import Base

+# For testing division by (or of) zero for Series with length 5, this


these should be fixtures

jreback · 2018-01-24T01:15:28Z

pandas/util/testing.py

@@ -1964,6 +1964,32 @@ def add_nans_panel4d(panel4d):
    return panel4d


+def gen_zeros(arr_len):


move this to conftest as it's just a fixture

This is going to be re-used in Series tests a few PRs from now. Which conftest should it go into?

you can put it in the top-level one, pandas/conftest.py. I would really like to move more things there.

How do we pass the arr_len arg to it?

you don't. you move this as a function, then create it as a fixture.

So I move the function to conftest, then from test_numeric create a fixture like (?):

    @pytest.fixture(params=[x for x in gen_zeros(5)
                            if not isinstance(x, Series)])
    def zero(request):
        # For testing division by (or of) zero for Series with length 5, this
        # gives several scalar-zeros and length-5 vector-zeros
        return request.param

Do we import conftest? I haven't seen that elsewhere. What's the upside of this over just having zeros in the module namespace?

yes. you don't need to import conftest. you can put this in the module only if it's only used there. the point of conftest is to share.

since this is only used in 1 place, just move it to the test file itself and make it into a fixture
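
The pattern described in this thread, keeping the generator a plain function and wrapping its output in a parametrized fixture, might look like the sketch below. `make_zeros` is a hypothetical simplification of `gen_zeros` that uses only NumPy. pytest discovers fixtures defined in a `conftest.py` automatically, so test modules do not import it.

```python
import numpy as np
import pytest


def make_zeros(arr_len):
    # Vector zeros of the requested length plus scalar zeros.
    dtypes = [np.int64, np.uint64, np.float64]
    zeros = [np.zeros(arr_len, dtype=dtype) for dtype in dtypes]
    zeros.extend([np.array(0, dtype=dtype) for dtype in dtypes])
    zeros.extend([0, 0.0])
    return zeros


# The length is fixed when the fixture is defined, so the fixture
# itself takes no ``arr_len`` argument.
@pytest.fixture(params=make_zeros(5))
def zero(request):
    return request.param


def test_is_zero(zero):
    # Runs once per entry in ``make_zeros(5)``.
    assert np.all(zero == 0)
```

A test that names `zero` as an argument is run once per parameter, which is what removes the need for a `parametrize` decorator on each test function.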

jreback · 2018-01-24T01:17:38Z

this needs a sub-section in whatsnew to show what has changed (e.g. Int and Range Index with / 0 and such)

codecov · 2018-01-24T18:00:47Z

Codecov Report

Merging #19347 into master will decrease coverage by 0.02%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master   #19347      +/-   ##
==========================================
- Coverage   91.69%   91.67%   -0.03%     
==========================================
  Files         148      148              
  Lines       48561    48586      +25     
==========================================
+ Hits        44530    44543      +13     
- Misses       4031     4043      +12

| Flag | Coverage Δ | |
|---|---|---|
| #multiple | `90.04% <100%> (-0.02%)` | ⬇️ |
| #single | `41.71% <22.5%> (-0.01%)` | ⬇️ |

| Impacted Files | Coverage Δ | |
|---|---|---|
| pandas/core/indexes/range.py | `95.65% <100%> (-0.06%)` | ⬇️ |
| pandas/core/indexes/base.py | `96.45% <100%> (ø)` | ⬆️ |
| pandas/core/missing.py | `85.9% <100%> (+1.11%)` | ⬆️ |
| pandas/plotting/_converter.py | `65.22% <0%> (-1.74%)` | ⬇️ |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 69cd5fb...be1e2e1.

jreback · 2018-01-25T01:20:09Z

doc/source/whatsnew/v0.23.0.txt

+Index Division By Zero Fills Correctly
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Division operations on ``Index`` and subclasses will now fill positive / 0 with ``np.inf``, negative / 0 with ``-np.inf``, and 0 / 0 with ``np.nan``.  This matches existing Series behavior.


double backticks on Series. Add the issue number, can be multiple (e.g. this PR number and the xref issue)

jreback · 2018-01-25T01:20:56Z

doc/source/whatsnew/v0.23.0.txt

+    In [6]: index = pd.UInt64Index([0, 1])
+    In [7]: index / np.array([0, 0], dtype=np.uint64)
+    Out[7]: UInt64Index([0, 0], dtype='uint64')
+


switch the order of these blocks, we always write previous, then current.

jreback · 2018-01-25T01:21:10Z

doc/source/whatsnew/v0.23.0.txt

+
+Current Behavior:
+
+.. code-block:: ipython


current should always be ipython, so the code executes.

jreback · 2018-01-25T01:21:44Z

doc/source/whatsnew/v0.23.0.txt

+
+Previous Behavior:
+
+.. code-block:: ipython


code blocks here are fine, but you need to actually show the executed code as in an ipython session.

Not totally clear on the difference. I'll be pushing shortly with fixes to most other comments and a best-guess as to these.

jreback · 2018-01-25T12:09:17Z

doc/source/whatsnew/v0.23.0.txt

+
+.. code-block:: ipython
+
+   index = pd.Index([-1, 0, 1])


a code-block needs to have a complete section copy pasted. refer to the sphinx docs

jreback · 2018-01-25T12:09:36Z

doc/source/whatsnew/v0.23.0.txt

+
+.. ipython:: python
+
+    index = pd.Index([-1, 0, 1])


break this into 2 blocks and give some comments

jreback · 2018-01-25T12:10:29Z

pandas/core/indexes/range.py

-                    results = op(self, other)
-                return Index(results, **attrs)
+                except (ValueError, TypeError, AttributeError,
+                        ZeroDivisionError):


when does this raise ZeroDivisionError? isn't the point of the errstate to catch that?

> when does this raise ZeroDivisionError?

ATM pd.RangeIndex(0, 5) / 0 raises ZeroDivisionError.

> isn't the point of the errstate to catch that?

I think errstate is to suppress warnings, but not really sure.

hmm, that seems wrong, but I guess can cover later
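
On the `errstate` question: `np.errstate` controls how NumPy reports floating-point errors inside its own array operations; it does not intercept exceptions raised by built-in Python arithmetic. A small illustration, assuming (this thread does not confirm it) that the `ZeroDivisionError` comes from an operation on plain Python integers such as the range's step:

```python
import numpy as np

with np.errstate(all='ignore'):
    # NumPy array division: no warning is emitted, the result is inf.
    arr_result = np.array([1.0, 2.0]) / 0

    # Plain Python ints: errstate has no effect, the exception is raised.
    try:
        1 // 0
        raised = False
    except ZeroDivisionError:
        raised = True
```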

jbrockmendel · 2018-01-25T18:12:04Z

> the point of conftest is to share.

... isn't that also the point of pandas.util.testing?

jbrockmendel · 2018-01-25T18:13:31Z

> use getattr

Huh?

jbrockmendel · 2018-01-25T19:50:58Z

fastparquet errors on appveyor

jreback · 2018-01-26T12:33:55Z

doc/source/whatsnew/v0.23.0.txt

+    In [10]: pd.RangeIndex(1, 5) / 0
+    ---------------------------------------------------------------------------
+    ZeroDivisionError                         Traceback (most recent call last)
+    <ipython-input-10-4c5e91d516f3> in <module>()


you don't need the whole traceback with line numbers, just the error line (213) itself

jreback · 2018-01-26T12:34:23Z

doc/source/whatsnew/v0.23.0.txt

+
+.. code-block:: ipython
+
+    In [6]: index = pd.Index([-1, 0, 1])


create this as an Int64Index

jreback · 2018-01-26T12:35:56Z

doc/source/whatsnew/v0.23.0.txt

+    index = pd.Index([-1, 0, 1])
+    index / 0
+
+    # The result of division by zero should not depend on whether the zero is int or float


remove this case (or add above as well). the point of the previous/current display is to make it easy to map up what is changing.

jreback · 2018-01-26T12:37:05Z

pandas/core/missing.py

+        nan_mask = (zmask & (x == 0)).ravel()
+        neginf_mask = (zmask & (x < 0)).ravel()
+        posinf_mask = (zmask & (x > 0)).ravel()
+


add some comments here
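
A commented sketch of what the three masks select, written against plain NumPy. `fill_zero_division` is a hypothetical name and a simplification, not the function under review; unlike that function, it always returns a new float array.

```python
import numpy as np


def fill_zero_division(x, y, result):
    """Return a float copy of ``result`` with the x / 0 entries filled."""
    x = np.broadcast_to(np.asarray(x), result.shape)
    # Positions where the divisor is zero.
    zmask = np.broadcast_to(np.asarray(y) == 0, result.shape)

    filled = result.astype('float64')  # astype copies by default
    filled[zmask & (x == 0)] = np.nan    # 0 / 0
    filled[zmask & (x < 0)] = -np.inf    # negative / 0
    filled[zmask & (x > 0)] = np.inf     # positive / 0
    return filled


x = np.array([1, 0, -1])
y = np.array([0, 0, 0])
with np.errstate(all='ignore'):
    raw = x // y  # integer division by zero does not raise in NumPy

filled = fill_zero_division(x, y, raw)
```

Casting to float is required because integer dtypes cannot hold `inf` or `nan`.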

jreback · 2018-01-26T12:37:21Z

pandas/tests/indexes/test_numeric.py

@@ -17,6 +17,11 @@

 from pandas.tests.indexes.common import Base

+# For testing division by (or of) zero for Series with length 5, this
+# gives several scalar-zeros and length-5 vector-zeros
+zeros = tm.gen_zeros(5)


just make this a fixture, then you don't need to add a parameterize on each function

can you do this

jreback · 2018-01-31T12:25:00Z

doc/source/whatsnew/v0.23.0.txt

+Index Division By Zero Fills Correctly
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Division operations on ``Index`` and subclasses will now fill positive / 0 with ``np.inf``, negative / 0 with ``-np.inf``, and 0 / 0 with ``np.nan``.  This matches existing ``Series`` behavior. (:issue:`19322`, :issue:`19347`)


this is a confusing sentence, it's not obvious that 'positive / 0' means a positive number divided by zero. maybe use double-backticks around positive / 0

jbrockmendel · 2018-02-01T00:42:13Z

Made zeros a fixture in test_numeric as requested, but that broke tests in test_range b/c of pytest magic. Just pushed a commit that reverted to having zeros be a list in test_numeric namespace.

jreback · 2018-02-01T13:05:09Z

pandas/tests/indexes/test_numeric.py

@@ -18,6 +18,16 @@
 from pandas.tests.indexes.common import Base


+# For testing division by (or of) zero for Series with length 5, this


maybe you misunderstand what I want here

    # gives several scalar-zeros and length-5 vector-zeros
    zeros = [box([0] * 5, dtype=dtype)
             for box in [pd.Index, np.array]
             for dtype in [np.int64, np.uint64, np.float64]]
    zeros.extend([np.array(0, dtype=dtype)
                  for dtype in [np.int64, np.uint64, np.float64]])
    zeros.extend([0, 0.0, long(0)])  # long exists only on Python 2

    @pytest.fixture(params=zeros)
    def zero(request):
        return request.param

then remove the @pytest.mark.parametrize on all the test functions

See previous comment:

> Made zeros a fixture in test_numeric as requested, but that broke tests in test_range b/c of pytest magic

AFAICT the problem is caused by the fixture being in the namespace for test_numeric but not for test_range, which has a class that inherits from the test_numeric.Numeric (which defines the relevant tests). I guess we could get around that by putting this in a shared conftest file, but all that accomplishes is making it harder for a reader to find out where things are defined.

https://travis-ci.org/pandas-dev/pandas/jobs/335770424

so move it to the conftest, this is quite standard practice and avoids code duplication.

jbrockmendel · 2018-02-05T20:05:42Z

@jreback I think the fixtures are all as requested. LMK.

jreback · 2018-02-06T01:26:58Z

doc/source/whatsnew/v0.23.0.txt

+
+    In [7]: index / 0
+    Out[7]: Int64Index([0, 0, 0], dtype='int64')
+


check this after it's built to make sure it looks ok.

jreback · 2018-02-06T01:27:24Z

pandas/core/missing.py

+    >>> result      # raw numpy result does not fill division by zero
+    array([0, 0, 0])
+    >>> mask_zero_div_zero(x, y, result)
+    array([ inf,  nan, -inf])


pls add to the list to consolidate this module (e.g. the existing fill_zeros)

jreback · 2018-02-06T01:31:38Z

thanks. a couple of comments for the future.

Related: pandas-dev#19336

Author: Brock Mendel <[email protected]>

Closes pandas-dev#19347 from jbrockmendel/div_zero2 and squashes the following commits:

be1e2e1 [Brock Mendel] move fixture to conftest
64b0c08 [Brock Mendel] Merge branch 'master' of https://github.com/pandas-dev/pandas into div_zero2
aa969f8 [Brock Mendel] Merge branch 'master' of https://github.com/pandas-dev/pandas into div_zero2
000aefd [Brock Mendel] fix long again
9de356a [Brock Mendel] revert fixture to fix test_range failures
b8cf21d [Brock Mendel] flake8 remove unused import
afedba9 [Brock Mendel] whatsnew clarification
b51c2e1 [Brock Mendel] fixturize
37efd51 [Brock Mendel] make zero a fixture
965f721 [Brock Mendel] Merge branch 'master' of https://github.com/pandas-dev/pandas into div_zero2
d648ef6 [Brock Mendel] requested edits
1ef3a6c [Brock Mendel] Merge branch 'master' of https://github.com/pandas-dev/pandas into div_zero2
78de1a4 [Brock Mendel] Merge branch 'master' of https://github.com/pandas-dev/pandas into div_zero2
0277d9f [Brock Mendel] add ipython output to whatsnew
5d7e3ea [Brock Mendel] Merge branch 'master' of https://github.com/pandas-dev/pandas into div_zero2
ea75c3c [Brock Mendel] ipython block
6fc61bd [Brock Mendel] elaborate docstring
ca3bf42 [Brock Mendel] Whatsnew section
cd54349 [Brock Mendel] move dispatch_missing to core.missing
06df02a [Brock Mendel] py3 fix
84c74c5 [Brock Mendel] remove operator.div for py3
6acc2f7 [Brock Mendel] fix missing import
e0e89b9 [Brock Mendel] fix and and tests for divmod
969f342 [Brock Mendel] fix and test index division by zero

fix and test index division by zero

969f342

jbrockmendel commented Jan 22, 2018

jbrockmendel added 4 commits January 22, 2018 09:51

fix and and tests for divmod

e0e89b9

fix missing import

6acc2f7

remove operator.div for py3

84c74c5

py3 fix

06df02a

jreback requested changes Jan 24, 2018

jreback added the Dtype Conversions (unexpected or buggy dtype conversions) and Numeric Operations (arithmetic, comparison, and logical operations) labels Jan 24, 2018

jbrockmendel added 2 commits January 23, 2018 19:20

move dispatch_missing to core.missing

cd54349

Whatsnew section

ca3bf42

jreback requested changes Jan 25, 2018

jbrockmendel added 2 commits January 24, 2018 19:33

elaborate docstring

6fc61bd

ipython block

ea75c3c

jreback requested changes Jan 25, 2018

jbrockmendel closed this Jan 25, 2018

jbrockmendel reopened this Jan 25, 2018

jbrockmendel added 2 commits January 25, 2018 10:15

Merge branch 'master' of https://github.com/pandas-dev/pandas into div_zero2

5d7e3ea

add ipython output to whatsnew

0277d9f

jreback requested changes Jan 26, 2018

jbrockmendel added 3 commits January 27, 2018 16:51

Merge branch 'master' of https://github.com/pandas-dev/pandas into div_zero2

78de1a4

Merge branch 'master' of https://github.com/pandas-dev/pandas into div_zero2

1ef3a6c

requested edits

d648ef6

jreback requested changes Jan 31, 2018

jbrockmendel added 6 commits January 31, 2018 08:09

Merge branch 'master' of https://github.com/pandas-dev/pandas into div_zero2

965f721

make zero a fixture

37efd51

fixturize

b51c2e1

whatsnew clarification

afedba9

flake8 remove unused import

b8cf21d

revert fixture to fix test_range failures

9de356a

fix long again

000aefd

jreback requested changes Feb 1, 2018

Merge branch 'master' of https://github.com/pandas-dev/pandas into div_zero2

aa969f8

jreback requested changes Feb 2, 2018

jbrockmendel added 2 commits February 2, 2018 08:27

Merge branch 'master' of https://github.com/pandas-dev/pandas into div_zero2

64b0c08

move fixture to conftest

be1e2e1

jreback reviewed Feb 6, 2018

jreback added this to the 0.23.0 milestone Feb 6, 2018

jreback approved these changes Feb 6, 2018

jreback closed this in ed10bf6 Feb 6, 2018

jbrockmendel deleted the div_zero2 branch February 6, 2018 04:44
