DOC:Remove hard-coded examples from _flex_doc_SERIES (#24589) #25524

danielplawrence
Contributor

Initial work on #24589

  • Removes the hard-coded examples from _flex_doc_SERIES.
  • Adds a separate example string for each op (_*_example_SERIES). At this stage I've just copied the examples that were in the _flex_doc_SERIES template (add, sub, mul, div).
  • Modifies _make_flex_doc to format the _flex_doc_SERIES template with an example string from op_desc['series_examples'].
  • Adds a reference to each example under _op_descriptions['series_examples'], allowing them to be picked up in _make_flex_doc.

Ops not in [add, sub, mul, div] will return their docstring with no examples in this revision.
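
The approach described above can be sketched roughly as follows. This is a simplified illustration, not the code in pandas/core/ops.py: the template text, the example string, and the contents of `_op_descriptions` are abbreviated stand-ins, and only the names themselves come from this description.

```python
# Abbreviated stand-in for the shared Series docstring template.
_flex_doc_SERIES = """
Return {desc} of series and other, element-wise (binary operator {op_name}).
{examples}"""

# One example string per op; the content here is shortened for illustration.
_add_example_SERIES = """
Examples
--------
>>> a.add(b, fill_value=0)
"""

# Each op description carries a reference to its example (or None).
_op_descriptions = {
    "add": {"desc": "Addition", "series_examples": _add_example_SERIES},
    "mod": {"desc": "Modulo", "series_examples": None},  # no example yet
}


def _make_flex_doc(op_name):
    """Fill the template with the op's description and its example, if any."""
    op_desc = _op_descriptions[op_name]
    examples = op_desc["series_examples"] or ""
    return _flex_doc_SERIES.format(
        desc=op_desc["desc"], op_name=op_name, examples=examples)


add_doc = _make_flex_doc("add")
mod_doc = _make_flex_doc("mod")
```

In this sketch, `add_doc` contains the Examples section while `mod_doc` contains only the summary line, which mirrors the behaviour described for ops that have no example yet.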

@codecov

codecov bot commented Mar 3, 2019

Codecov Report

Merging #25524 into master will increase coverage by <.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master   #25524      +/-   ##
==========================================
+ Coverage   91.75%   91.76%   +<.01%     
==========================================
  Files         173      173              
  Lines       52960    52964       +4     
==========================================
+ Hits        48595    48600       +5     
+ Misses       4365     4364       -1

| Flag | Coverage Δ |
|---|---|
| #multiple | 90.33% <100%> (ø) ⬆️ |
| #single | 41.71% <100%> (-0.01%) ⬇️ |

| Impacted Files | Coverage Δ |
|---|---|
| pandas/core/ops.py | 91.8% <100%> (+0.04%) ⬆️ |
| pandas/util/testing.py | 87.66% <0%> (+0.09%) ⬆️ |

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 42b4c97...2f6173c. Read the comment docs.

@codecov

codecov bot commented Mar 3, 2019

Codecov Report

Merging #25524 into master will decrease coverage by 0.49%.
The diff coverage is 100%.

@@            Coverage Diff            @@
##           master   #25524     +/-   ##
=========================================
- Coverage   91.75%   91.26%   -0.5%     
=========================================
  Files         173      173             
  Lines       52960    52970     +10     
=========================================
- Hits        48595    48341    -254     
- Misses       4365     4629    +264

| Flag | Coverage Δ |
|---|---|
| #multiple | 89.83% <100%> (-0.5%) ⬇️ |
| #single | 41.72% <100%> (+0.01%) ⬆️ |

| Impacted Files | Coverage Δ |
|---|---|
| pandas/core/ops.py | 91.74% <100%> (-0.03%) ⬇️ |
| pandas/core/panel.py | 38.56% <0%> (-33.19%) ⬇️ |
| pandas/core/sparse/series.py | 93.3% <0%> (-2.24%) ⬇️ |
| pandas/core/indexing.py | 90.88% <0%> (-1.41%) ⬇️ |
| pandas/core/internals/managers.py | 93.92% <0%> (-0.92%) ⬇️ |
| pandas/core/generic.py | 93.64% <0%> (-0.53%) ⬇️ |
| pandas/core/frame.py | 96.79% <0%> (-0.06%) ⬇️ |
| pandas/core/series.py | 93.68% <0%> (ø) ⬆️ |

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 42b4c97...6d7ed91. Read the comment docs.

d 1.0
e NaN
dtype: float64
>>> a.multiply(b)

Maybe provide a fill_value here, to be consistent with the others?

@TomAugspurger
Contributor

Thanks for working on this. I think it's OK that other ops like mod (temporarily) lack examples.

Two questions:

  1. Maybe add a fill_value for Series.multiply?
  2. How should we test this? It'd be nice if we could run the doctests, but I don't know how easy that is given how we define these methods. Do you have any thoughts?
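
One possible way to exercise examples that live in a generated docstring is to hand the assembled string to the standard library's doctest machinery directly. The sketch below uses only `doctest`; the `make_method` and `run_generated_doctest` helpers are hypothetical and stand in for however the real methods are built.

```python
import doctest


def make_method(name, example):
    """Build a function whose docstring is assembled at runtime."""
    def method(a, b):
        return a + b
    method.__name__ = name
    method.__doc__ = "Add two numbers.\n\nExamples\n--------\n" + example
    return method


def run_generated_doctest(func):
    """Run the examples found in func.__doc__ and return the doctest results."""
    parser = doctest.DocTestParser()
    # The globals dict makes the function available to its own examples.
    test = parser.get_doctest(
        func.__doc__, {func.__name__: func}, func.__name__, None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    return runner.run(test)


add = make_method("add", ">>> add(1, 2)\n3\n")
results = run_generated_doctest(add)
```

`results.failed` is 0 when every example in the generated docstring produces the documented output, so an incorrect example shows up as a non-zero failure count.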

@TomAugspurger TomAugspurger added this to the 0.25.0 milestone Mar 3, 2019
@pep8speaks

pep8speaks commented Mar 4, 2019

Hello @danielplawrence! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2019-03-05 21:10:44 UTC

@danielplawrence
Contributor Author

Brief update:

  • I've gone through and added examples for several other ops. We now have:
    • add
    • sub
    • mul
    • mod
    • pow
    • truediv
    • floordiv
  • These all use the same input arrays of 1s and NaN that were in the previous docstring.
  • Added fill_value=0 to each example, as suggested above.
  • Didn't do divmod, as I couldn't get it to work. The built-in divmod() works fine, but Series.divmod() fails with a ValueError; see the example below.
Series.divmod examples
>>> a = pd.Series([1, 1, 1, np.nan], index=['a', 'b', 'c', 'd'])
>>> b = pd.Series([1, np.nan, 1, np.nan], index=['a', 'b', 'd', 'e'])
>>> divmod(a,b)
(a    1.0
b    NaN
c    NaN
d    NaN
e    NaN
dtype: float64, a    0.0
b    NaN
c    NaN
d    NaN
e    NaN
dtype: float64)
>>> a.divmod(b)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/danlaw/Projects/pandas/pandas/core/ops.py", line 2046, in flex_wrapper
    return self._binop(other, op, level=level, fill_value=fill_value)
  File "/Users/danlaw/Projects/pandas/pandas/core/series.py", line 2522, in _binop
    result = self._constructor(result, index=new_index, name=name)
  File "/Users/danlaw/Projects/pandas/pandas/core/series.py", line 250, in __init__
    .format(val=len(data), ind=len(index)))
ValueError: Length of passed values is 2, index implies 5
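
The traceback above appears consistent with a wrapper that assumes the operator returns a single array-like result. divmod returns a pair (quotient, remainder), so a constructor that receives the pair sees 2 values for 5 index labels. The toy below reproduces that shape mismatch with plain lists; `elementwise_divmod` and `wrap_result` are hypothetical helpers, not pandas internals.

```python
def elementwise_divmod(left, right):
    """Return (quotients, remainders), the pair shape that divmod produces."""
    pairs = [divmod(x, y) for x, y in zip(left, right)]
    return [q for q, _ in pairs], [r for _, r in pairs]


def wrap_result(values, index):
    """Hypothetical constructor: pair each value with its index label."""
    if len(values) != len(index):
        raise ValueError(
            "Length of passed values is {}, index implies {}".format(
                len(values), len(index)))
    return dict(zip(index, values))


index = ["a", "b", "c", "d", "e"]
left = [7, 8, 9, 10, 11]
right = [2, 3, 4, 5, 6]

result = elementwise_divmod(left, right)

# Wrapping the 2-tuple as if it were one array fails like the traceback.
try:
    wrap_result(result, index)
    message = None
except ValueError as err:
    message = str(err)

# Wrapping each element of the tuple separately succeeds.
quotient, remainder = (wrap_result(part, index) for part in result)
```

Under this reading, a fix would need the wrapper to construct one result per element of the returned tuple rather than a single result.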
  • Checked the doctests with validate_docstrings.py. They validate correctly and fail if I introduce incorrect examples.

  • validate_docstrings.py raised several errors (full output attached below):

    • It complained about the spaces introduced by using a format string at the end of the template. I've fixed this so that _make_flex_doc() simply concatenates the base doc with the examples, which removes trailing whitespace from the cases with no examples.
    • These errors remain:
      a) Parameters {axis} not documented
      It seems that Series ops accept an axis parameter, which raises 'No axis named a for object type <class 'type'>' for values other than 0. Should this parameter be removed from Series? It doesn't seem to make sense to document it unless I'm missing something. I've added an example at the bottom of this comment.
      b) Parameter "other" has no description
      This appears to be a false positive.
      c) Missing description for See Also "Series.rpow" reference
      False positive?
  • I see that Azure Pipelines reported a linting failure (link). I'll look into it ASAP.
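
To illustrate the whitespace fix mentioned in the list above, here is a minimal sketch of the difference between a trailing placeholder and plain concatenation. The `base_doc` text and the `build_doc` helper are assumptions made for the example, not the code in this PR.

```python
base_doc = "Return Modulo of series and other, element-wise."

# Variant 1: a placeholder at the end of the template. With no examples,
# the surrounding newlines are still emitted and the docstring ends in
# trailing whitespace.
template = base_doc + "\n\n{examples}\n"
formatted = template.format(examples="")


# Variant 2: concatenate only when there is something to append.
def build_doc(base, examples=None):
    if examples:
        return base + "\n\n" + examples
    return base


concatenated = build_doc(base_doc)
with_examples = build_doc(base_doc, ">>> a.mod(b, fill_value=0)")
```

With concatenation, an op without examples gets exactly the base docstring, so there is nothing for the validator to flag at the end of the text.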

Full output of validate_docstrings.py
(pandas-dev) 8c8590165d94% for op in add sub mul mod pow truediv floordiv
do
python3 scripts/validate_docstrings.py pandas.Series.$op
done

################################################################################
######################## Docstring (pandas.Series.add) ########################
################################################################################

Return Addition of series and other, element-wise (binary operator add).

Equivalent to series + other, but with support to substitute a fill_value for
missing data in one of the inputs.

Parameters
----------
other : Series or scalar value
fill_value : None or float value, default None (NaN)
    Fill existing missing (NaN) values, and any new element needed for
    successful Series alignment, with this value before computation.
    If data in both corresponding Series locations is missing
    the result will be missing.
level : int or name
    Broadcast across a level, matching Index values on the
    passed MultiIndex level.

Returns
-------
Series
    The result of the operation.

See Also
--------
Series.radd

Examples
--------
>>> a = pd.Series([1, 1, 1, np.nan], index=['a', 'b', 'c', 'd'])
>>> a
a    1.0
b    1.0
c    1.0
d    NaN
dtype: float64
>>> b = pd.Series([1, np.nan, 1, np.nan], index=['a', 'b', 'd', 'e'])
>>> b
a    1.0
b    NaN
d    1.0
e    NaN
dtype: float64
>>> a.add(b, fill_value=0)
a    2.0
b    1.0
c    1.0
d    1.0
e    NaN
dtype: float64

################################################################################
################################## Validation ##################################
################################################################################

3 Errors found:
Parameters {axis} not documented
Parameter "other" has no description
Missing description for See Also "Series.radd" reference

################################################################################
######################## Docstring (pandas.Series.sub) ########################
################################################################################

Return Subtraction of series and other, element-wise (binary operator sub).

Equivalent to series - other, but with support to substitute a fill_value for
missing data in one of the inputs.

Parameters
----------
other : Series or scalar value
fill_value : None or float value, default None (NaN)
    Fill existing missing (NaN) values, and any new element needed for
    successful Series alignment, with this value before computation.
    If data in both corresponding Series locations is missing
    the result will be missing.
level : int or name
    Broadcast across a level, matching Index values on the
    passed MultiIndex level.

Returns
-------
Series
    The result of the operation.

See Also
--------
Series.rsub

Examples
--------
>>> a = pd.Series([1, 1, 1, np.nan], index=['a', 'b', 'c', 'd'])
>>> a
a    1.0
b    1.0
c    1.0
d    NaN
dtype: float64
>>> b = pd.Series([1, np.nan, 1, np.nan], index=['a', 'b', 'd', 'e'])
>>> b
a    1.0
b    NaN
d    1.0
e    NaN
dtype: float64
>>> a.subtract(b, fill_value=0)
a    0.0
b    1.0
c    1.0
d   -1.0
e    NaN
dtype: float64

################################################################################
################################## Validation ##################################
################################################################################

3 Errors found:
Parameters {axis} not documented
Parameter "other" has no description
Missing description for See Also "Series.rsub" reference

################################################################################
######################## Docstring (pandas.Series.mul) ########################
################################################################################

Return Multiplication of series and other, element-wise (binary operator mul).

Equivalent to series * other, but with support to substitute a fill_value for
missing data in one of the inputs.

Parameters
----------
other : Series or scalar value
fill_value : None or float value, default None (NaN)
    Fill existing missing (NaN) values, and any new element needed for
    successful Series alignment, with this value before computation.
    If data in both corresponding Series locations is missing
    the result will be missing.
level : int or name
    Broadcast across a level, matching Index values on the
    passed MultiIndex level.

Returns
-------
Series
    The result of the operation.

See Also
--------
Series.rmul

Examples
--------
>>> a = pd.Series([1, 1, 1, np.nan], index=['a', 'b', 'c', 'd'])
>>> a
a    1.0
b    1.0
c    1.0
d    NaN
dtype: float64
>>> b = pd.Series([1, np.nan, 1, np.nan], index=['a', 'b', 'd', 'e'])
>>> b
a    1.0
b    NaN
d    1.0
e    NaN
dtype: float64
>>> a.multiply(b, fill_value=0)
a    1.0
b    0.0
c    0.0
d    0.0
e    NaN
dtype: float64

################################################################################
################################## Validation ##################################
################################################################################

3 Errors found:
Parameters {axis} not documented
Parameter "other" has no description
Missing description for See Also "Series.rmul" reference

################################################################################
######################## Docstring (pandas.Series.mod) ########################
################################################################################

Return Modulo of series and other, element-wise (binary operator mod).

Equivalent to series % other, but with support to substitute a fill_value for
missing data in one of the inputs.

Parameters
----------
other : Series or scalar value
fill_value : None or float value, default None (NaN)
    Fill existing missing (NaN) values, and any new element needed for
    successful Series alignment, with this value before computation.
    If data in both corresponding Series locations is missing
    the result will be missing.
level : int or name
    Broadcast across a level, matching Index values on the
    passed MultiIndex level.

Returns
-------
Series
    The result of the operation.

See Also
--------
Series.rmod

Examples
--------
>>> a = pd.Series([1, 1, 1, np.nan], index=['a', 'b', 'c', 'd'])
>>> a
a    1.0
b    1.0
c    1.0
d    NaN
dtype: float64
>>> b = pd.Series([1, np.nan, 1, np.nan], index=['a', 'b', 'd', 'e'])
>>> b
a    1.0
b    NaN
d    1.0
e    NaN
dtype: float64
>>> a.mod(b, fill_value=0)
a    0.0
b    NaN
c    NaN
d    0.0
e    NaN
dtype: float64

################################################################################
################################## Validation ##################################
################################################################################

3 Errors found:
Parameters {axis} not documented
Parameter "other" has no description
Missing description for See Also "Series.rmod" reference

################################################################################
######################## Docstring (pandas.Series.pow) ########################
################################################################################

Return Exponential power of series and other, element-wise (binary operator pow).

Equivalent to series ** other, but with support to substitute a fill_value for
missing data in one of the inputs.

Parameters
----------
other : Series or scalar value
fill_value : None or float value, default None (NaN)
    Fill existing missing (NaN) values, and any new element needed for
    successful Series alignment, with this value before computation.
    If data in both corresponding Series locations is missing
    the result will be missing.
level : int or name
    Broadcast across a level, matching Index values on the
    passed MultiIndex level.

Returns
-------
Series
    The result of the operation.

See Also
--------
Series.rpow

Examples
--------
>>> a = pd.Series([1, 1, 1, np.nan], index=['a', 'b', 'c', 'd'])
>>> a
a    1.0
b    1.0
c    1.0
d    NaN
dtype: float64
>>> b = pd.Series([1, np.nan, 1, np.nan], index=['a', 'b', 'd', 'e'])
>>> b
a    1.0
b    NaN
d    1.0
e    NaN
dtype: float64
>>> a.pow(b, fill_value=0)
a    1.0
b    1.0
c    1.0
d    0.0
e    NaN
dtype: float64

################################################################################
################################## Validation ##################################
################################################################################

3 Errors found:
Parameters {axis} not documented
Parameter "other" has no description
Missing description for See Also "Series.rpow" reference

################################################################################
###################### Docstring (pandas.Series.truediv) ######################
################################################################################

Return Floating division of series and other, element-wise (binary operator truediv).

Equivalent to series / other, but with support to substitute a fill_value for
missing data in one of the inputs.

Parameters
----------
other : Series or scalar value
fill_value : None or float value, default None (NaN)
    Fill existing missing (NaN) values, and any new element needed for
    successful Series alignment, with this value before computation.
    If data in both corresponding Series locations is missing
    the result will be missing.
level : int or name
    Broadcast across a level, matching Index values on the
    passed MultiIndex level.

Returns
-------
Series
    The result of the operation.

See Also
--------
Series.rtruediv

Examples
--------
>>> a = pd.Series([1, 1, 1, np.nan], index=['a', 'b', 'c', 'd'])
>>> a
a    1.0
b    1.0
c    1.0
d    NaN
dtype: float64
>>> b = pd.Series([1, np.nan, 1, np.nan], index=['a', 'b', 'd', 'e'])
>>> b
a    1.0
b    NaN
d    1.0
e    NaN
dtype: float64
>>> a.divide(b, fill_value=0)
a    1.0
b    inf
c    inf
d    0.0
e    NaN
dtype: float64

################################################################################
################################## Validation ##################################
################################################################################

3 Errors found:
Parameters {axis} not documented
Parameter "other" has no description
Missing description for See Also "Series.rtruediv" reference

################################################################################
###################### Docstring (pandas.Series.floordiv) ######################
################################################################################

Return Integer division of series and other, element-wise (binary operator floordiv).

Equivalent to series // other, but with support to substitute a fill_value for
missing data in one of the inputs.

Parameters
----------
other : Series or scalar value
fill_value : None or float value, default None (NaN)
    Fill existing missing (NaN) values, and any new element needed for
    successful Series alignment, with this value before computation.
    If data in both corresponding Series locations is missing
    the result will be missing.
level : int or name
    Broadcast across a level, matching Index values on the
    passed MultiIndex level.

Returns
-------
Series
    The result of the operation.

See Also
--------
Series.rfloordiv

Examples
--------
>>> a = pd.Series([1, 1, 1, np.nan], index=['a', 'b', 'c', 'd'])
>>> a
a    1.0
b    1.0
c    1.0
d    NaN
dtype: float64
>>> b = pd.Series([1, np.nan, 1, np.nan], index=['a', 'b', 'd', 'e'])
>>> b
a    1.0
b    NaN
d    1.0
e    NaN
dtype: float64
>>> a.floordiv(b, fill_value=0)
a    1.0
b    NaN
c    NaN
d    0.0
e    NaN
dtype: float64

################################################################################
################################## Validation ##################################
################################################################################

3 Errors found:
Parameters {axis} not documented
Parameter "other" has no description
Missing description for See Also "Series.rfloordiv" reference

Example ValueError from axis parameter
>>> a = pd.Series([1, 1, 1, np.nan], index=['a', 'b', 'c', 'd'])
>>> b = pd.Series([1, np.nan, 1, np.nan], index=['a', 'b', 'd', 'e'])
>>> a.add(b)
a    2.0
b    NaN
c    NaN
d    NaN
e    NaN
dtype: float64
>>> a.add(b,axis=0)
a    2.0
b    NaN
c    NaN
d    NaN
e    NaN
dtype: float64
>>> a.add(b,axis=1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/danlaw/Projects/pandas/pandas/core/ops.py", line 2044, in flex_wrapper
    self._get_axis_number(axis)
  File "/Users/danlaw/Projects/pandas/pandas/core/generic.py", line 361, in _get_axis_number
    .format(axis, type(cls)))
ValueError: No axis named 1 for object type <class 'type'>

doc = base_doc.format(desc=op_desc['desc'], op_name=op_name,
equiv=equiv, reverse=op_desc['reverse'])
doc_no_examples = base_doc.format(desc=op_desc['desc'],
op_name=op_name, equiv=equiv, reverse=op_desc['reverse'])

I think this is what failed the build. Should be aligned with the open paren on the previous line, or move desc=... down here.
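
For reference, the two layouts suggested in this comment look roughly like the following. The values assigned to `base_doc`, `op_desc`, `op_name`, and `equiv` are placeholders chosen so the snippet runs on its own; both calls produce the same string and differ only in how the continuation lines are indented.

```python
# Placeholder inputs so the snippet is self-contained.
base_doc = "Return {desc} ({op_name}); equivalent to {equiv}. See also {reverse}."
op_desc = {"desc": "Addition", "reverse": "radd"}
op_name = "add"
equiv = "series + other"

# Option 1: continuation lines aligned with the opening parenthesis.
doc_aligned = base_doc.format(desc=op_desc['desc'],
                              op_name=op_name, equiv=equiv,
                              reverse=op_desc['reverse'])

# Option 2: hanging indent, with no argument on the first line.
doc_hanging = base_doc.format(
    desc=op_desc['desc'], op_name=op_name,
    equiv=equiv, reverse=op_desc['reverse'])
```

Either layout follows the PEP 8 guidance on continuation lines, so a linter should accept both; which one to use is a matter of the project's preference.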

@jreback

jreback commented Mar 10, 2019

@TomAugspurger

@TomAugspurger
Contributor

Thanks @danielplawrence. The divmod failure is being worked on elsewhere, and documenting axis is out of scope for this PR.

@TomAugspurger TomAugspurger merged commit 3099773 into pandas-dev:master Mar 11, 2019