-
-
Notifications
You must be signed in to change notification settings - Fork 18.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DOC: update the pandas.DataFrame.hist docstring #20113
DOC: update the pandas.DataFrame.hist docstring #20113
Conversation
pandas/plotting/_core.py
Outdated
@@ -2130,48 +2130,70 @@ def hist_frame(data, column=None, by=None, grid=True, xlabelsize=None, | |||
""" | |||
Draw histogram of the DataFrame's series using matplotlib / pylab. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you remove the / pylab
. It's been deprecated for ~5 years now :)
pandas/plotting/_core.py
Outdated
@@ -2130,48 +2130,70 @@ def hist_frame(data, column=None, by=None, grid=True, xlabelsize=None, | |||
""" | |||
Draw histogram of the DataFrame's series using matplotlib / pylab. | |||
|
|||
A histogram is a representation of the distribution of numerical data. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Strike numerical as hist can handle non-numeric data as well.
pandas/plotting/_core.py
Outdated
@@ -2130,48 +2130,70 @@ def hist_frame(data, column=None, by=None, grid=True, xlabelsize=None, | |||
""" | |||
Draw histogram of the DataFrame's series using matplotlib / pylab. | |||
|
|||
A histogram is a representation of the distribution of numerical data. | |||
This function wraps the matplotlib histogram function for each serie in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
serie -> series.
pandas/plotting/_core.py
Outdated
ax : matplotlib axes object, default None | ||
Rotation of y axis labels. | ||
ax : Matplotlib axes object, default None | ||
The histogram axes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"The axes to plot the histogram on."
pandas/plotting/_core.py
Outdated
sharex : boolean, default True if ax is None else False | ||
In case subplots=True, share x axis and set some x axis labels to | ||
invisible; defaults to True if ax is None otherwise False if an ax | ||
is passed in; Be aware, that passing in both an ax and sharex=True | ||
will alter all x axis labels for all subplots in a figure! | ||
will alter all x axis labels for all subplots in a figure!. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can remove the .
since the period already ends in a punctuation character. Did the script complain about ending with a !
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the validator script complains:
Errors in parameters section
Parameter "sharex" description should finish with "."
I'm rewriting it to remove the !
.
pandas/plotting/_core.py
Outdated
figsize : tuple | ||
The size of the figure to create in inches by default | ||
The size of the figure to create in inches by default. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about
The size in inches of the figure to create. Uses the value in `matplotlib.rcParams` by default.
pandas/plotting/_core.py
Outdated
bins : integer or sequence, default 10 | ||
Number of histogram bins to be used. If an integer is given, bins + 1 | ||
bin edges are calculated and returned. If bins is a sequence, gives | ||
bin edges, including left edge of first bin and right edge of last | ||
bin. In this case, bins is returned unmodified. | ||
`**kwds` : other plotting keyword arguments | ||
To be passed to hist function | ||
kwds : Keyword Arguments |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I may be wrong, but I think we're just doing these as
kwds:
All other plotting ...
IOW no type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That raises some errors in the validator:
Errors in parameters section
Parameters {'kwds'} not documented
Unknown parameters {'kwds :'}
Parameter "kwds :" has no type
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Write kwds : optional
.
The validation script is wrong @datapythonista team is fixing it from London. Use always **kwargs
as parameter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, done!
pandas/plotting/_core.py
Outdated
|
||
Returns | ||
------- | ||
axes : matplotlib.AxesSubplot or np.array of them |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
np.array
-> numpy.ndarray
(np.array is the function, np.ndarray is the object).
8898b45
to
d7a1ecd
Compare
Fixed all isues (except the Thanks for the review! |
d7a1ecd
to
8f5c449
Compare
pandas/plotting/_core.py
Outdated
@@ -2128,50 +2128,74 @@ def hist_frame(data, column=None, by=None, grid=True, xlabelsize=None, | |||
xrot=None, ylabelsize=None, yrot=None, ax=None, sharex=False, | |||
sharey=False, figsize=None, layout=None, bins=10, **kwds): | |||
""" | |||
Draw histogram of the DataFrame's series using matplotlib / pylab. | |||
Draw histogram of the DataFrame's Series using matplotlib. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps this is written like this in other parts of the docs, but I personally find it easier to understand to say something like Draw histogram of the DataFrame's columns
, instead of DataFrame's Series
.
pandas/plotting/_core.py
Outdated
xrot : float, default None | ||
rotation of x axis labels | ||
Rotation of x axis labels. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add some explanation like "For example, a value of 90
displays the x labels rotated 90º clockwise".
pandas/plotting/_core.py
Outdated
yrot : float, default None | ||
rotation of y axis labels | ||
ax : matplotlib axes object, default None | ||
Rotation of y axis labels. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here, make clear what a value of 90 means, for example.
pandas/plotting/_core.py
Outdated
is passed in; Be aware, that passing in both an ax and sharex=True | ||
will alter all x axis labels for all subplots in a figure! | ||
is passed in. | ||
Be aware: passing in both an ax and sharex=True will alter all x axis |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd say "Note" instead of "Be aware", which feels a bit more agressive.
pandas/plotting/_core.py
Outdated
To be passed to hist function | ||
kwds : Keyword Arguments | ||
All other plotting keyword arguments to be passed to | ||
matplotlib's boxplot function. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Link to this function using :method: or :class:.
pandas/plotting/_core.py
Outdated
:context: close-figs | ||
|
||
>>> df = pd.DataFrame({ | ||
... 'length': [ 1.5, 0.5, 1.2, 0.9, 3], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mind the spacing: [ 1.5, ...
. In next lines, there are two spaces between the 0.2 and the 1.1.
pandas/plotting/_core.py
Outdated
.. plot:: | ||
:context: close-figs | ||
|
||
>>> df = pd.DataFrame({ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you write some introduction explaining in words what the code does?
pandas/plotting/_core.py
Outdated
... 'length': [ 1.5, 0.5, 1.2, 0.9, 3], | ||
... 'width': [ 0.7, 0.2, 0.15, 0.2, 1.1] | ||
... }, index= ['pig', 'rabbit', 'duck', 'chicken', 'horse']) | ||
>>> hist = df.hist(bins=3) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are a lot of arguments for this method. Can you write a couple more examples trying to show how do they behave?
8f5c449
to
8995dc3
Compare
pandas/plotting/_core.py
Outdated
|
||
A histogram is a representation of the distribution of data. | ||
This function wraps the matplotlib histogram function for each series in | ||
the DataFrame. It returns an array with a plot for each histogram. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you link to wikipedia (here or in Notes is ok)
pandas/plotting/_core.py
Outdated
|
||
A histogram is a representation of the distribution of data. | ||
This function wraps the matplotlib histogram function for each series in | ||
the DataFrame. It returns an array with a plot for each histogram. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"It returns an array with a plot for each histogram" is a bit misleading. What about "A histogram is plotted for every column of the DataFrame"?
pandas/plotting/_core.py
Outdated
@@ -2128,50 +2128,79 @@ def hist_frame(data, column=None, by=None, grid=True, xlabelsize=None, | |||
xrot=None, ylabelsize=None, yrot=None, ax=None, sharex=False, | |||
sharey=False, figsize=None, layout=None, bins=10, **kwds): | |||
""" | |||
Draw histogram of the DataFrame's series using matplotlib / pylab. | |||
Draw histogram of the DataFrame's columns using matplotlib. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As in other cases, I believe that we should delay the fact that it is using matplotlib under the hood to the extended description and get it out of the short one.
Codecov Report
@@ Coverage Diff @@
## master #20113 +/- ##
==========================================
- Coverage 91.72% 91.7% -0.03%
==========================================
Files 150 150
Lines 49156 49156
==========================================
- Hits 45090 45078 -12
- Misses 4066 4078 +12
Continue to review full report at Codecov.
|
Thanks @DZPM ! |
scripts/validate_docstrings.py pandas.DataFrame.hist
git diff upstream/master -u -- "*.py" | flake8 --diff
python doc/make.py --single pandas.DataFrame.hist