DOC: Fix quantile docstring #22906
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -7390,20 +7390,26 @@ def f(s): | |
def quantile(self, q=0.5, axis=0, numeric_only=True, | ||
interpolation='linear'): | ||
""" | ||
Return values at the given quantile over requested axis. | ||
Return value(s) at the given quantile over requested axis. | ||
|
||
This function returns the Series of 'q' quantile value(s) | ||
from the DataFrame, dividing data points into groups | ||
along `axis` axis | ||
In case of insufficient number of data points for clean division | ||
into groups, specify `interpolation` scheme to implement. | ||
|
||
Parameters | ||
---------- | ||
q : float or array-like, default 0.5 (50% quantile) | ||
0 <= q <= 1, the quantile(s) to compute | ||
axis : {0, 1, 'index', 'columns'} (default 0) | ||
0 or 'index' for row-wise, 1 or 'columns' for column-wise | ||
numeric_only : boolean, default True | ||
q : float or array-like, default 0.5 | ||
The quantile(s) to compute, | ||
should be a float between 0 and 1 (inclusive), | ||
0.5 is equivalent to calculate 50% quantile value ie the median. | ||
Review comment: Can you finish the lines to close to 79 characters long. Feels a bit weird to have these short lines. Can you use |
||
axis : {0 or 'index', 1 or 'columns'}, default 0 | ||
For row-wise : 0 or'index', for column-wise : 1 or 'columns'. | ||
Review comment: there is a typo here, but anyway this seems redundant with the type line. Can you check other docstrings with the |
||
numeric_only : bool, default True | ||
If False, the quantile of datetime and timedelta data will be | ||
computed as well | ||
computed as well. | ||
interpolation : {'linear', 'lower', 'higher', 'midpoint', 'nearest'} | ||
.. versionadded:: 0.18.0 | ||
|
||
This optional parameter specifies the interpolation method to use, | ||
when the desired quantile lies between two data points `i` and `j`: | ||
|
||
|
@@ -7416,46 +7422,58 @@ def quantile(self, q=0.5, axis=0, numeric_only=True, | |
|
||
Returns | ||
------- | ||
quantiles : Series or DataFrame | ||
|
||
- If ``q`` is an array, a DataFrame will be returned where the | ||
index is ``q``, the columns are the columns of self, and the | ||
values are the quantiles. | ||
- If ``q`` is a float, a Series will be returned where the | ||
index is the columns of self and the values are the quantiles. | ||
|
||
Examples | ||
-------- | ||
|
||
>>> df = pd.DataFrame(np.array([[1, 1], [2, 10], [3, 100], [4, 100]]), | ||
columns=['a', 'b']) | ||
>>> df.quantile(.1) | ||
a 1.3 | ||
b 3.7 | ||
dtype: float64 | ||
>>> df.quantile([.1, .5]) | ||
a b | ||
0.1 1.3 3.7 | ||
0.5 2.5 55.0 | ||
|
||
Specifying `numeric_only=False` will also compute the quantile of | ||
datetime and timedelta data. | ||
|
||
>>> df = pd.DataFrame({'A': [1, 2], | ||
'B': [pd.Timestamp('2010'), | ||
pd.Timestamp('2011')], | ||
'C': [pd.Timedelta('1 days'), | ||
pd.Timedelta('2 days')]}) | ||
>>> df.quantile(0.5, numeric_only=False) | ||
A 1.5 | ||
B 2010-07-02 12:00:00 | ||
C 1 days 12:00:00 | ||
Name: 0.5, dtype: object | ||
Series | ||
|
||
See Also | ||
-------- | ||
pandas.core.window.Rolling.quantile | ||
|
||
Returns the rolling quantile for the DataFrame. | ||
Review comment: I think the |
||
numpy.percentile | ||
Returns 'nth' percentile for numpy arrays. | ||
|
||
Examples | ||
-------- | ||
>>> import numpy as np | ||
>>> d = {'animal':['Cheetah','Falcon','Eagle','Goose','Pigeon'], | ||
Review comment: I'd have the name of the animals as the index instead. |
||
... 'class':['mammal','bird','bird','bird','bird'], | ||
... 'max_speed':[120,np.nan,320,142,150]} | ||
Review comment: please respect pep8 and have the right indentation. It'd be better if you can avoid having a separate variable The |
||
>>> df = pd.DataFrame(data=d) | ||
|
||
>>> df | ||
animal class max_speed | ||
0 Cheetah mammal 120.0 | ||
1 Falcon bird NaN | ||
2 Eagle bird 320.0 | ||
3 Goose bird 142.0 | ||
4 Pigeon bird 150.0 | ||
|
||
The `max_speed` in sorted order:- | ||
Review comment: any reason for the |
||
|
||
>>> df['max_speed'].sort_values(ascending=False) | ||
2 320.0 | ||
4 150.0 | ||
3 142.0 | ||
0 120.0 | ||
1 NaN | ||
Name: max_speed, dtype: float64 | ||
|
||
>>> df.quantile() | ||
max_speed 146.0 | ||
Name: 0.5, dtype: float64 | ||
|
||
The above was calculated by interpolating between available values. | ||
|
||
>>> df.quantile(q=[0.05,0.95]) | ||
max_speed | ||
0.05 123.3 | ||
0.95 294.5 | ||
>>> df.quantile(q=[0.05,0.95],interpolation="higher") | ||
max_speed | ||
0.05 142.0 | ||
0.95 320.0 | ||
>>> df.quantile(q=[0.05,0.95],interpolation="lower") | ||
max_speed | ||
0.05 120.0 | ||
0.95 150.0 | ||
""" | ||
self._check_percentile(q) | ||
|
||
|
Review comment: `array-like` is used for an object that implements the `numpy.array` API. I think just `list` would be better here.
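
For reference, the review suggestions on the docstring example (animal names as the index, no separate `d` variable, PEP 8 spacing) could be sketched roughly as below. This is only an illustration and not the text that was merged. It keeps just the numeric `max_speed` column and calls `Series.quantile`, which is an assumption made here so the snippet does not depend on the `numeric_only` default of `DataFrame.quantile`.

```python
import numpy as np
import pandas as pd

# Animal names go in the index, and the frame is built inline
# instead of through a separate dict variable.
df = pd.DataFrame(
    {"max_speed": [120, np.nan, 320, 142, 150]},
    index=["Cheetah", "Falcon", "Eagle", "Goose", "Pigeon"],
)

# NaN is skipped, so the quantiles are taken over 120, 142, 150, 320.
speeds = df["max_speed"]

# Default q=0.5 with linear interpolation: halfway between 142 and 150.
median = speeds.quantile()

# A list of quantiles returns a Series indexed by q.
linear = speeds.quantile([0.05, 0.95])
higher = speeds.quantile([0.05, 0.95], interpolation="higher")
lower = speeds.quantile([0.05, 0.95], interpolation="lower")

print(median)  # 146.0
print(linear)  # approximately 123.3 and 294.5
print(higher)  # 142.0 and 320.0
print(lower)   # 120.0 and 150.0
```

With four valid values, `q=0.05` falls between 120 and 142, and `q=0.95` falls between 150 and 320. That is why `'lower'` and `'higher'` return the neighbouring data points while `'linear'` returns a value in between.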