DOC: update the pd.DataFrame.memory_usage/empty docstring(Seoul) #20102

ohahohah · 2018-03-10T08:54:45Z

Checklist for the pandas documentation sprint (ignore this if you are doing
an unrelated PR):

PR title is "DOC: update the docstring"
The validation script passes: scripts/validate_docstrings.py <your-function-or-method>
The PEP8 style check passes: git diff upstream/master -u -- "*.py" | flake8 --diff
The html version looks good: python doc/make.py --single <your-function-or-method>
It has been proofread on language by another sprint participant

Please include the output of the validation script below between the "```" ticks:


################################################################################
###################### Docstring (pandas.DataFrame.empty) ######################
################################################################################

True if DataFrame is empty.

True if DataFrame is entirely empty [no items], meaning any of the
axes are of length 0.

Returns
-------
empty : boolean
    if DataFrame is empty, return true, if not return false.

Notes
-----
If DataFrame contains only NaNs, it is still not considered empty. See
the example below.

Examples
--------
An example of an actual empty DataFrame. Notice the index is empty:

>>> df_empty = pd.DataFrame({'A' : []})
>>> df_empty
Empty DataFrame
Columns: [A]
Index: []
>>> df_empty.empty
True

If we only have NaNs in our DataFrame, it is not considered empty! We
will need to drop the NaNs to make the DataFrame empty:

>>> df = pd.DataFrame({'A' : [np.nan]})
>>> df
    A
0 NaN
>>> df.empty
False
>>> df.dropna().empty
True

See also
--------
pandas.Series.dropna
pandas.DataFrame.dropna

################################################################################
################################## Validation ##################################
################################################################################

Errors found:
	Missing description for See Also "pandas.Series.dropna" reference
	Missing description for See Also "pandas.DataFrame.dropna" reference


################################################################################
################## Docstring (pandas.DataFrame.memory_usage)  ##################
################################################################################

Memory usage of DataFrame columns.

Memory usage of DataFrame is accessing pandas.DataFrame.info method.
A configuration option, `display.memory_usage` (see Parameters)

Parameters
----------
index : bool
    Specifies whether to include memory usage of DataFrame's
    index in returned Series. If `index=True` (default is False)
    the first index of the Series is `Index`.
deep : bool
    Introspect the data deeply, interrogate
    `object` dtypes for system-level memory consumption.

Returns
-------
sizes : Series
    A series with column names as index and memory usage of
    columns with units of bytes.

Notes
-----
Memory usage does not include memory consumed by elements that
are not components of the array if deep=False

See Also
--------
numpy.ndarray.nbytes

Examples
--------
>>> dtypes = ['int64', 'float64', 'complex128', 'object', 'bool']
>>> data = dict([(t, np.random.randint(100, size=5000).astype(t))
...              for t in dtypes])
>>> df = pd.DataFrame(data)
>>> df.memory_usage()
Index            80
int64         40000
float64       40000
complex128    80000
object        40000
bool           5000
dtype: int64
>>> df.memory_usage(index=False)
int64         40000
float64       40000
complex128    80000
object        40000
bool           5000
dtype: int64
>>> df.memory_usage(index=True)
Index            80
int64         40000
float64       40000
complex128    80000
object        40000
bool           5000
dtype: int64
>>> df.memory_usage(index=True).sum()
205080

################################################################################
################################## Validation ##################################
################################################################################

Errors found:
	Missing description for See Also "numpy.ndarray.nbytes" reference

If the validation script still gives errors, but you think there is a good reason
to deviate in this case (and there are certainly such cases), please state this
explicitly.

Lastly, I left the errors that were already present in the previous version unchanged.
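For readers unfamiliar with the remaining validation errors: the script expects each See Also entry to carry a short description after the reference name. The sketch below shows that shape on a hypothetical helper, `total_bytes`, which is not part of pandas and exists only for illustration. The description wording is an assumption, not the text that was eventually merged.

```python
import numpy as np


def total_bytes(arr):
    """
    Return the number of bytes used by the elements of an array.

    Hypothetical helper, used only to illustrate the See Also layout.

    See Also
    --------
    numpy.ndarray.nbytes : Total bytes consumed by the elements of the array.
    """
    return arr.nbytes


# Eight int64 values occupy 8 bytes each.
print(total_bytes(np.zeros(8, dtype="int64")))  # 64
```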

jorisvandenbossche · 2018-03-10T09:46:03Z

There is a related PR on Series.memory_usage: #20086
It might be interesting to look at, to make sure to use similar explanations of the keywords.

rth

Thanks for this PR!

A few comments are below.

Also, please fix the "default is False" note for `index` in the docstring: the actual default is True.
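A minimal sketch of the behaviour behind this comment, using a small made-up frame: with no arguments, `memory_usage` reports the index as a first entry labelled `Index`, which is what `index=True` being the default means.

```python
import pandas as pd

# Small illustrative frame; the column names are arbitrary.
df = pd.DataFrame({"a": [1, 2, 3], "b": [1.0, 2.0, 3.0]})

with_index = df.memory_usage()                 # index=True is the default
without_index = df.memory_usage(index=False)   # columns only

print(list(with_index.index))     # ['Index', 'a', 'b']
print(list(without_index.index))  # ['a', 'b']
```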

rth · 2018-03-10T09:49:20Z

pandas/core/frame.py

+        object        40000
+        bool           5000
+        dtype: int64
+        >>> df.memory_usage(index=False)


I'm not sure the last two examples (with index=False and index=True) add anything. The first example alone might be enough.

rth · 2018-03-10T09:49:59Z

pandas/core/generic.py

@@ -1436,12 +1436,20 @@ def __contains__(self, key):

    @property
    def empty(self):
-        """True if NDFrame is entirely empty [no items], meaning any of the
+        """
+        True if DataFrame is empty.


I think it should be """True [...] (summary on the same line as the opening quotes, no empty line).

Edit: never mind, the official docstring example doesn't seem to do that.

rth · 2018-03-10T09:50:13Z

pandas/core/generic.py

+        Returns
+        -------
+        empty : boolean
+            if DataFrame is empty, return true, if not return false.


Capitalize these: True, False.
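For context on the property under review, here is a short sketch of what "any of the axes are of length 0" means in practice. The frames are made up for illustration. The property returns a plain Python bool, which is why the capitalized spelling applies.

```python
import pandas as pd

no_rows = pd.DataFrame({"A": []})              # one column, zero rows
no_columns = pd.DataFrame(index=[0, 1, 2])     # three rows, zero columns
only_nan = pd.DataFrame({"A": [float("nan")]}) # one row holding a NaN

print(no_rows.empty)            # True
print(no_columns.empty)         # True
print(only_nan.empty)           # False
print(only_nan.dropna().empty)  # True
```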

jreback · 2018-03-10T13:20:35Z

coordinate text with #20086

jreback · 2018-03-10T13:21:06Z

pandas/core/frame.py

+
+        Examples
+        --------
+        >>> dtypes = ['int64', 'float64', 'complex128', 'object', 'bool']


add a categorical type here as well
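A rough sketch of why a categorical column is worth showing next to `object`. The data here is invented, and the exact byte counts depend on the platform and pandas version, so none are printed as expected values. The sketch only relies on two things: `deep=True` adds the size of the Python string objects to an `object` column, and a categorical with few distinct values stores compact integer codes instead.

```python
import numpy as np
import pandas as pd

# 3000 labels drawn from only three distinct strings (illustrative data).
labels = np.array(["red", "green", "blue"] * 1000, dtype=object)

df = pd.DataFrame({
    "object": pd.Series(labels, dtype=object),
    "category": pd.Categorical(labels),
})

shallow = df.memory_usage(index=False)          # array buffers only
deep = df.memory_usage(index=False, deep=True)  # also counts the string objects

print(shallow)
print(deep)
```

With repeated labels like these, the `category` column should come out well below the `object` column once `deep=True` is used.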

jreback · 2018-03-10T13:21:45Z

pandas/core/frame.py

@@ -1969,6 +1973,38 @@ def memory_usage(self, index=True, deep=False):
        See Also
        --------
        numpy.ndarray.nbytes


add Series.memory_usage
Series.nbytes
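A small sketch of how the suggested references relate to each other, assuming a plain `int64` column and the default `deep=False`: the per-column figure from `DataFrame.memory_usage` matches `Series.memory_usage(index=False)`, `Series.nbytes`, and the `nbytes` of the underlying NumPy array.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"x": np.arange(1000, dtype="int64")})
col = df["x"]

frame_bytes = df.memory_usage(index=False)["x"]
series_bytes = col.memory_usage(index=False)

# 1000 int64 values at 8 bytes each.
print(frame_bytes, series_bytes, col.nbytes, col.to_numpy().nbytes)
```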

* Consistent with Series.memory_usage * Added Categorical notes [ci skip]

TomAugspurger

Merging later today.

codecov · 2018-03-15T19:57:09Z

Codecov Report

Merging #20102 into master will decrease coverage by 0.02%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master   #20102      +/-   ##
==========================================
- Coverage   91.72%    91.7%   -0.03%     
==========================================
  Files         150      150              
  Lines       49149    49149              
==========================================
- Hits        45083    45071      -12     
- Misses       4066     4078      +12

Flag	Coverage Δ
#multiple	`90.08% <ø> (-0.03%)`	⬇️
#single	`41.85% <ø> (ø)`	⬆️

Impacted Files	Coverage Δ
pandas/core/generic.py	`95.84% <ø> (ø)`	⬆️
pandas/core/frame.py	`97.18% <ø> (ø)`	⬆️
pandas/plotting/_converter.py	`65.07% <0%> (-1.74%)`	⬇️
pandas/core/indexes/base.py	`96.66% <0%> (ø)`	⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 52cffa3...d4cc71d.

jorisvandenbossche · 2018-03-15T20:24:58Z

pandas/core/frame.py

+        The memory usage can optionally include the contribution of
+        the index and elements of `object` dtype.
+
+        A configuration option, `display.memory_usage` (see Parameters)


Something seems to be missing in this sentence.
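For context on the unfinished sentence: `display.memory_usage` is the option that `DataFrame.info` consults when its own `memory_usage` argument is left unset. The sketch below uses a made-up frame and a hypothetical helper, `info_text`, to show the option switching the memory line of `info` on and off.

```python
import io

import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3]})


def info_text(frame):
    """Capture the text that DataFrame.info would print (illustrative helper)."""
    buf = io.StringIO()
    frame.info(buf=buf)
    return buf.getvalue()


# The option is on by default, so info() ends with a "memory usage" line.
default_text = info_text(df)

# Turning the option off for this block suppresses that line.
with pd.option_context("display.memory_usage", False):
    hidden_text = info_text(df)

print("memory usage" in default_text)  # True
print("memory usage" in hidden_text)   # False
```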

[ci skip]

jorisvandenbossche · 2018-03-15T21:58:49Z

@ohahohah Thanks for the PR!

DOC: Improved the docstring of pd.DataFrame.memory_usage/empty

3dff081

rth reviewed Mar 10, 2018

jreback added the Docs label Mar 10, 2018

jreback requested changes Mar 10, 2018

Updates [ci skip]

fc5b498

* Consistent with Series.memory_usage * Added Categorical notes [ci skip]

TomAugspurger approved these changes Mar 15, 2018

jorisvandenbossche added 3 commits March 15, 2018 21:21

fix wrong default

b033dc6

Update generic.py

bb7f341

Update generic.py

1585a0e

jorisvandenbossche reviewed Mar 15, 2018

info [ci skip]

d4cc71d

[ci skip]

jorisvandenbossche approved these changes Mar 15, 2018

jorisvandenbossche merged commit bf9e4f3 into pandas-dev:master Mar 15, 2018
