DOC: Fix to_latex docstring. #22516
Codecov Report
@@ Coverage Diff @@
## master #22516 +/- ##
==========================================
- Coverage 92.04% 92.04% -0.01%
==========================================
Files 169 169
Lines 50787 50784 -3
==========================================
- Hits 46746 46743 -3
Misses 4041 4041
Great contribution, the docstring really needed some care. Added just some minor comments mainly about the parameter types.
pandas/core/generic.py
Outdated
    Parameters
    ----------
    buf : StringIO-like, optional
        Buffer to write to.
More than StringIO-like, I'd say this is a file descriptor. I guess, as in other methods, if None, the output is returned as a string. I think in these cases we usually say file descriptor or None, instead of optional, but in either case we want to explain that if None it returns the string.
pandas/core/generic.py
Outdated
    ----------
    buf : StringIO-like, optional
        Buffer to write to.
    columns : sequence, optional, default None
No need for default None when it's optional. I think we agreed on using label when talking about the objects in the indices, so it could make sense to have the type as list of label.
pandas/core/generic.py
Outdated
        Write row names (index).
    na_rep : str, default 'NaN'
        Missing data representation.
    formatters : list or dict of one-param. functions, optional
The type is a bit confusing. I don't know what is expected: would it be list of function or dict of {str: function}?
pandas/core/generic.py
Outdated
        Formatter functions to apply to columns' elements by position or
        name. The result of each function must be a unicode string.
        List must be of length equal to the number of columns.
    float_format : str, default None
optional instead of default None would make more sense to me in this case. It's not always clear, but in general we use default None when the None value is used as None. When None means the feature is not used, it's optional.
pandas/core/generic.py
Outdated
        Format string for floating point numbers.
    sparsify : bool, optional, default None
        Set to False for a DataFrame with a hierarchical index to print
        every multiindex key at each row.
Do you know if None and False are the same here? If that's the case, I'd prefer to get rid of optional and explain in the description.
They are not the same; in formats.py:
    if sparsify is None:
        sparsify = get_option("display.multi_sparse")
pandas/core/generic.py
Outdated
    sparsify : bool, optional, default None
        Set to False for a DataFrame with a hierarchical index to print
        every multiindex key at each row.
    index_names : bool, optional, default True
unnecessary optional
pandas/core/generic.py
Outdated
        Set to False for a DataFrame with a hierarchical index to print
        every multiindex key at each row.
    index_names : bool, optional, default True
        Prints the names of the indexes.
    bold_rows : boolean, default False
bool instead of boolean
pandas/core/generic.py
Outdated
        When set to None, the value will default from the pandas config
        module. Use a longtable environment instead of tabular. Requires
        adding a \usepackage{longtable} to your LaTeX preamble.
    escape : bool, default will be read from the pandas config module
Can you leave here simply default None, and explain about the config in the description?
pandas/core/generic.py
Outdated
    See Also
    --------
    DataFrame.to_csv : Write a DataFrame to CSV format.
    DataFrame.to_excel : Write a DataFrame to an Excel file.
I'd have to_string and to_html in this case, which to me are conceptually more similar than to_csv or to_excel (formatting to present vs formatting to export).
pandas/core/generic.py
Outdated
        Default: True.
        <https://en.wikibooks.org/wiki/LaTeX/Tables>`__ e.g. 'rcl' for 3
        columns.
    longtable : bool, default None
Should it be bool, optional, default None, like on line 2558?
In this case it should be bool, optional. We never use whatever_type, optional, default None. We use only one of them.
When the None default value is itself being used, we use default None. Imagine for example .fillna(value=None), where the None is the value used to impute.
When the None is just a flag, then we use optional. For example, in this case longtable won't get the value None itself, but a value from the config. This means that it's optional to provide a longtable value, as we can use the one from the config.
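A sketch of how the two cases described in this comment might look side by side in numpydoc style. Both functions and the `_CONFIG` dict are hypothetical, written only to illustrate the convention; they are not taken from pandas.

```python
# Hypothetical config used only for this illustration.
_CONFIG = {"longtable": False}


def impute(values, value=None):
    """
    Replace entries equal to the string "NA".

    Parameters
    ----------
    values : list
        Items to scan.
    value : object, default None
        Replacement for missing entries. None itself is the value that
        gets used, so the type line says ``default None``.
    """
    return [value if v == "NA" else v for v in values]


def table_environment(longtable=None):
    """
    Pick the name of the LaTeX table environment.

    Parameters
    ----------
    longtable : bool, optional
        If not provided, the value is read from the config. None is
        only a flag here, so the type line says ``optional``.
    """
    if longtable is None:
        longtable = _CONFIG["longtable"]
    return "longtable" if longtable else "tabular"
```

In `impute` the None ends up in the result, while in `table_environment` it is replaced by a configured value before anything else happens, which is the distinction the type lines are meant to signal.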
pandas/core/generic.py
Outdated
        When set to None, the value will default from the pandas config
        module. Use a longtable environment instead of tabular. Requires
        adding a \usepackage{longtable} to your LaTeX preamble.
    escape : bool, default None
Same question as above.
… add doc for column_format defaults
Hello @Moisan! Thanks for updating the PR.
Can you run ./scripts/validate_docstrings.py pandas.DataFrame.to_latex?
I think it will complain that the first line should be a single line (this is important for the page with the list of methods)
When the script says everything is all right, I'm happy with it. Really nice change, much better docstring now.
That's an error in the script then. Can you post the output of the script, so we can see how the docstring is being rendered and whether it helps to find what's wrong with the script? After that, please fix the docstring so that the first line (short summary) fits in a single line, as described here: https://pandas.pydata.org/pandas-docs/stable/contributing_docstring.html
Here is the output. I can open an issue regarding the bug of |
lgtm, thanks for the contribution @Moisan
If you can open the issue for the validation, that would be great.
thanks @Moisan
git diff upstream/master -u -- "*.py" | flake8 --diff
Fix the DataFrame.to_latex docstring to match scripts/validate_docstrings.py as explained in #22459 and add an example.
The docstring was previously in a variable that was only used in to_latex. I put it in the method docstring instead. The @Substitution wasn't matching anything; I suspect this dates back to the common docstring in io/formats/format.py.