-
-
Notifications
You must be signed in to change notification settings - Fork 18.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CLN: Replace Appender and Substitution with simpler doc decorator #31060
Changes from 14 commits
5911aa3
a04723f
b96a68f
7755c3f
4c5cd28
dd3f451
c57c51f
6ba6f9f
1cc0815
95668db
81b0888
88c6bf2
2103d61
f259bca
6a7b36c
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -10,7 +10,7 @@ | |
|
||
from pandas._libs import Timestamp, algos, hashtable as htable, lib | ||
from pandas._libs.tslib import iNaT | ||
from pandas.util._decorators import Appender, Substitution | ||
from pandas.util._decorators import doc | ||
|
||
from pandas.core.dtypes.cast import ( | ||
construct_1d_object_array_from_listlike, | ||
|
@@ -480,9 +480,32 @@ def _factorize_array( | |
return codes, uniques | ||
|
||
|
||
_shared_docs[ | ||
"factorize" | ||
] = """ | ||
@doc( | ||
values=dedent( | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Would it be possible to get rid of the need for those I suppose one option is writing those dedented. Something like
but we might consider this ugly? (and does Black allow it?) And in another this dedent could be handled by the decorator? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. +1 on adding indentation "magic" to the decorator. Not a big fan of automatic things, but I think making the docstring injection as simple as possible, so the important code is more readable, makes it worth. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think it is a good idea to add this "magic" to decorator so we can keep doc simple. I am considering change from wrapper.__doc__ = wrapper._docstr_template.format(**kwargs) # type: ignore to add wrapper.__doc__ = wrapper._docstr_template.format(**{k: dedent(v) for k, v in kwargs.items()}) # type: ignore Any thoughts? Comments? Please. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That sounds good. I am only wondering if there are some cases where the to-be-inserted string starts with a space that we want to keep (to let it fit in the docstring) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There are some cases where we don't do a There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Good point. The changes we made here should not change the actual docstring. Since there are mixed cases of usage, it seems hard for me to get a good solution for all of them. At least nothing comes out in my mind now. I like the "magic", but for now, maybe it is better to keep it without a How about we just keep this unchanged now, and we can come back later when we have converted most decorator usages to There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sounds good, I also think it's better to start simple, see how things work, and add more features later. I think adding the automatic indentation will be worth, since it'll make things much simpler when documenting, and it's not so complex to implement in the decorator. But let's forget about it for now, this is already a nice improvement. Thanks! |
||
"""\ | ||
values : sequence | ||
A 1-D sequence. Sequences that aren't pandas objects are | ||
coerced to ndarrays before factorization. | ||
""" | ||
), | ||
sort=dedent( | ||
"""\ | ||
sort : bool, default False | ||
Sort `uniques` and shuffle `codes` to maintain the | ||
relationship. | ||
""" | ||
), | ||
size_hint=dedent( | ||
"""\ | ||
size_hint : int, optional | ||
Hint to the hashtable sizer. | ||
""" | ||
), | ||
) | ||
def factorize( | ||
values, sort: bool = False, na_sentinel: int = -1, size_hint: Optional[int] = None | ||
) -> Tuple[np.ndarray, Union[np.ndarray, ABCIndex]]: | ||
""" | ||
Encode the object as an enumerated type or categorical variable. | ||
|
||
This method is useful for obtaining a numeric representation of an | ||
|
@@ -492,10 +515,10 @@ def _factorize_array( | |
|
||
Parameters | ||
---------- | ||
%(values)s%(sort)s | ||
{values}{sort} | ||
na_sentinel : int, default -1 | ||
Value to mark "not found". | ||
%(size_hint)s\ | ||
{size_hint}\ | ||
|
||
Returns | ||
------- | ||
|
@@ -573,34 +596,6 @@ def _factorize_array( | |
>>> uniques | ||
Index(['a', 'c'], dtype='object') | ||
""" | ||
|
||
|
||
@Substitution( | ||
values=dedent( | ||
"""\ | ||
values : sequence | ||
A 1-D sequence. Sequences that aren't pandas objects are | ||
coerced to ndarrays before factorization. | ||
""" | ||
), | ||
sort=dedent( | ||
"""\ | ||
sort : bool, default False | ||
Sort `uniques` and shuffle `codes` to maintain the | ||
relationship. | ||
""" | ||
), | ||
size_hint=dedent( | ||
"""\ | ||
size_hint : int, optional | ||
Hint to the hashtable sizer. | ||
""" | ||
), | ||
) | ||
@Appender(_shared_docs["factorize"]) | ||
def factorize( | ||
values, sort: bool = False, na_sentinel: int = -1, size_hint: Optional[int] = None | ||
) -> Tuple[np.ndarray, Union[np.ndarray, ABCIndex]]: | ||
# Implementation notes: This method is responsible for 3 things | ||
# 1.) coercing data to array-like (ndarray, Index, extension array) | ||
# 2.) factorizing codes and uniques | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this decorated here for a particular reason?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to replace the variables, otherwise the values in parenthesis would be in the doc (and
doc
also keeps the original template)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. This decorated to force a string format function to be applied.
I also think that this empty decorator may be a little bit miss-leading. One other possibility could be
@doc(klass="klass", others="others")
. I am not sure which one (use empty string or the arguments' name) is better here. Open and welcome for any suggestions and advice.