Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DOC/CLN: Index shared docs with appending actual docstrings #20022

Conversation

jorisvandenbossche
Copy link
Member

xref #19932 and #20016

This removed the _index_shared_docs dict for the Index class.
There were 4 methods that actually substituted values: take, get_indexer, get_indexer_non_unique, fillna

@jorisvandenbossche
Copy link
Member Author

You can see in the second commit what actually needed to be changed in the Index docstrings.

For take and fillna: I don't think it was that much of a added value that it said the specific Index subclass (in many other docstrings we just say 'Index')
For get_indexer and get_indexer_non_unique the substituted values actually contained more information (eg target_klass was PeriodIndex or list of Periods). However, this was not always correct (eg for CategoricalIndex it does not only accept CategoricalIndex, but aslo object Index/Series/array/list) or consistent (eg for PeriodIndex we do not accept an object Index of Periods (but array or list is fine), while for CategoricalIndex we do).

If we want, I can keep a template for get_indexer / get_indexer_non_unique. But personally I don't think this is worth it (and I would rather try to make it more consistent)

@jreback
Copy link
Contributor

jreback commented Mar 7, 2018

Should clarify when should be using _shared_docs, I agree in Index subclasses we should prob try not to use it as much as its cumbersome. Though it is nice to have the correct return class rather than Index as the generic return value. Its more informative.

The problem here is that we generally do want to use shared kwargs. so its confusing that in some cases we don't and other cases we do. Should have a clear cut case here in order to NOT use them. The reason is in the future folks would just not use them, defeating the purpose of having good doc-strings. We should have a single well-defined way of doing this. Sure in some simple cases it is not needed, but then you have to decide when that is.

@jorisvandenbossche
Copy link
Member Author

The problem here is that we generally do want to use shared kwargs. so its confusing that in some cases we don't and other cases we do.

With this PR it can be quite clear: for Index subclasses we don't use template substitution, for Series/Index (from IndexOpsMixin) or Series/DataFrame (from NDFrame) we do use them (for some of the methods).

Of course the question is: do we want to use template substitution for Index in the future as well? Because indeed, this PR prevents that.
(note that special methods from Index subclasses still have their own docstring if they are defined in the subclass, so it is only for the generic methods that the docstrings would be Index-subclass agnostic)

@jreback jreback added the Docs label Mar 7, 2018
@jreback
Copy link
Contributor

jreback commented Mar 7, 2018

With this PR it can be quite clear: for Index subclasses we don't use template substitution, for Series/Index (from IndexOpsMixin) or Series/DataFrame (from NDFrame) we do use them (for some of the methods).

see that's the thing. Why would we do this? it just adds to the confusion. Index/Series are very very similar, so to do it in Series but not Index raises questions of why? and then there is the special case of IndexOpsMixin.

@jorisvandenbossche
Copy link
Member Author

Why would we do this? it just adds to the confusion.

There is not much difference in practice:

@Subtitution(..)
@Appender(NDFrame.method.__doc__)

vs

@Appender(Index.method.__doc__)

Anyhow, even for NDFrame, is there is actually no substitution, we would only do Appender without Substitution, so you would get such a mixture in frame.py/series.py anyway (as you have this mixture today as well for cases that use % and other not).

@jorisvandenbossche
Copy link
Member Author

IMO this makes it more explicit. Now often there is % _shared_docs_kwargs without that there is actually something to replace. For me, this creates confusion. So I would personally only do % _shared_docs_kwargs / @Substitution(_shared_docs_kwargs) when there is actually something to replace in the docstring.

@jreback
Copy link
Contributor

jreback commented Mar 7, 2018

@jorisvandenbossche the problem is a new contributor will have no which one to use. we need a clear unambiguous policy. it may be obvious. e.g. if the method is only defined in base.py then don't use a template / shared, otherwise do.

def take(self, indices, axis=0, allow_fill=True,
fill_value=None, **kwargs):
"""
return a new Index of the values selected by the indices
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thoughts on phrasing this as "Return a new index of the same type ..."? I suppose "of the values" implies that you're getting the same type of index back though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

generally we need language to assert that we are getting back the same type of Index after the op (which is almost always true), though there are exceptions. This is technically easier to have the doc-strings render like this, e.g. PeriodIndex is pretty explicty as a return type for example.

@jreback
Copy link
Contributor

jreback commented Nov 1, 2018

closing as stale. if you'd like to continue pls ping.

@jreback jreback closed this Nov 1, 2018
@jorisvandenbossche jorisvandenbossche deleted the cleanup-shared-docs-index branch November 1, 2018 08:09
@pep8speaks
Copy link

Hello @jorisvandenbossche! Thanks for updating the PR.

@jorisvandenbossche jorisvandenbossche restored the cleanup-shared-docs-index branch November 1, 2018 08:11
@jorisvandenbossche
Copy link
Member Author

I will look into rebasing this next week

@datapythonista
Copy link
Member

@jorisvandenbossche do you still want to rebase this?

@jreback
Copy link
Contributor

jreback commented Dec 14, 2018

closing as stale, but reopen if can continue

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants