-
-
Notifications
You must be signed in to change notification settings - Fork 18.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ENH: Harmonize drop and rename API #12392
Comments
actually this is a bigger API issue that @TomAugspurger and I briefly touched on here
So thoughts on how to proceed here. I'd rather not make add hoc changes, rather try to construct an overall consistent way of doing things; we can certainly provide back-compat, but unifying things is probably a good thing. |
drop
and rename
API
I don't have a strong preference for one style over the other. The only upshot of the I would slightly favor just recommending and documenting the |
do you think we should add corresponding |
Personally, I have a preference for Just documenting the I'm agnostic on adding If we change the |
why don't you list all of the relevant methods (might be some more that I am forgetting), and make a proposal. |
OK |
Others:
Open question:
|
see that's the problem. In reality we should leave everything alone and maybe just change
though we are actually flexible enough to accept both paradigms. |
Oh, I don't really care about the "two things at once" -- I just liked the "columns" argument for being more meaningful. So your preference is:
That's fine by me -- like I said, I'm mostly interested in harmonization! |
+1 And, I know people have gone back & forth on this a bit - but I would also 'vote' for:
|
I agree that changing 2 things at once is not a great API, but I agree with @nickeubank that explicit |
I would like to avoid adding new methods as Further, I think we should make a clear distinction between methods that modify the axis (rename, drop, reindex), and methods that perform operation over a certain axis ( I personally also like the explicit It is not really good API design, but I think it is perfectly possible to combine both idioms in one method for all of the discussed functions as kind of a compromise?
Which would be I think backwards compatible? |
@MaximilianR Maybe open a separate issue to discuss that? What kind of idiom to use in the signature maybe depends on this, but the question of adding such a method is separate discussion I think. |
I think that @jorisvandenbossche's suggestion works perfectly. The real brilliance is that it even works for someone who used positional arguments for rename (i.e. typed |
I take that back – if somebody uses more than one positional argument ( On further reflection, I think we only have two choices: break the API, or tack the new arguments on to the end of the argument list so anyone who uses positional arguments is OK. |
I think even that should be possible to detect and warn. If the user did originally |
@jorisvandenbossche My impression was that "backwards compatibility" / "not breaking the API" means that old code still runs fine -- an informative error beats a silent failure, but seems like that's still API-breaking. An overview of where I think we stand: 1. Do nothing 2. Backwards Compatible
Pros: Cons: 3. Break-API - All options available
Pros: Cons: 4. Break-API -- adopt
Pros: Cons: 5. Break-API -- adopt
Cons: My take: I think we should shoot for either 2 (to ensure backwards compatibility) or 4. 2 because I think api breaking for these kind of core functions is bad, and 4 because I'm increasingly won over by @jreback's argument -- while I prefer |
Nice overview!
@nickeubank An informative message does not necessarily need to be an error! It can also be a warning (or we can even decide to just pass it through correctly without warning, although I wouldn't do that). So I am still convinced this can be done in a backwards compatible way (and your options 2 and 3 can be combined).
I don't think this is really a con, as using it with only positional arguments is never a sane thing to do regarding clarity of your code :-) Further, I think there is 6th option: use separate methods for the two idioms (like So I think we have to choose between: a) combine both idioms within the same methods and live with the bad API design (in a back compat or incompat way -> your options 2 and 3) I would personally be in favor of a) |
@jorvisvandenbossche good call about positional argument differences not being a big deal. I think that makes my 2 (backwards compatible with both sets or arguments) my preference. |
@nickeubank can you survey all the methods and see which use each idiom? kind of like a value_counts, most important is prob number per class of idiom. (e.g. make several categories and measure how many methods of each type of idiom we have for both). Just to get an overview of the entire API. |
@jreback Sure, but will need some time -- busy week! |
@nickeubank np. this issue would be for 0.19.0 in any event. |
A DataFrame has ~200 methods. Those that take
Also note that
In light of that, I would vote for leaving |
Revisiting this, seems like we came to a consensus on two things then got stuck. Consensus:
No Consensus:Seems we have three options: Option 1: Add
Cons:
Option 2: Put Pros:
Cons:
Option 3: Replace
Cons:
Personally, I like 1 or 2 (though my indifference between the two is partially motivated by the fact I always name my arguments so they're equivalent for me ;)) |
I think having two methods doing the same thing is confusing (less so if the documentation of each just clarified the difference from the other, but still I don't think both are worth keeping).
Done: #16990 . Clearly this discussion on the signature also applies to that bug, assuming my proposal (of adding |
@jorisvandenbossche any possibility of getting this in? obviously aside from #17644 which is merged |
What's left to do here? The same changes to If so, I can put together a PR this afternoon. |
yep i think so; that’s a bit more involved though |
Yes, I was just going to post that :) I may have found a (somewhat) hacky solution. Will have the start of a PR in a bit. The difficulty is disambiguating >>> df.rename(fn, axis=1) # OK
>>> df.rename(index=fn, axis=1) # TypeError But I may have a way. |
How much to want to do the other side of this though? As I'm writing the release notes for adding I'm comfortable with recommending |
I think that @nickeubank 's comment provides strong evidence in favor of Keeping both approaches where |
Yes, re-reading that comment does make a good case for it. OK then, I'll put up my WIP for rename, and finish it up later tonight. |
(By the way: something else good, and very pythonic, about |
I disagree with that comment (#12392 (comment)): it is correct that the But anyhow, that's not really that relevant anymore :-) As it is good to make them consistent anyway, which means adding @TomAugspurger Thanks for picking this up! Will look at the PR now. |
* API: Added axis argument to rename xref: #12392 * API: Accept 'axis' keyword argument for reindex
Were |
* API: Added axis argument to rename xref: pandas-dev#12392 * API: Accept 'axis' keyword argument for reindex
* API: Added axis argument to rename xref: pandas-dev#12392 * API: Accept 'axis' keyword argument for reindex
* API: Added axis argument to rename xref: pandas-dev#12392 * API: Accept 'axis' keyword argument for reindex
rename
accepts acolumns
argument or anindex
argument, whiledrop
looks for alabels
andaxis
pair. I don't know about anyone else, but I have to check the help file every time I come back to pandas to remember which takes which.How would people feel about adding
columns
andindex
arguments todrop
? They could just be added in addition tolabels
/axis
if we want to provide backwards compatibility and just raise an exception if the user tries to mix them.The text was updated successfully, but these errors were encountered: