Check API changes in whatsnew #34801

TomAugspurger · 2020-06-15T14:39:15Z

Our whatsnew has several sections detailing API changes:

https://pandas.pydata.org/pandas-docs/dev/whatsnew/v1.1.0.html#other-api-changes
https://pandas.pydata.org/pandas-docs/dev/whatsnew/v1.1.0.html#backwards-incompatible-api-changes

According to our version policy, we shouldn't have any API-breaking changes until 2.0.

Many of these, like adding DataFrame.value_counts, are fine and should just be moved to a different section.

Others like #31905 will need to be looked at closely, and we'll need to determine if they're API breaking or bug fixes.


TomAugspurger · 2020-06-15T14:45:25Z

I think a section like "Bug Fixes with potential API implications" might make sense. e.g. https://pandas.pydata.org/pandas-docs/dev/whatsnew/v1.1.0.html#consistency-across-groupby-reductions might fit there.

jorisvandenbossche · 2020-06-16T07:16:56Z

There are indeed a lot of items in there that are not actually API changes. For the ones in the bullet points, I made an overview here with a first judgment on what I think they are.

In addition to a potential section "Bug Fixes with potential API implications", it might also make sense to have a "changed errors" section for all cases where we made raised errors more consistent (there are quite a few in the list below). That is only relevant if you actually catch errors, so most users could skip such a section entirely.

This only includes the "small ones", not the API changes that have their own subsections.

jorisvandenbossche · 2020-06-16T07:31:58Z

About this one:

read_excel() no longer takes **kwds arguments. This means that passing in keyword chunksize now raises a TypeError (previously raised a NotImplementedError), while passing in keyword encoding now raises a TypeError (GH34464)

People are certainly passing encoding (incorrectly or not) to read_excel (see e.g. https://stackoverflow.com/q/60157255/653364, https://stackoverflow.com/a/60753334/653364). So we might want to raise a warning that encoding is ignored and will raise in the future instead.
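A minimal sketch of that "warn now, raise later" idea. The function `read_sheet` is a hypothetical stand-in and not pandas code; the warning text and the choice of `FutureWarning` are assumptions for illustration only.

```python
import warnings


def read_sheet(path, sheet_name=0, **kwds):
    """Hypothetical reader standing in for a function that used to accept **kwds."""
    if "encoding" in kwds:
        # Keep accepting the keyword for now, but tell the caller it has no effect.
        warnings.warn(
            "the 'encoding' keyword is ignored and will raise a TypeError "
            "in a future version",
            FutureWarning,
            stacklevel=2,
        )
        kwds.pop("encoding")
    if kwds:
        # Any other unexpected keyword fails immediately, as a strict signature would.
        unexpected = ", ".join(sorted(kwds))
        raise TypeError(
            f"read_sheet() got unexpected keyword argument(s): {unexpected}"
        )
    # Placeholder return value; a real reader would parse the file here.
    return {"path": path, "sheet_name": sheet_name}
```

With this shape, existing calls that pass `encoding` keep working and see a warning, while other stray keywords such as `chunksize` fail right away with a `TypeError`.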

TomAugspurger · 2020-06-16T13:12:13Z

it might also make sense to have a section with "changed errors" for all cases where we made raised errors more consistent

Do we have a stance on whether changing the exception raised counts as an API breaking change?

jreback · 2020-06-16T15:12:00Z

I think changed errors is enough, these are api breaking, but not worth deprecating
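To illustrate why a changed exception type matters mostly to code that catches errors, here is a small sketch. `lookup_old` and `lookup_new` are hypothetical functions invented for this example (they are not pandas APIs); they only model "the same failure raises a different exception type after an upgrade".

```python
def lookup_old(mapping, key):
    """Hypothetical 'old' behaviour: a failed lookup raises ValueError."""
    if key not in mapping:
        raise ValueError(f"{key!r} not found")
    return mapping[key]


def lookup_new(mapping, key):
    """Hypothetical 'new' behaviour: the same failure now raises KeyError."""
    if key not in mapping:
        raise KeyError(key)
    return mapping[key]


def safe_lookup(lookup, mapping, key, default=None):
    # Catching both exception types keeps the caller working before and
    # after the change; catching only ValueError would let KeyError escape.
    try:
        return lookup(mapping, key)
    except (KeyError, ValueError):
        return default
```

Code that never wraps the call in `try`/`except` sees a failure either way, which is why such changes could be grouped in a section most users can skip.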


jorisvandenbossche · 2020-07-08T13:48:02Z

An overview of the other items that each have their own subsections:

MultiIndex.get_indexer interprets method argument differently (BUG: Fix reindexing with multi-indexed DataFrames #30766)
- Bug fix?
Failed Label-Based Lookups Always Raise KeyError
- Changed error
Failed Integer Lookups on MultiIndex Raise KeyError
- Changed error
DataFrame.merge() preserves right frame’s row order (BUG: right merge not preserve row order (#27453) #27762)
- Bug fix? (the PR uses BUG)
Assignment to multiple columns of a DataFrame when some columns do not exist (BUG: assignment to multiple columns when some column do not exist #29334)
- Bug fix?
Consistency across groupby reductions
- Still being discussed partly?
apply and applymap on DataFrame evaluates first row/column only once
- ?

alonme · 2020-07-09T08:57:24Z

apply and applymap on DataFrame evaluates first row/column only once

Not sure exactly what to call this.
This behavior was once noted in the documentation,
and then the note was removed, but the behavior was not actually fixed.

TomAugspurger · 2020-07-13T16:04:03Z

apply and applymap on DataFrame evaluates first row/column only once

I think the docs made it clear that that was an implementation detail that could change. Let's call it a bugfix.
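A toy model of why this is observable at all, written in plain Python. It does not reproduce pandas internals: `apply_with_probe` and `apply_once` are hypothetical helpers, and the assumption is simply that the old behaviour made one extra call on the first column before the real loop.

```python
def apply_with_probe(func, columns):
    """Assumed old-style strategy: one extra probing call on the first column."""
    if columns:
        func(columns[0])  # probe call; its result is discarded
    return [func(col) for col in columns]


def apply_once(func, columns):
    """Each column is passed to func exactly once."""
    return [func(col) for col in columns]


calls = []


def record_and_sum(col):
    # Side effect: remember every column this function was called with.
    calls.append(tuple(col))
    return sum(col)
```

Both helpers return the same results, so only functions with side effects (logging, counters, mutation) can tell them apart, which fits treating the change as a bug fix.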

jorisvandenbossche · 2020-07-15T20:16:52Z

To summarize my thoughts here from what I recall from the dev chat about this (but probably more my opinion though).

As proposed above, add a section "Bug Fixes with potential API implications". Here we can put bug fixes that seem important enough to get a before/after example in their own subsection (so some of the ones that are now in the API changes section), instead of just a bullet point like the other bug fixes.
These are typically cases where the previous behaviour worked but was wrong (like a wrong number in the output, not something that raised an error before), and so they are more likely to affect users who unknowingly relied on the wrong output.
Add a section to group all the changed error types. This typically doesn't affect an end user, but rather a library developer who might be catching specific errors. For those library developers these are, technically speaking, API changes, but since they all fix inconsistencies in error messages, they can also be seen as bug fixes (and, as said before, don't necessarily affect end users).
Move some of the enhancements to the enhancement section.

@Dr-Irv you also mentioned something about more clearly putting it in the policy that certain kinds of api changes/bug fixes can still happen. There is already some text about this: https://pandas.pydata.org/docs/dev/development/policies.html (the note), but if you have concrete feedback, certainly welcome.

Dr-Irv · 2020-07-15T20:54:38Z

@jorisvandenbossche That policy only says "API breaking change". What I was talking about is that there are different kinds of "API breaking changes". Here are different kinds, based on what I could think of at the moment. The list is probably longer:

Parameters to a method are different in name, order, or type
Return value of a method is different in type
Method/Class is removed or renamed due to deprecation cycle being reached
Method modifies an object in a different way, so the behavior is different, creating a type of side effect
- Could again be due to a bug fix
Return value of a method is different in value
- Could be because we fixed a bug or changed how a computation is done, so a slightly different result might appear (e.g., due to order of numerical operations)
Documentation was wrong: the behavior was correct, but the documentation described it incorrectly and user code assumed the (incorrectly) documented behavior. From a user perspective, correcting this looks like an "API breaking change".

I think the first three are ones we want to reserve for major releases. The next two are a bit of a grey area. I don't think the last one is an API breaking change.
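As a small illustration of the "different in value due to order of numerical operations" case, this sketch uses only plain Python floats (no pandas) and shows that regrouping an addition changes the last bits of the result.

```python
import math

values = [0.1, 0.2, 0.3]

# The same three numbers, summed with two different groupings.
left_to_right = (values[0] + values[1]) + values[2]
right_to_left = values[0] + (values[1] + values[2])

# Exact comparison fails, tolerance-based comparison succeeds.
same_bits = left_to_right == right_to_left
close_enough = math.isclose(left_to_right, right_to_left)

print(left_to_right, right_to_left, same_bits, close_enough)
```

User code that compares results exactly can break after such a change, while code that compares with a tolerance is unaffected, which is part of why this category is a grey area.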

So if you consider that list (and maybe others), then how you document those changes for each minor (or technical) release for each of those categories needs to be considered.

Hope this helps.

TomAugspurger · 2020-07-17T16:30:02Z

The note on that page starting with "pandas will sometimes make behavior changing" attempts to describe that nuance. Expanding that to have concrete examples would probably be welcome.

We now have a "Notable bug fixes" section, so for my purposes this issue can be closed. I think additional improvements to the whatsnew would be welcome though.

TomAugspurger added the Docs and Blocker (blocking issue or pull request for an upcoming release) labels Jun 15, 2020

TomAugspurger added this to the 1.1 milestone Jun 15, 2020

jorisvandenbossche mentioned this issue Jun 16, 2020

DOC: move 'Other API changes' under correct section #34817

Merged

TomAugspurger added a commit to TomAugspurger/pandas that referenced this issue Jun 18, 2020

DOC: Move API breaking to appropriate sections

bacf5ea

xref pandas-dev#34801

TomAugspurger mentioned this issue Jun 18, 2020

DOC: Move API breaking to appropriate sections #34865

Closed

TomAugspurger mentioned this issue Jul 14, 2020

Move API changes to appropriate sections #35273

Merged

TomAugspurger closed this as completed Jul 17, 2020
