DOC clarify inplace operation section in 1.5 whats_new #47433
this needs some editing
I don't know if this is an improvement. Can you try to make a minimal change instead?
doc/source/whatsnew/v1.5.0.rst
Outdated
.. code-block:: ipython

    In [3]: df.iloc[:, 0] = np.array([10, 11])
    In [3]: df.iloc[:, 0] = np.array([10, 11], dtype=np.int32)
why did you change this?
When the dtypes match, the old (pandas < 1.5) behaviour is already to update the underlying array in place:
import numpy as np
import pandas as pd
print(f"{pd.__version__=}")
values = np.arange(4).reshape(2, 2)
df = pd.DataFrame(values)
ser = df[0]
print(f"before setting column\n{ser}")
df.iloc[:, 0] = np.array([10, 11])
print(f"after setting column\n{ser}")
Output:
pd.__version__='1.4.2'
before setting column
0 0
1 2
Name: 0, dtype: int64
after setting column
0 10
1 11
Name: 0, dtype: int64
The whatsnew snippet was incorrect in implying that ser was not updated in place.
I used a different dtype for the right-hand side of the assignment so that the example falls in the case where:
- the old (pandas < 1.5) behaviour is not to update in place
- the new (pandas 1.5) behaviour is the same with an additional warning
- the future behaviour will be to update in place
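To make the three cases above concrete, here is a minimal sketch of the dtype-mismatch variant. It reuses the frame from the earlier snippet (int64 columns on most platforms) and records any warning instead of asserting one, because whether ser is updated and whether a warning appears depends on the installed pandas version, as listed above.

```python
import warnings

import numpy as np
import pandas as pd

values = np.arange(4).reshape(2, 2)
df = pd.DataFrame(values)
ser = df[0]  # taken before the assignment, to observe whether it changes

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # The right-hand side is int32 while the existing column is
    # int64 on most platforms, so the dtypes do not match.
    df.iloc[:, 0] = np.array([10, 11], dtype=np.int32)

print(f"{pd.__version__=}")
print(f"warnings: {[w.category.__name__ for w in caught]}")
print(f"df after setting column\n{df}")
print(f"ser after setting column\n{ser}")
```

Comparing the printed ser across pandas versions shows which of the three behaviours applies.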
doc/source/whatsnew/v1.5.0.rst
Outdated
    In [4]: ser
    Out[4]:
    0 0
    1 2
    Name: 0, dtype: int64

This behavior is deprecated. In a future version, setting an entire column with
iloc will attempt to operate inplace.
*Behavior with pandas 1.5* is the same but you get a ``FutureWarning``:
Leave the original note; this is not in line with our styling.
doc/source/whatsnew/v1.5.0.rst
Outdated
*Future behavior*:

In a future version, setting an entire column with ``iloc`` will attempt to
this is redundant
@@ -562,14 +562,15 @@ As ``group_keys=True`` is the default value of :meth:`DataFrame.groupby` and
raise a ``FutureWarning``. This can be silenced and the previous behavior
retained by specifying ``group_keys=False``.

.. _whatsnew_150.notable_bug_fixes.setitem_column_try_inplace:
.. _whatsnew_150.deprecations.setitem_column_try_inplace:
_ see also _whatsnew_130.notable_bug_fixes.setitem_column_try_inplace
Side comment: the `_ see also ...` line is not rendered in the HTML. I am not sure whether it was supposed to be a link to https://pandas.pydata.org/docs/dev/whatsnew/v1.3.0.html#try-operating-inplace-when-setting-values-with-loc-and-iloc
Yeah, it looks like it's supposed to reference that section. Mind correcting it? It can be in the body of the paragraph too, like: Please reference `this section <whatsnew_130.notable_bug_fixes.setitem_column_try_inplace>` of the 1.3 whatsnew file.
I have tried to make the diff smaller as advised
doc/source/whatsnew/v1.5.0.rst
Outdated
_ see also _whatsnew_130.notable_bug_fixes.setitem_column_try_inplace

Try operating inplace when setting values with ``loc`` and ``iloc``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Inplace operation when setting values with ``iloc``
Not 100% sure, but it seems the changes are only about .iloc, so I removed the .loc mention.
Actually, this happens with .loc too when setting the entire column, so I am reverting this change. In this case the warning is a bit misleading, as it mentions .iloc.
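A minimal sketch of the .loc variant, for illustration only. The frame and the new prices are made-up data. Per the comment above, pandas 1.5 is expected to emit the same FutureWarning as with .iloc; other versions may emit nothing, so the sketch only records what is raised.

```python
import warnings

import numpy as np
import pandas as pd

# Illustrative data, not taken from the whatsnew entry.
df = pd.DataFrame({"price": [10, 15]}, index=["book1", "book2"])

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # The slice selects every row, so the entire "price" column is set,
    # here with an int32 array that differs from the column's dtype.
    df.loc[:, "price"] = np.array([11, 18], dtype=np.int32)

for w in caught:
    print(f"{w.category.__name__}: {w.message}")
print(df)
```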
LGTM, added a couple of small things, but this makes sense.
If you want to spend the time, I think this section would be easier to read if the example had some more "interesting" data. The original arbitrary data already made the code hard to follow, and it is even harder now that the reader also needs to understand what is going on with the types.
Maybe something like:
pandas.DataFrame({'price': [10, 15]}, index=['book1', 'book2'])
new_prices = [11.50, 18.20]
Does this make sense?
@@ -595,7 +596,7 @@ iloc will attempt to operate inplace.
Maybe worth adding "...an entire column with a different type with iloc..." to the comment above?
doc/source/whatsnew/v1.5.0.rst
Outdated
*Future behavior*:
*New behavior*
From what we're saying, this is not new behavior but the future behavior after the deprecation, right? Or am I missing something? Also, there is another Future behavior block above that needs to be updated if I'm wrong. And the other behavior titles end with a colon.
My understanding is that DataFrame.isetitem was added in https://github.com/pandas-dev/pandas/pull/45333/files#diff-421998af5fe0fbc54b35773ce9b6289cae3e8ae607f81783af04ebf1fbcaf077R3690 so it only exists from pandas 1.5 onwards.
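For reference, a minimal sketch of how DataFrame.isetitem can be used, assuming pandas 1.5 or later and illustrative data. Unlike df.iloc[:, i] = value, isetitem does not try to write into the existing array; it always inserts a new one, so a Series taken from the column beforehand keeps its old values.

```python
import numpy as np
import pandas as pd

# Illustrative data, not taken from the whatsnew entry.
df = pd.DataFrame({"price": [10, 15]}, index=["book1", "book2"])
original_prices = df["price"]  # taken before the column is replaced

# Replace the column at position 0 with a new float array.
df.isetitem(0, np.array([11.50, 18.20]))

print(df)
print(original_prices)
```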
I did this in my last commit, I also showed
This seems super clear to me now, thanks a lot @lesteve
Thanks @lesteve
* DOC make inplace operation section in whats_new clearer
* tweak
* Can happen with both loc and iloc when setting entire column
* Use more meaningful data

Co-authored-by: Matthew Roeschke <[email protected]>
xref #47381
Coming from scikit-learn tests failing with the pandas development version, I found the whats_new entry not very helpful at all. This does a few things:
ser is not updated in place). This was not the case before: when the dtype matches, ser is updated in place, so the code snippet was not showing the right behaviour.

Some of the wording may not be completely accurate, as I don't have a very good grasp of the pandas internals. Feel free to suggest improvements!