API: setitem copy/view behavior ndarray vs Categorical vs other EA #38896
Comments
cc @jorisvandenbossche I noticed a funky behavior with IntegerArray/FloatingArray that I'd like to get your thoughts on.
In this case the _can_hold_element is pretty clearly wrong. It ends up not breaking anything because, as in the OP, the call to
My question for you is: what about when we have an IntegerArray/FloatingArray without NAs, so can_hold_element=True is correct? Would you expect it to set the values inplace or upcast to the nullable dtype?
When is
So currently we do change the dtype, but you are saying that in case there are no nulls, the result could preserve the "int64" dtype? Since the rule for
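A minimal sketch of how one might probe this question on a given pandas version. The frame, the column name "A", and the values are made up for illustration. The resulting dtype is printed rather than asserted, since which of the two outcomes you get is exactly what is being discussed and may differ between pandas versions.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with a plain NumPy int64 column.
df = pd.DataFrame({"A": np.array([1, 2, 3], dtype="int64")})

# A nullable IntegerArray that happens to contain no missing values.
new_values = pd.array([4, 5, 6], dtype="Int64")

# Set every row of the column through loc.
df.loc[:, "A"] = new_values

# "int64" suggests the values were written into the existing NumPy column;
# "Int64" suggests the column was replaced by the nullable array.
result_dtype = str(df["A"].dtype)
print(pd.__version__, result_dtype)
```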
Gentle ping @jreback @TomAugspurger I think Joris and I are in agreement, is there consensus on the propositions that a) b)
I think those rules are sensible.
yep i agree here, these are the de-facto rules that we have had for quite a long time (at least implicitly). can we also write these down explicitly in the docs?
I ran into this problem when implementing it for ArrayManager, and I see now that Brock also noted it at #39163 (comment): one problem with this general rule is that you then can't set a new column positionally any longer using
I think there should at least be some way to do this in all cases. The general recommendation is then that you can look up the name with the position, like
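One possible reading of the recommendation above, sketched under the assumption that column labels are unique: look up the label at a given position through df.columns, then assign with plain df[label] = ..., which replaces the column instead of writing into the existing array. The frame, the position, and the values are made up for illustration.

```python
import pandas as pd

# Hypothetical frame; the labels are unique, which this approach relies on.
df = pd.DataFrame({"A": [1.0, 2.0, 3.0], "B": [4.0, 5.0, 6.0]})

position = 0
label = df.columns[position]  # look up the column name from its position

# Plain __setitem__ with a label replaces the whole column, so the new
# dtype is taken from the value being assigned.
df[label] = pd.Categorical(["x", "y", "z"])

print(df.dtypes)
```

With duplicate column labels, df[label] addresses more than one column, so this lookup is not a complete substitute for a positional setter.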
Agreed. Brainstorming:
Option 3:
(the loc/iloc part is not yet resolved, I think)
@jorisvandenbossche we've got the 1.3 milestone on this one (added for the merged PR). do we need a new issue/remove the milestone?
i'll remove the milestone for now.
@jorisvandenbossche i think the remaining aspect of this is better covered in #44353 and that this is closable. pls confirm.
Looks like the discussion stalled here, if there even were remaining items. If there are follow-ups, it is probably best to open a new issue to pare down the discussion. Closing.
xref #33457, which is about a similar issue but goes through different code paths.
In Block.setitem, in cases where we are setting all the values for this block, we have:
So we overwrite the existing values for categorical value or non-EA value. Example:
The categorical behavior was implemented in #23393, and AFAICT the over-writing behavior was not discussed/intentional. Similarly, the other EA behavior was implemented in #32479, and I don't see anything about the overwrite-or-not. I haven't tracked down the origin of the non-EA behavior.
I think all three cases should have the same behavior. We should also have the same behavior for should-be-equivalent setters, e.g. if we used iloc instead of loc, or [:, "A"] instead of [range(3), "A"].
I think I agree with @TomAugspurger's comment that these should always be in-place, but not sure ATM if that can be done without breaking consistency elsewhere.
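A rough way to compare the should-be-equivalent setters mentioned above on whatever pandas version is installed. This is only a sketch: the helper name dtype_after, the frame, and the values are hypothetical, and the resulting dtype of column A is used as a proxy for whether the setter wrote into the existing float64 array (dtype stays float64) or swapped in the new array (dtype becomes int64). The outcomes are printed, not asserted, because they are what the issue is about.

```python
import numpy as np
import pandas as pd


def dtype_after(setter):
    """Apply `setter` to a fresh float64 frame and return column A's dtype."""
    df = pd.DataFrame({"A": [1.0, 2.0, 3.0]})
    new_values = np.array([10, 20, 30], dtype="int64")
    setter(df, new_values)
    # Whatever the dtype, the new values must be visible afterwards.
    assert df["A"].tolist() == [10, 20, 30]
    return str(df["A"].dtype)


def set_loc_slice(df, values):
    df.loc[:, "A"] = values


def set_loc_range(df, values):
    df.loc[range(3), "A"] = values


def set_iloc_slice(df, values):
    df.iloc[:, 0] = values


setters = {
    'loc[:, "A"]': set_loc_slice,
    'loc[range(3), "A"]': set_loc_range,
    "iloc[:, 0]": set_iloc_slice,
}

results = {name: dtype_after(setter) for name, setter in setters.items()}
for name, dtype in results.items():
    print(f"{name:>20} -> {dtype}")
```

If the setters are consistent in the sense argued for here, all three lines report the same dtype.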