-
-
Notifications
You must be signed in to change notification settings - Fork 18.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
API: dtlike.astype(inty) #49715
API: dtlike.astype(inty) #49715
Conversation
doc/source/whatsnew/v2.0.0.rst
Outdated
@@ -329,6 +329,7 @@ Other API changes | |||
- Default value of ``dtype`` in :func:`get_dummies` is changed to ``bool`` from ``uint8`` (:issue:`45848`) | |||
- :meth:`DataFrame.astype`, :meth:`Series.astype`, and :meth:`DatetimeIndex.astype` casting datetime64 data to any of "datetime64[s]", "datetime64[ms]", "datetime64[us]" will return an object with the given resolution instead of coercing back to "datetime64[ns]" (:issue:`48928`) | |||
- :meth:`DataFrame.astype`, :meth:`Series.astype`, and :meth:`DatetimeIndex.astype` casting timedelta64 data to any of "timedelta64[s]", "timedelta64[ms]", "timedelta64[us]" will return an object with the given resolution instead of coercing to "float64" dtype (:issue:`48963`) | |||
- :meth:`DatetimeIndex.astype`, :meth:`TimedeltaIndex.astype`, :meth:`PeriodIndex.astype` :meth:`Series.astype`, :meth:`DataFrame.astype` with ``datetime64``, ``timedelta64`` or :class:`PeriodDtype` dtypes no longer allow converting to integer dtypes other than "int64", do ``obj.astype('int64', copy=False).astype(dtype)`` instead (:issue:`??`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this right? eg you can use copy=False to actually do this? how so
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
passing copy=False here just avoids doing a double-copy, since the second .astype will always copy
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Curious why an exception is only made for int64? (Assuming TimedeltaIndex.astype(int) is the nanos representation), TimedeltaIndex(["1ns"])
should be able to fit in a smaller int type
Co-authored-by: Matthew Roeschke <[email protected]>
Also a suggestion in a follow up PR: Could you collect all the "dtype conversion" whatsnew notes in the |
My main concern is achieving consistency between Index/Series behavior. ATM Series raises for anything other than i8/u8 (see removed code in core.dtypes.astype). So the question is which way to go to achieve consistency. I prefer to go towards more-strict rather than less.
|
…andas into api-astype-dtlike-to-ints
If I understand #45034 correctly, it's generally concluded that it's okay to disallow casting ints except for int64? cc @jorisvandenbossche |
That is not my reading of that issue .. I mentioned several times there that I personally don't see why such restriction is needed (while I also said that I don't think this is very important, though) It's also doing something different than what we actually said would change in the deprecation warnings ("In a future version, this astype will return exactly the specified dtype instead of int64, and will raise if that conversion overflows.")
In a perfect world, there would be no |
yah, i reconsidered and changed my mind about that deprecation bc it is value-dependent behavior and in most cases just a very weird thing to .astype to and likely to be unintentional. |
Personally I don't have a strong opinion on which direction to go as this astyping is strange. I could maybe interpret a user trying to do this to try to get an epoch timestamp/total seconds(?)/not sure what for Period. Agreed that changing this to achieve consistency between Series/Index is important though. I suppose I also like the stricter Series behavior to just signal that this astyping is ill defined |
im now thinking the most likely case where a user would pass something other than int64 is if they pass we should disallow anything other than int64 (which is still more than pyarrow allows) |
I am with @jbrockmendel here, I think we should only allow |
I think int64 makes the most sense, int32 overflows real fast. I had similar issues to what @jbrockmendel mentioned while building a windows service in the past, this was super annoying. Only allowing int64 increases usability imo |
@MarcoGorelli thoughts here? |
No objections (no strong opinion either though, this isn't a topic I've delved into yet) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is fine as is. Although it's slightly different from the original deprecation message the exception at least gives an alternative to convert to another int type
Thanks @jbrockmendel |
doc/source/whatsnew/vX.X.X.rst
file if fixing a bug or adding a new feature.