Taking first row from each group in groupby sometimes strips tzinfo #10668
Comments
And according to this comment the same thing happens if you replace cell [4] with the more pandonic line:
|
It's a bug. I thought we had an issue for this already, but can't seem to find it. |
I can also confirm bug for But does work correctly for |
This is all OK on master, so all this issue probably needs is a few confirming tests. @cfperez, @louispotok interested in a pull request?
|
@jreback I'm working off the latest commit, and the problem now is that the timestamp is wrong (exactly 8 hours off, reflecting the timezone difference) even while the timezone is preserved. Note that Also, why don't these two methods return the same indices? In your example, I can add tests but still think this is a bug (and am unsure how deep the rabbit hole goes). |
Ahh, wasn't paying enough attention. OK, will mark it as a bug again then. |
Dupe of #12716. |
better example I think in #15426 |
xref #12898 (same fix)
(c.f. http://stackoverflow.com/questions/31617084/how-to-have-groupby-first-not-remove-timezone-info-from-datetime-columns)
Take a dataframe with a column of tz-aware datetime.datetime objects, and group it by a different column, then return the first row from each group. There are some ways to do this that leave the datetime as it is; and then at least two ways that convert it to a tz-naive pandas Timestamp object.
And apparently grouped.apply(lambda x: x.iloc[0]) does the same as .first().
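For anyone wanting to check their own pandas version, here is a minimal sketch of the setup described above. It is not the original notebook: the column names "key" and "when", the sample values, and the fixed UTC-08:00 offset are all assumptions made for illustration. It takes the first row of each group four ways and reports, for each, whether the tz survives and whether the value still matches the original instant. No expected output is given, since the result depends on the pandas version.

```python
import datetime as dt

import pandas as pd

# Illustrative data only: the column names "key" and "when" and the fixed
# UTC-08:00 offset are assumptions, not taken from the original report.
tz = dt.timezone(dt.timedelta(hours=-8))
df = pd.DataFrame(
    {
        "key": ["a", "a", "b", "b"],
        "when": [
            dt.datetime(2015, 7, 24, 9, 0, tzinfo=tz),
            dt.datetime(2015, 7, 24, 10, 0, tzinfo=tz),
            dt.datetime(2015, 7, 24, 11, 0, tzinfo=tz),
            dt.datetime(2015, 7, 24, 12, 0, tzinfo=tz),
        ],
    }
)

# Selecting [["when"]] keeps every result a DataFrame with a "when" column.
grouped = df.groupby("key")[["when"]]

# Four ways of taking the first row of each group.
candidates = {
    "first()": grouped.first(),
    "nth(0)": grouped.nth(0),
    "head(1)": grouped.head(1),
    "apply(iloc[0])": grouped.apply(lambda x: x.iloc[0]),
}

# First row of group "a" in the original frame (tz-aware).
expected = df["when"].iloc[0]


def check(result):
    """Return (tz_kept, same_instant) for the first value of result["when"]."""
    value = result["when"].iloc[0]
    tz_kept = getattr(value, "tzinfo", None) is not None
    # Compare instants only when the result is tz-aware, to avoid mixing
    # naive and aware values in one comparison.
    same_instant = bool(tz_kept and value == expected)
    return tz_kept, same_instant


report = {name: check(result) for name, result in candidates.items()}

for name, (tz_kept, same_instant) in report.items():
    print(f"{name:15s} tz kept: {tz_kept}  same instant: {same_instant}")
```

A method that keeps the tz but reports "same instant: False" would correspond to the shifted-timestamp behaviour mentioned in the comments.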