BUG: Inconsistent datetime comparison with Tz #12601

Closed
sinhrks opened this issue Mar 12, 2016 · 10 comments · Fixed by #21612
Labels
Bug Datetime Datetime data dtype Timezones Timezone data dtype
Comments

@sinhrks
Member

sinhrks commented Mar 12, 2016

Related to #8306. On current master, Timestamp comparison raises TypeError when the timezones differ (e.g. tz-aware vs. tz-naive). However, Index and Series implicitly convert the tz to GMT.

pd.Timestamp('2016-01-01 12:00', tz='US/Eastern') > pd.Timestamp('2016-01-01 08:00')
# TypeError: Cannot compare tz-naive and tz-aware timestamps

# same result as idx.tz_convert(None) > pd.Timestamp('2016-01-01 08:00')
idx = pd.date_range('2016-01-01 12:00', periods=10, freq='H', tz='Asia/Tokyo')
idx > pd.Timestamp('2016-01-01 08:00')
# array([False, False, False, False, False, False,  True,  True,  True,  True], dtype=bool)

Numeric ops raise TypeError as expected.

idx - pd.Timestamp('2016-01-01 08:00')
# TypeError: Timestamp subtraction must have the same timezones or no timezones
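
One way to sidestep the inconsistency, whatever a given pandas version does with the mixed comparison, is to make the timezone explicit on both sides before comparing. A minimal sketch, assuming the naive timestamp is meant as UTC (which is what the implicit conversion above effectively assumes); the variable names are illustrative only:

```python
import pandas as pd

# Same index as in the report; a Timedelta is used as the frequency to avoid
# spelling differences in the hourly alias across pandas versions.
idx = pd.date_range("2016-01-01 12:00", periods=10,
                    freq=pd.Timedelta(hours=1), tz="Asia/Tokyo")
naive = pd.Timestamp("2016-01-01 08:00")

# Option 1: make the scalar tz-aware (assumption: it represents UTC).
mask_aware = idx > naive.tz_localize("UTC")

# Option 2: drop the tz from the index (converts to UTC wall time first).
mask_naive = idx.tz_convert(None) > naive

print(mask_aware.tolist())
```

Both options compare like with like, so the result does not depend on any implicit conversion.
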
@sinhrks sinhrks added Bug Datetime Datetime data dtype Timezones Timezone data dtype labels Mar 12, 2016
@sinhrks sinhrks added this to the 0.18.1 milestone Mar 12, 2016
@gliptak
Contributor

gliptak commented Mar 13, 2016

I opened numpy/numpy#7390. Does this belong to pandas instead? Thanks

@jreback
Contributor

jreback commented Mar 13, 2016

this is an invalid dtype for numpy and not defined there
further what you are doing doesn't make any sense

@gliptak
Contributor

gliptak commented Mar 13, 2016

@jreback I'm validating that the ts column has datetime64 with timezone (just comparing it to datetime64 fails ...). How would this need to be coded?

@jreback
Contributor

jreback commented Mar 13, 2016

use .select_dtypes, or com.is_datetimelike or com.is_datetime64tz_dtype

numpy doesn't know about/respect this (it's really a bug in the dtype definition and i don't know when/if it will ever be fixed/allowed).
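
A minimal sketch of such a check, assuming a recent pandas where the public names `pd.DatetimeTZDtype` and `pd.api.types.is_datetime64_any_dtype` are available (these stand in for the internal `com` helpers named above; the DataFrame is made up for illustration):

```python
import pandas as pd

# Hypothetical frame with one tz-aware datetime column.
df = pd.DataFrame({
    "price": [1.0, 2.0],
    "ts": pd.to_datetime(["2016-03-10 23:20", "2016-03-11 09:00"])
            .tz_localize("US/Eastern"),
})

# Select tz-aware datetime columns by dtype family.
tz_cols = df.select_dtypes(include=["datetimetz"]).columns.tolist()

# Check a single column without comparing against a numpy dtype.
is_tz_aware = isinstance(df["ts"].dtype, pd.DatetimeTZDtype)
is_any_datetime = pd.api.types.is_datetime64_any_dtype(df["ts"])

print(tz_cols, is_tz_aware, is_any_datetime)
```

Checking the dtype's type avoids an equality test against `np.dtype('datetime64[ns]')`, which cannot express a timezone.
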

@jreback
Contributor

jreback commented Mar 13, 2016

Here is also the method to coerce. EDT is not a timezone, and what dateutil is doing is wrong and doesn't give you anything useful.

In [34]: df = pd.DataFrame(["Mar 10, 2016 11:20 PM EDT"], columns=['ts'])

In [35]: pd.to_datetime(df['ts']).astype('datetime64[us, US/Eastern]')
Out[35]: 
0   2016-03-10 23:20:00-05:00
Name: ts, dtype: datetime64[ns, US/Eastern]
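
An alternative sketch of the same coercion, assuming the abbreviation is stripped from the string beforehand and the zone is attached explicitly with `tz_localize` (the explicit `format` and the variable names are choices made for this example, not something from the session above):

```python
import datetime as dt
import pandas as pd

# Wall-clock time without the ambiguous "EDT" suffix.
raw = pd.Series(["Mar 10, 2016 11:20 PM"], name="ts")

# Parse as naive, then attach a real IANA zone.
parsed = pd.to_datetime(raw, format="%b %d, %Y %I:%M %p")
localized = parsed.dt.tz_localize("US/Eastern")

print(localized.iloc[0].utcoffset() == dt.timedelta(hours=-5))
```

The UTC offset is then derived from the zone's rules for that date rather than from the abbreviation in the input.
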

@gliptak
Contributor

gliptak commented Mar 13, 2016

Thank you for the pointers.

In [4]: df = pd.DataFrame([parse("Mar 10, 2016 11:20 PM EDT")], columns=['ts'])
In [16]: df['ts'] = pd.to_datetime(df['ts']).astype('datetime64[us, US/Eastern]')
In [19]: df.dtypes['ts'] == np.dtype('datetime64[ns]')
Out[19]: False

So how am I to compare? Thanks

@jreback
Contributor

jreback commented Mar 13, 2016

what are you trying to do? why do you need to compare? what are you comparing? most ops will simply work, you rarely actually need to compare things, if you need to sub-select use .select_dtypes(...) as I indicated.

@gliptak
Contributor

gliptak commented Mar 13, 2016

Sorry, I didn't offer context. I came across this while working on unit tests for pydata/pandas-datareader#188

dtypes = [np.dtype(x) for x in ['float64', 'float64', 'datetime64[ns]']]
tm.assert_series_equal(df.dtypes, pd.Series(dtypes, index=exp_columns))

I had to force no timezone for the compare above to succeed ...
Could you show how to rewrite df.dtypes['ts'] == np.dtype('datetime64[ns]') with .select_dtypes(...)?
Thanks

@jreback
Contributor

jreback commented Mar 13, 2016

@sinhrks
Member Author

sinhrks commented Mar 14, 2016

@gliptak I don't quite understand either. pydata/pandas-datareader#188 is merged and the tests have passed. Pls update pydata/pandas-datareader#188 if there is any problem. I assume this issue is unrelated to yours.

@jreback jreback modified the milestones: 0.18.2, 0.18.1 Apr 26, 2016
@jorisvandenbossche jorisvandenbossche modified the milestones: 0.20.0, 0.19.0 Aug 21, 2016
@jreback jreback modified the milestones: 0.20.0, Next Major Release Mar 23, 2017
@jreback jreback modified the milestones: Next Major Release, 0.24.0 Jun 25, 2018

4 participants