BUG: Inconsistency in datetime & timedelta binary operations in pandas-2.0 #52295

Closed
3 tasks done
galipremsagar opened this issue Mar 30, 2023 · 3 comments · Fixed by #52821
Labels
Bug Numeric Operations Arithmetic, Comparison, and Logical operations Timedelta Timedelta data type

Comments

@galipremsagar

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

In [34]: s = pd.Series([1233242342344, 232432434324, 332434242344], dtype='datetime64[ms]')

In [35]: s - np.datetime64("nat", 'ms')
Out[35]: 
0   NaT
1   NaT
2   NaT
dtype: timedelta64[ns]            # Can we preserve `ms` instead of `ns` here?

In [36]: s
Out[36]: 
0   2009-01-29 15:19:02.344
1   1977-05-14 04:33:54.324
2   1980-07-14 14:50:42.344
dtype: datetime64[ms]

In [37]: s - np.datetime64("nat")
Out[37]: 
0   NaT
1   NaT
2   NaT
dtype: timedelta64[ns]                # Can we preserve `ms` instead of `ns` here?

In [38]: s + np.timedelta64("nat")
Out[38]: 
0   NaT
1   NaT
2   NaT
dtype: datetime64[ns]                 # Can we preserve `ms` instead of `ns` here?

In [39]: s + np.timedelta64("nat", 'ms')
Out[39]: 
0   NaT
1   NaT
2   NaT
dtype: datetime64[ns]                # Can we preserve `ms` instead of `ns` here?



In [42]: p = pd.Series([None, None, None], dtype='datetime64[ms]')

In [43]: p
Out[43]: 
0   NaT
1   NaT
2   NaT
dtype: datetime64[ms]

In [44]: s - p
Out[44]: 
0   NaT
1   NaT
2   NaT
dtype: timedelta64[ms]                   # This looks good.

Issue Description

When we perform a binary operation against a NaT scalar, the result always seems to be cast to ns resolution, whereas with a NaT Series we consistently get the expected dtype. Can we make this behavior consistent in the scalar NaT cases too?

Expected Behavior

Array-vs-array and array-vs-scalar binary operations should yield the same time resolution.
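
To make the comparison easy to re-run, here is a small sketch that collects the result dtypes for the scalar and Series operands from the example above. The helper name `binop_result_dtypes` is made up for illustration and is not part of pandas. No expected output is given because the dtypes depend on the pandas version installed.

```python
import numpy as np
import pandas as pd


def binop_result_dtypes():
    """Hypothetical helper (not a pandas API): map each operation to its result dtype."""
    s = pd.Series(
        [1233242342344, 232432434324, 332434242344], dtype="datetime64[ms]"
    )
    nat_series = pd.Series([None, None, None], dtype="datetime64[ms]")

    results = {
        # Array-vs-scalar cases from the report.
        "series - NaT datetime scalar (ms)": s - np.datetime64("NaT", "ms"),
        "series + NaT timedelta scalar (ms)": s + np.timedelta64("NaT", "ms"),
        # Array-vs-array case that already preserves the unit.
        "series - NaT series (ms)": s - nat_series,
    }
    return {label: str(result.dtype) for label, result in results.items()}


for label, dtype in binop_result_dtypes().items():
    print(f"{label}: {dtype}")
```

If the behavior is consistent, the scalar and Series subtraction cases should report the same timedelta resolution.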

Installed Versions

INSTALLED VERSIONS

commit : c2a7f1a
python : 3.10.10.final.0
python-bits : 64
OS : Linux
OS-release : 4.15.0-76-generic
Version : #86-Ubuntu SMP Fri Jan 17 17:24:28 UTC 2020
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8

pandas : 2.0.0rc1
numpy : 1.23.5
pytz : 2023.2
dateutil : 2.8.2
setuptools : 67.6.0
pip : 23.0.1
Cython : 0.29.33
pytest : 7.2.2
hypothesis : 6.70.1
sphinx : 5.3.0
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 3.1.2
IPython : 8.11.0
pandas_datareader: None
bs4 : 4.12.0
bottleneck : None
brotli :
fastparquet : None
fsspec : 2023.3.0
gcsfs : None
matplotlib : None
numba : 0.56.4
numexpr : None
odfpy : None
openpyxl : 3.1.2
pandas_gbq : None
pyarrow : 11.0.0
pyreadstat : None
pyxlsb : None
s3fs : 2023.3.0
scipy : 1.10.1
snappy :
sqlalchemy : 1.4.46
tables : None
tabulate : 0.9.0
xarray : None
xlrd : None
zstandard : None
tzdata : None
qtpy : None
pyqt5 : None

@galipremsagar galipremsagar added Bug Needs Triage Issue that has not been reviewed by a pandas team member labels Mar 30, 2023
@jbrockmendel
Member

Agreed it would be nice to retain unit in these. Looks like in core.ops.array_ops.maybe_prepare_scalar_for_op we cast the numpy scalars to nanosecond unit. Disabling that fixes two of the cases above, but causes us to raise in the cases where the numpy scalars have no unit.

@DeaMariaLeon DeaMariaLeon added Timedelta Timedelta data type and removed Needs Triage Issue that has not been reviewed by a pandas team member labels Mar 30, 2023
@galipremsagar
Author

> but causes us to raise in the cases where the numpy scalars have no unit.

I think it's reasonable to default to ns unit when there is no unit.
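
A minimal sketch of that fallback, using only NumPy. The helpers `unit_or_default` and `cast_scalar` are hypothetical names for illustration and are not the actual pandas implementation. It relies on `np.datetime_data`, which reports the unit `"generic"` for a `datetime64`/`timedelta64` scalar created without a unit.

```python
import numpy as np


def unit_or_default(scalar, default="ns"):
    """Hypothetical helper: return the scalar's unit, or `default` if it has none."""
    unit, _count = np.datetime_data(scalar.dtype)
    # NumPy uses the placeholder unit "generic" for unit-less scalars.
    return default if unit == "generic" else unit


def cast_scalar(scalar, default="ns"):
    """Hypothetical helper: keep an explicit unit, fall back to `default` otherwise."""
    unit = unit_or_default(scalar, default)
    kind = "datetime64" if scalar.dtype.kind == "M" else "timedelta64"
    return scalar.astype(f"{kind}[{unit}]")


print(cast_scalar(np.datetime64("NaT", "ms")).dtype)  # explicit unit is kept
print(cast_scalar(np.timedelta64("NaT")).dtype)       # unit-less scalar falls back to ns
```

With this approach, only unit-less scalars are cast to ns, while scalars that carry a unit keep it.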

@jbrockmendel
Member

PR would be welcome

@jbrockmendel jbrockmendel added the Numeric Operations Arithmetic, Comparison, and Logical operations label Mar 31, 2023