xarray.merge exception : invalid type promotion #1952

Closed
NotSqrt opened this issue Mar 2, 2018 · 4 comments · Fixed by #1953

@NotSqrt
Contributor

NotSqrt commented Mar 2, 2018

Code Sample

import xarray
import pandas
import numpy

array1 = xarray.DataArray(
    [numpy.timedelta64('NaT')],
    dims=['time'],
    coords={'time': pandas.to_datetime(['2018-01-01'])},
    name='foo'
)

array2 = xarray.DataArray(
    [numpy.timedelta64(30, 's')],
    dims=['time'],
    coords={'time': pandas.to_datetime(['2018-01-02'])},
    name='foo'
)

xarray.merge([array1, array2])

Problem description

Merging arrays with identical dtypes should work.

The NaT value seems to be interpreted as float64, so xarray.core.dtypes.result_type concludes that the two arrays are not compatible.

It works with xarray==0.10.0 and fails with xarray==0.10.1.
I've pinpointed the issue to commit 2aa5b8a.

Workaround in the meantime:

xarray.merge([array1.astype(float), array2.astype(float)]).astype('timedelta64')

Expected Output

# expected :
xarray.DataArray(
    [numpy.timedelta64('NaT'), numpy.timedelta64(30, 's')],
    dims=['time'],
    coords={'time': pandas.to_datetime(['2018-01-01', '2018-01-02'])},
    name='foo'
)

Output of xr.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.6.2.final.0
python-bits: 64
OS: Linux
OS-release: 3.13.0-142-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: fr_FR.UTF-8
LOCALE: fr_FR.UTF-8

xarray: 0.10.0+dev60.g2aa5b8a
pandas: 0.22.0
numpy: 1.14.0
scipy: None
netCDF4: 1.3.1
h5netcdf: None
h5py: None
Nio: None
zarr: None
bottleneck: None
cyordereddict: None
dask: None
distributed: None
matplotlib: 2.1.1
cartopy: None
seaborn: None
setuptools: 38.4.0
pip: 9.0.1
conda: None
pytest: 3.3.2
IPython: 6.2.1
sphinx: None

Thanks!

@shoyer
Member

shoyer commented Mar 2, 2018

Thanks for the clear report!

This seems to boil down to an issue with xarray.core.dtypes.maybe_promote:

In [4]: from xarray.core import dtypes

In [5]: import numpy as np

In [7]: dtypes.maybe_promote(np.dtype('timedelta64[ns]'))
Out[7]: (dtype('float64'), nan)

In [9]: dtypes.maybe_promote(np.dtype('datetime64[ns]'))
Out[9]: (dtype('<M8[ns]'), numpy.datetime64('NaT'))

The source of the problem seems to be that numpy considers timedelta64 an integer subclass (?!?):

In [12]: issubclass(np.timedelta64, np.integer)
Out[12]: True
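
The same check can be repeated at the dtype level, which is closer to how a promotion helper would usually test its input. This is only a small sketch using standard NumPy calls; the variable names are illustrative and nothing here is taken from xarray's source.

```python
import numpy as np

# The scalar type hierarchy: timedelta64 sits under signedinteger/integer.
print(np.timedelta64.__mro__)

td = np.dtype("timedelta64[ns]")
dt = np.dtype("datetime64[ns]")

# An integer test written with issubdtype also matches timedelta64 ...
td_is_integer = np.issubdtype(td, np.integer)
# ... but not datetime64, which is why only timedelta64 is affected.
dt_is_integer = np.issubdtype(dt, np.integer)
# A dedicated timedelta64 test is available and matches as expected.
td_is_timedelta = np.issubdtype(td, np.timedelta64)

print(td_is_integer, dt_is_integer, td_is_timedelta)
```

So any chain of checks that tests for integers before testing for timedelta64 will route timedelta64 data into the integer branch.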

@shoyer
Member

shoyer commented Mar 2, 2018

It looks like the simple fix is to reorder the check for timedelta64 in maybe_promote to be above the check for integer. Any interest in putting together a PR? :)
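
A minimal sketch of what that reordering could look like. The function `maybe_promote_sketch` is a hypothetical stand-in written for illustration, and its list of branches is an assumption; it is not xarray's actual implementation or the code from the eventual PR.

```python
import numpy as np


def maybe_promote_sketch(dtype):
    """Return (promoted dtype, fill value) able to represent missing data.

    Hypothetical helper: only the order of the checks matters here.
    """
    dtype = np.dtype(dtype)
    if np.issubdtype(dtype, np.floating):
        return dtype, np.nan
    elif np.issubdtype(dtype, np.timedelta64):
        # Must come before the integer check, because timedelta64 is
        # also reported as a subtype of np.integer.
        return dtype, np.timedelta64("NaT")
    elif np.issubdtype(dtype, np.integer):
        # Plain integers have no missing-value marker, so use float64.
        return np.dtype("float64"), np.nan
    elif np.issubdtype(dtype, np.datetime64):
        return dtype, np.datetime64("NaT")
    else:
        return np.dtype(object), np.nan


td_dtype, td_fill = maybe_promote_sketch("timedelta64[ns]")
int_dtype, int_fill = maybe_promote_sketch(np.int32)
print(td_dtype, td_fill)
print(int_dtype, int_fill)
```

With the timedelta64 branch first, the dtype is preserved and NaT is used as the fill value, instead of falling through to float64 and nan.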

@NotSqrt
Contributor Author

NotSqrt commented Mar 2, 2018

Fixing maybe_promote helps, but then on my example, there's a FutureWarning:

xarray/core/duck_array_ops.py:152: FutureWarning: In the future, 'NAT == x' and 'x == NAT' will always be False.
  flag_array = (arr1 == arr2)

@shoyer
Member

shoyer commented Mar 2, 2018

I think that warning can be safely ignored in this case, but yes, that should also probably be silenced (see also #1652).
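
One possible way to silence it, sketched with the standard `warnings` module. `array_equiv_sketch` is a hypothetical helper for datetime64/timedelta64 arrays only, not the code in xarray/core/duck_array_ops.py; treating two NaT values as equal is an assumption about the intended semantics.

```python
import warnings

import numpy as np


def array_equiv_sketch(arr1, arr2):
    """Compare two datetime64/timedelta64 arrays, counting NaT == NaT as equal."""
    with warnings.catch_warnings():
        # Suppress only the NaT comparison FutureWarning quoted above.
        warnings.filterwarnings(
            "ignore", "In the future, 'NAT == x'", category=FutureWarning
        )
        flag_array = arr1 == arr2
    # Handle NaT explicitly so the result does not depend on how the
    # installed NumPy version evaluates NaT == NaT.
    flag_array = flag_array | (np.isnat(arr1) & np.isnat(arr2))
    return bool(flag_array.all())


a = np.array([np.timedelta64("NaT"), np.timedelta64(30, "s")], dtype="timedelta64[s]")
b = a.copy()
c = np.array([np.timedelta64(10, "s"), np.timedelta64(30, "s")], dtype="timedelta64[s]")
print(array_equiv_sketch(a, b), array_equiv_sketch(a, c))
```

The filter is scoped to the `with` block, so other FutureWarnings raised elsewhere stay visible.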
