Error time slicing for some CMIP6 models #3627

Closed
mikebyrne6 opened this issue Dec 16, 2019 · 10 comments

@mikebyrne6

Hi there,

I'm using xarray's open_zarr() function to analyse CMIP6 data:

import gcsfs
import xarray as xr

# Function to load data: df_data is the catalogue for the variable of interest
def load_data(df_data, source_id, expt_id):
    """
    Load data for given variable, source and expt ids.
    """
    uri = df_data[(df_data.source_id == source_id) &
                  (df_data.experiment_id == expt_id)].zstore.values[0]
    
    gcs = gcsfs.GCSFileSystem(token='anon')
    ds = xr.open_zarr(gcs.get_mapper(uri), consolidated=True)
    return ds

# Just test with 1 model for now:
source_ids_tmp = ['CESM2']

for model_name in source_ids_tmp:
    print('\n\nStarting ' + model_name +'\n')
    ds_hist = load_data(df_mon_tas, model_name, experiment_ids[0]).sel(time=slice('1976', '2005'))

Problem Description

However, the time slicing fails with the following error:

TypeError: cannot compare netcdftime._netcdftime.DatetimeNoLeap(1932, 7, 15, 12, 0, 0, 0, 1, 196) and '1976'

Looking at the Dataset metadata, the time variable is described as follows:

  • time (time) object 1850-01-15 12:00:00 ... 2014-12-15 12:00:00
    time_bnds (time, nbnd) object dask.array<chunksize=(1980, 2)>

For another CMIP6 model (MIROC6), the time slicing works fine. That model has the following metadata for time:

  • time (time) datetime64[ns] 1850-01-16T12:00:00 ... 2014-12-16T12:00:00
    time_bnds (time, bnds) datetime64[ns] dask.array<chunksize=(1980, 2)>

When the time variable is converted to a datetime64[ns] object, time slicing seems to work. Any idea what the problem is?

Thanks,

Mike

Output of xr.show_versions()

INSTALLED VERSIONS
------------------
commit: None
libhdf5: 1.10.1
libnetcdf: 4.4.1.1

xarray: 0.14.0
pandas: 0.25.1
numpy: 1.16.2
scipy: 1.2.1
netCDF4: 1.3.1
pydap: installed
h5netcdf: 0.5.0
h5py: 2.7.0
Nio: None
zarr: 2.3.2
cftime: None
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.2.1
dask: 0.15.3
distributed: 1.19.1
matplotlib: 3.0.0
cartopy: 0.16.0
seaborn: 0.9.0
numbagg: None
setuptools: 36.5.0.post20170921
pip: 19.0.3
conda: 4.4.6
pytest: 3.3.0
IPython: 6.1.0
sphinx: 1.6.3

@spencerkclark
Member

spencerkclark commented Dec 16, 2019

Hi @mikebyrne6 -- it looks like cftime is not installed on your system. Could you install it and try again? Time indexing for non-standard calendars, which xarray implements through a CFTimeIndex, is only supported when cftime is installed.

$ conda install -c conda-forge cftime
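
A missing optional dependency like this can be checked before opening any data. The following is a minimal sketch, not part of xarray: `missing_modules` is a hypothetical helper that uses only the standard library to report which of the named top-level modules the running interpreter cannot import.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that this interpreter cannot import."""
    # find_spec returns None for a top-level module that is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


# Modules used in this thread; cftime is the one needed for non-standard calendars.
missing = missing_modules(["xarray", "cftime", "zarr", "gcsfs"])
if missing:
    print("Not importable: " + ", ".join(missing))
else:
    print("All listed modules are importable.")
```

If cftime shows up as not importable, the `cftime: None` entry in the `xr.show_versions()` output above tells the same story.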

@mikebyrne6
Author

mikebyrne6 commented Dec 16, 2019

Hi @spencerkclark,

Thanks a lot for the quick response! Indeed I did not have cftime installed. After installing, the time-slicing error has evolved:

~/anaconda3/lib/python3.6/site-packages/xarray/coding/cftimeindex.py in _parse_iso8601_with_reso(date_type, timestr)
114 # 1.0.3.4.
115 replace["dayofwk"] = -1
--> 116 return default.replace(**replace), resolution
117
118

cftime/_cftime.pyx in cftime._cftime.datetime.replace()

ValueError: Replacing the dayofyr or dayofwk of a datetime is not supported.

Any ideas here?

Cheers,

Mike

@spencerkclark
Member

Ah yes, there were some changes in the latest version of cftime that we needed to accommodate in xarray (see #3430). Try upgrading xarray to version 0.14.1 and I think you should be good:

$ conda upgrade -c conda-forge xarray

@mikebyrne6
Author

Thanks again @spencerkclark! I upgraded xarray as suggested but I'm still getting the error directly above...

@spencerkclark
Member

Could you add a few more details to your example above so that I can try reproducing the issue? I think example values of df_mon_tas and experiment_ids would be all I need.

@mikebyrne6
Author

The data I'm trying to work with are for CESM2 ('historical' simulation) and are stored in the Google Cloud at:

gs://cmip6/CMIP/NCAR/CESM2/historical/r1i1p1f1/Amon/tas/gn/

@spencerkclark
Member

Hmm...I can't seem to reproduce the issue. Are you sure you are using the latest release of xarray?

In [1]: import gcsfs; import xarray as xr

In [2]: gcs = gcsfs.GCSFileSystem(token='anon')

In [3]: mapper = gcs.get_mapper('gs://cmip6/CMIP/NCAR/CESM2/historical/r1i1p1f1/
   ...: Amon/tas/gn/')

In [4]: ds = xr.open_zarr(mapper, consolidated=True)

In [5]: ds.sel(time=slice('1975', '2005'))
Out[5]:
<xarray.Dataset>
Dimensions:    (lat: 192, lon: 288, nbnd: 2, time: 372)
Coordinates:
  * lat        (lat) float64 -90.0 -89.06 -88.12 -87.17 ... 88.12 89.06 90.0
    lat_bnds   (lat, nbnd) float32 dask.array<chunksize=(192, 2), meta=np.ndarray>
  * lon        (lon) float64 0.0 1.25 2.5 3.75 5.0 ... 355.0 356.2 357.5 358.8
    lon_bnds   (lon, nbnd) float32 dask.array<chunksize=(288, 2), meta=np.ndarray>
  * time       (time) object 1975-01-15 12:00:00 ... 2005-12-15 12:00:00
    time_bnds  (time, nbnd) object dask.array<chunksize=(372, 2), meta=np.ndarray>
Dimensions without coordinates: nbnd
Data variables:
    tas        (time, lat, lon) float32 dask.array<chunksize=(300, 192, 288), meta=np.ndarray>
Attributes:
    Conventions:            CF-1.7 CMIP-6.2
    activity_id:            CMIP
    branch_method:          standard
    branch_time_in_child:   674885.0
    branch_time_in_parent:  219000.0
    case_id:                15
    cesm_casename:          b.e21.BHIST.f09_g17.CMIP6-historical.001
    contact:                [email protected]
    creation_date:          2019-01-16T23:34:05Z
    data_specs_version:     01.00.29
    experiment:             all-forcing simulation of the recent past
    experiment_id:          historical
    external_variables:     areacella
    forcing_index:          1
    frequency:              mon
    further_info_url:       https://furtherinfo.es-doc.org/CMIP6.NCAR.CESM2.h...
    grid:                   native 0.9x1.25 finite volume grid (192x288 latxlon)
    grid_label:             gn
    initialization_index:   1
    institution:            National Center for Atmospheric Research, Climate...
    institution_id:         NCAR
    license:                CMIP6 model data produced by <The National Center...
    mip_era:                CMIP6
    model_doi_url:          https://doi.org/10.5065/D67H1H0V
    nominal_resolution:     100 km
    parent_activity_id:     CMIP
    parent_experiment_id:   piControl
    parent_mip_era:         CMIP6
    parent_source_id:       CESM2
    parent_time_units:      days since 0001-01-01 00:00:00
    parent_variant_label:   r1i1p1f1
    physics_index:          1
    product:                model-output
    realization_index:      1
    realm:                  atmos
    source:                 CESM2 (2017): atmosphere: CAM6 (0.9x1.25 finite v...
    source_id:              CESM2
    source_type:            AOGCM BGC
    sub_experiment:         none
    sub_experiment_id:      none
    table_id:               Amon
    tracking_id:            hdl:21.14100/d9a7225a-49c3-4470-b7ab-a8180926f839
    variable_id:            tas
    variant_info:           CMIP6 20th century experiments (1850-2014) with C...
    variant_label:          r1i1p1f1
    status:                 2019-10-25;created;by [email protected]

Output of xr.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.7.3 | packaged by conda-forge | (default, Jul 1 2019, 14:38:56)
[Clang 4.0.1 (tags/RELEASE_401/final)]
python-bits: 64
OS: Darwin
OS-release: 19.0.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.5
libnetcdf: None

xarray: 0.14.1
pandas: 0.25.0
numpy: 1.17.0
scipy: 1.3.1
netCDF4: None
pydap: installed
h5netcdf: 0.7.4
h5py: 2.9.0
Nio: None
zarr: 2.3.2
cftime: 1.0.4.2
nc_time_axis: None
PseudoNetCDF: None
rasterio: 1.0.25
cfgrib: 0.9.7.1
iris: None
bottleneck: 1.2.1
dask: 2.9.0+2.gd0daa5bc
distributed: 2.9.0
matplotlib: 3.1.1
cartopy: None
seaborn: 0.9.0
numbagg: installed
setuptools: 42.0.2.post20191201
pip: 19.2.2
conda: None
pytest: 5.0.1
IPython: 7.10.1
sphinx: None

@mikebyrne6
Author

You're right, I'm still running version 0.14.0... Somehow the upgrade to 0.14.1 did not work, will try again...
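
When an upgrade appears not to take effect, it can help to ask the running interpreter which version it actually sees. This is a minimal sketch, assuming Python 3.8 or later (for `importlib.metadata`); `installed_version` and `version_tuple` are hypothetical helpers, and the simple parser only handles plain numeric versions such as 0.14.1.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def version_tuple(version):
    """Convert a plain numeric version such as '0.14.1' to a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


required = "0.14.1"  # the release suggested earlier in this thread
found = installed_version("xarray")
if found is None:
    print("xarray is not installed in this environment")
else:
    print("xarray " + found + " found; the thread suggests at least " + required)

# Tuples compare element by element, so 0.14.0 sorts before 0.14.1.
print(version_tuple("0.14.0") < version_tuple(required))
```

Running this with the same interpreter that runs the analysis matters, since a conda upgrade applied to a different environment would leave the old version in place. Inside a session, `xr.__version__` reports the same information.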

@mikebyrne6
Author

All sorted @spencerkclark, many thanks for so generously helping out an xarray beginner!

@rabernat
Contributor

I hope you enjoy drinking our kool-aid @mikebyrne6! Thanks for a useful bug report. I predict we will be seeing more of you around here. 😉
