Switch enable_cftimeindex to True by default #2516
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -71,10 +71,11 @@ One unfortunate limitation of using ``datetime64[ns]`` is that it limits the | |
native representation of dates to those that fall between the years 1678 and | ||
2262. When a netCDF file contains dates outside of these bounds, dates will be | ||
returned as arrays of :py:class:`cftime.datetime` objects and a :py:class:`~xarray.CFTimeIndex` | ||
can be used for indexing. The :py:class:`~xarray.CFTimeIndex` enables only a subset of | ||
the indexing functionality of a :py:class:`pandas.DatetimeIndex` and is only enabled | ||
when using the standalone version of ``cftime`` (not the version packaged with | ||
earlier versions ``netCDF4``). See :ref:`CFTimeIndex` for more information. | ||
will be used for indexing. :py:class:`~xarray.CFTimeIndex` enables a subset of | ||
the indexing functionality of a :py:class:`pandas.DatetimeIndex` and is only | ||
fully compatible with the standalone version of ``cftime`` (not the version | ||
packaged with earlier versions of ``netCDF4``). See :ref:`CFTimeIndex` for more | ||
information. | ||
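The year bounds quoted in this paragraph follow from ``datetime64[ns]`` storing times as signed 64-bit counts of nanoseconds since 1970-01-01. A minimal sketch using only the standard library (the variable names are illustrative, and precision is reduced to microseconds because that is all ``datetime`` resolves):

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)
INT64_MAX = 2**63 - 1  # largest nanosecond offset a signed 64-bit integer can hold

# datetime only resolves microseconds, so convert nanoseconds to microseconds.
span = timedelta(microseconds=INT64_MAX // 1000)

latest = EPOCH + span    # falls in April 2262
earliest = EPOCH - span  # falls in September 1677

print(earliest)
print(latest)
```

Because the earliest representable instant falls partway through 1677, 1678 is the first year that can be represented in full, which is presumably why the text gives the range as 1678 to 2262.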
|
||
Datetime indexing | ||
----------------- | ||
|
@@ -223,16 +224,26 @@ Through the standalone ``cftime`` library and a custom subclass of | |
functionality enabled through the standard :py:class:`pandas.DatetimeIndex` for | ||
dates from non-standard calendars or dates using a standard calendar, but | ||
outside the `Timestamp-valid range`_ (approximately between years 1678 and | ||
2262). This behavior has not yet been turned on by default; to take advantage | ||
of this functionality, you must have the ``enable_cftimeindex`` option set to | ||
``True`` within your context (see :py:func:`~xarray.set_options` for more | ||
information). It is expected that this will become the default behavior in | ||
xarray version 0.11. | ||
2262). | ||
|
||
For instance, you can create a DataArray indexed by a time | ||
coordinate with a no-leap calendar within a context manager setting the | ||
``enable_cftimeindex`` option, and the time index will be cast to a | ||
:py:class:`~xarray.CFTimeIndex`: | ||
.. note:: | ||
|
||
As of xarray version 0.11, by default, :py:class:`cftime.datetime` objects | ||
will be used to represent times (either in indexes, as a | ||
:py:class:`~xarray.CFTimeIndex`, or in data arrays with dtype object) if | ||
any of the following are true: | ||
|
||
- The dates are from a non-standard calendar | ||
- Any dates are outside the Timestamp-valid range. | ||
|
||
Otherwise pandas-compatible dates from a standard calendar will be | ||
represented with the ``np.datetime64[ns]`` data type, enabling the use of a | ||
:py:class:`pandas.DatetimeIndex` or arrays with dtype ``np.datetime64[ns]`` | ||
and their full set of associated features. | ||
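The rule in this note can be sketched as a small predicate. This is an illustration only, not xarray's decoding code: the function name ``uses_cftime``, the set of calendar names treated as standard, and the year-based range check are all assumptions made for the sketch.

```python
# Calendar names treated as "standard" here; this set is an assumption of the sketch.
STANDARD_CALENDARS = {"standard", "gregorian", "proleptic_gregorian"}

# Approximate Timestamp-valid range, expressed in whole years.
MIN_YEAR, MAX_YEAR = 1678, 2262


def uses_cftime(calendar, years):
    """Return True if the dates would be represented as cftime.datetime objects."""
    non_standard = calendar.lower() not in STANDARD_CALENDARS
    out_of_range = any(y < MIN_YEAR or y > MAX_YEAR for y in years)
    return non_standard or out_of_range


print(uses_cftime("noleap", [2000, 2001]))    # non-standard calendar
print(uses_cftime("standard", [2000, 2001]))  # standard calendar, dates in range
print(uses_cftime("standard", [1, 2]))        # standard calendar, dates out of range
```

Either condition on its own is enough: a no-leap calendar triggers the cftime path even when every date lies inside the Timestamp-valid range.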
|
||
For example, you can create a DataArray indexed by a time | ||
coordinate with dates from a no-leap calendar and a | ||
:py:class:`~xarray.CFTimeIndex` will automatically be used: | ||
|
||
.. ipython:: python | ||
|
||
|
@@ -241,27 +252,11 @@ coordinate with a no-leap calendar within a context manager setting the | |
|
||
dates = [DatetimeNoLeap(year, month, 1) for year, month in | ||
product(range(1, 3), range(1, 13))] | ||
with xr.set_options(enable_cftimeindex=True): | ||
da = xr.DataArray(np.arange(24), coords=[dates], dims=['time'], | ||
name='foo') | ||
da = xr.DataArray(np.arange(24), coords=[dates], dims=['time'], name='foo') | ||
|
||
.. note:: | ||
|
||
With the ``enable_cftimeindex`` option activated, a :py:class:`~xarray.CFTimeIndex` | ||
will be used for time indexing if any of the following are true: | ||
|
||
- The dates are from a non-standard calendar | ||
- Any dates are outside the Timestamp-valid range | ||
|
||
Otherwise a :py:class:`pandas.DatetimeIndex` will be used. In addition, if any | ||
variable (not just an index variable) is encoded using a non-standard | ||
calendar, its times will be decoded into :py:class:`cftime.datetime` objects, | ||
regardless of whether or not they can be represented using | ||
``np.datetime64[ns]`` objects. | ||
|
||
xarray also includes a :py:func:`~xarray.cftime_range` function, which enables | ||
creating a :py:class:`~xarray.CFTimeIndex` with regularly-spaced dates. For instance, we can | ||
create the same dates and DataArray we created above using: | ||
creating a :py:class:`~xarray.CFTimeIndex` with regularly-spaced dates. For | ||
instance, we can create the same dates and DataArray we created above using: | ||
|
||
.. ipython:: python | ||
|
||
|
@@ -317,13 +312,32 @@ For data indexed by a :py:class:`~xarray.CFTimeIndex` xarray currently supports: | |
|
||
.. ipython:: python | ||
|
||
da.to_netcdf('example.nc') | ||
xr.open_dataset('example.nc') | ||
da.to_netcdf('example-no-leap.nc') | ||
xr.open_dataset('example-no-leap.nc') | ||
|
||
.. note:: | ||
|
||
Currently resampling along the time dimension for data indexed by a | ||
:py:class:`~xarray.CFTimeIndex` is not supported. | ||
While much of the time series functionality that is possible for standard | ||
dates has been implemented for dates from non-standard calendars, there are | ||
still some remaining important features that have yet to be implemented, | ||
for example: | ||
|
||
- Resampling along the time dimension for data indexed by a | ||
:py:class:`~xarray.CFTimeIndex` | ||
- Built-in plotting of data with :py:class:`cftime.datetime` coordinate axes. | ||
Review comment: It would be great to link to the GitHub issues for these. |
||
|
||
If at any time you would like to restore the old default behavior, which was | ||
to attempt to decode datetimes into ``np.datetime64[ns]`` objects whenever | ||
possible (regardless of calendar type), you can set ``enable_cftimeindex`` to | ||
Review comment: I would rather encourage people to convert their data to a standard calendar via normal xarray/cftime APIs. That's a cleaner solution than setting global configuration. Can we add an example of this with existing APIs? Or maybe we should augment CFTimeIndex with a new
Reply: This is a really good point and would make things a lot easier to understand. Currently we use |
||
``False`` within your context when opening a file (see | ||
:py:func:`~xarray.set_options` for more information). For some use-cases | ||
this behavior may still be useful (e.g. to allow the use of some forms of | ||
resample with non-standard calendars); however, in this case one should take | ||
care to perform only operations that do not depend on differences between | ||
dates. Operations that do depend on them (e.g. differentiation, | ||
interpolation, or upsampling with resample) could introduce subtle and | ||
silent errors due to the difference in calendar types between the dates | ||
encoded in your data and the dates stored in memory. | ||
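To make the warning about date differences concrete, here is a minimal standard-library sketch comparing the same interval under a no-leap calendar and under the standard calendar. The helper ``noleap_ordinal`` is hypothetical and exists only for this illustration; it is not part of xarray or cftime.

```python
from datetime import date

# Days elapsed before the start of each month in a 365-day (no-leap) year.
DAYS_BEFORE_MONTH = [0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334]


def noleap_ordinal(year, month, day):
    """Day count for a date in a calendar where every year has 365 days."""
    return year * 365 + DAYS_BEFORE_MONTH[month - 1] + day


# Interval from 28 February 2000 to 1 March 2000 under each calendar.
noleap_days = noleap_ordinal(2000, 3, 1) - noleap_ordinal(2000, 2, 28)
gregorian_days = (date(2000, 3, 1) - date(2000, 2, 28)).days

print(noleap_days, gregorian_days)
```

The two intervals differ by one day because the standard calendar inserts 29 February 2000. Any computation that divides by a time step, such as a derivative, would pick up that discrepancy silently if no-leap dates were coerced into ``np.datetime64[ns]``.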
|
||
.. _Timestamp-valid range: https://pandas.pydata.org/pandas-docs/stable/timeseries.html#timestamp-limitations | ||
.. _ISO 8601-format: https://en.wikipedia.org/wiki/ISO_8601 | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -33,6 +33,12 @@ v0.11.0 (unreleased) | |
Breaking changes | ||
~~~~~~~~~~~~~~~~ | ||
|
||
- The option ``enable_cftimeindex`` has now been set to ``True`` by default. | ||
Review comment: We should think about eventually removing this option entirely, and making it simply a "no-op" for now (raising FutureWarning). In the long term I think we would prefer to avoid keeping around extraneous global options.
Reply: Makes sense -- that will clean up the datetime decoding logic (and tests) quite a bit. |
||
This means that by default any dates encoded using a non-standard calendar | ||
will be decoded into objects of type :py:class:`cftime.datetime`, regardless | ||
of whether or not it might be possible to coerce them into | ||
``np.datetime64[ns]`` objects. One can explicitly set the option to | ||
``False`` to restore the old behavior. | ||
- ``Dataset.T`` has been removed as a shortcut for :py:meth:`Dataset.transpose`. | ||
Call :py:meth:`Dataset.transpose` directly instead. | ||
- Iterating over a ``Dataset`` now includes only data variables, not coordinates. | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -10,7 +10,7 @@ | |
OPTIONS = { | ||
DISPLAY_WIDTH: 80, | ||
ARITHMETIC_JOIN: 'inner', | ||
ENABLE_CFTIMEINDEX: False, | ||
ENABLE_CFTIMEINDEX: True, | ||
FILE_CACHE_MAXSIZE: 128, | ||
CMAP_SEQUENTIAL: 'viridis', | ||
CMAP_DIVERGENT: 'RdBu_r', | ||
|
@@ -52,7 +52,7 @@ class set_options(object): | |
Default: ``'inner'``. | ||
- ``enable_cftimeindex``: flag to enable using a ``CFTimeIndex`` | ||
for time indexes with non-standard calendars or dates outside the | ||
Timestamp-valid range. Default: ``False``. | ||
Timestamp-valid range. Default: ``True``. | ||
Review comment: I would remove this now -- no need to document a deprecated/unused option. |
||
- ``file_cache_maxsize``: maximum number of open files to hold in xarray's | ||
global least-recently-used cache. This should be smaller than your | ||
system's per-process file descriptor limit, e.g., ``ulimit -n`` on Linux. | ||
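The review suggestion to turn ``enable_cftimeindex`` into a no-op that raises a ``FutureWarning`` could look roughly like the sketch below. It is a simplified stand-in and not xarray's actual ``set_options`` implementation: the reduced ``OPTIONS`` dict, the warning text, and the validation logic are assumptions made for illustration.

```python
import warnings

# Reduced stand-in for the module-level options dict shown in the diff.
OPTIONS = {"display_width": 80, "enable_cftimeindex": True}


class set_options:
    """Set options globally, or temporarily when used in a ``with`` block."""

    def __init__(self, **kwargs):
        unknown = set(kwargs) - set(OPTIONS)
        if unknown:
            raise ValueError("unknown option(s): %s" % sorted(unknown))
        kwargs = dict(kwargs)
        if "enable_cftimeindex" in kwargs:
            # Deprecated option: warn and ignore the requested value.
            warnings.warn(
                "enable_cftimeindex is deprecated and has no effect",
                FutureWarning,
                stacklevel=2,
            )
            kwargs.pop("enable_cftimeindex")
        # Remember the previous values so __exit__ can restore them.
        self.old = {key: OPTIONS[key] for key in kwargs}
        OPTIONS.update(kwargs)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        OPTIONS.update(self.old)


with set_options(display_width=100):
    print(OPTIONS["display_width"])
print(OPTIONS["display_width"])
```

Applying the change in ``__init__`` rather than ``__enter__`` lets the same call work both as a plain statement (a permanent change) and as a context manager (a temporary one).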
|
Review comment: Can we clarify "non-standard calendars" -> "non-standard calendars commonly used in climate science"
This paragraph is missing a bit of context for non-climate users :)