Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

xr.Dataset.map drops attrs of DataArray #3595

Closed
mathause opened this issue Dec 4, 2019 · 6 comments · Fixed by #4195
Closed

xr.Dataset.map drops attrs of DataArray #3595

mathause opened this issue Dec 4, 2019 · 6 comments · Fixed by #4195

Comments

@mathause
Copy link
Collaborator

mathause commented Dec 4, 2019

MCVE Code Sample

import xarray as xr
import numpy as np

ds = xr.DataArray([1, 2], attrs=dict(tst="DataArray")).to_dataset(name="data")
ds.attrs["tst"] = "Dataset"
ds.map(np.mean, keep_attrs=True).data

returns

<xarray.DataArray 'data' ()>
array(1.5)

Expected Output

<xarray.DataArray 'data' ()>
array(1.5)
Attributes:
    tst:      DataArray

Problem Description

Applying xr.Dataset.map(..., keep_attrs=True) does not retain the attributes of the DataArrays. Should it?

In constrast ds.mean(keep_attrs=True) retains DataArray-level attrs.

EDIT: corrected example

Output of xr.show_versions()

INSTALLED VERSIONS

commit: bd4f048
python: 3.7.3 | packaged by conda-forge | (default, Jul 1 2019, 21:52:21)
[GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 4.12.14-lp151.28.32-default
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_GB.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.5
libnetcdf: 4.6.2

xarray: 0.14.0+117.gbd4f048b.dirty
pandas: 0.25.2
numpy: 1.17.3
scipy: 1.3.1
netCDF4: 1.5.1.2
pydap: installed
h5netcdf: 0.7.4
h5py: 2.10.0
Nio: 1.5.5
zarr: 2.3.2
cftime: 1.0.4.2
nc_time_axis: 1.2.0
PseudoNetCDF: installed
rasterio: 1.1.0
cfgrib: 0.9.7.2
iris: 2.2.0
bottleneck: 1.2.1
dask: 2.6.0
distributed: 2.6.0
matplotlib: 3.1.1
cartopy: 0.17.0
seaborn: 0.9.0
numbagg: installed
setuptools: 41.6.0.post20191029
pip: 19.3.1
conda: None
pytest: 5.2.2
IPython: 7.9.0
sphinx: None

@dcherian
Copy link
Contributor

dcherian commented Dec 4, 2019

Looking at the code, keep_attrs controls whether the dataset attrs are kept, not the invidual variable attrs.

xarray/xarray/core/dataset.py

Lines 4233 to 4240 in 577d3a7

variables = {
k: maybe_wrap_array(v, func(v, *args, **kwargs))
for k, v in self.data_vars.items()
}
if keep_attrs is None:
keep_attrs = _get_keep_attrs(default=False)
attrs = self.attrs if keep_attrs else None
return type(self)(variables, attrs=attrs)

This may be fixable by threading keep_attrs down through maybe_wrap_array and then ultimately to Variable.__array_wrap__ when you can choose to keep attrs if keep_attrs is True.

@mathause
Copy link
Collaborator Author

mathause commented Dec 4, 2019

Yes, it works for the Dataset-level attrs and the docstring also specifies that:

keep_attrs : bool, optional
    If True, the dataset's attributes (`attrs`) will be copied from
    the original object to the new one. If False, the new object will
    be returned without attributes.

So the question is more if this is the expected behaviour?

@mathause
Copy link
Collaborator Author

mathause commented Dec 4, 2019

Sorry, read before you edited - I can try to add a PR if you agree to change this behavior.

@dcherian
Copy link
Contributor

dcherian commented Dec 4, 2019

I think that it's sensible that keep_attrs controls whether individual variable attrs are kept after mapping a function on each variable.

This is a behaviour change though. I'm not sure how to deal with that.

cc @max-sixty

@max-sixty
Copy link
Collaborator

(forgive me, am on vacation)

I actually didn't know about the multiple levels of attrs. Given that does exist, I agree with @dcherian re keep_attrs impacting individual variable attrs too.

While it is a behavior change, I think it's fine: the current behavior is somewhere between a bug and an undocumented artefact. Unless others see a more material change here?

@TomNicholas
Copy link
Member

TomNicholas commented Dec 11, 2019

When I wrote the (set_options) keep_attrs switch in then I intended it to function as a global switch, so ensuring that individual DataArray attrs are also propagated fits with that intention. I just apparently never thought of this case.

dcherian added a commit to dcherian/xarray that referenced this issue Jul 2, 2020
dcherian added a commit to dcherian/xarray that referenced this issue Aug 15, 2020
dcherian added a commit that referenced this issue Oct 14, 2020
* Propagate attrs with unary, binary functions

Closes #3490
Closes #4065
Closes #3433
Closes #3595

* Un xfail test

* bugfix

* Some progress. Still need keep_attrs in DataArray._unary_op

* Fix dataset attrs

* whats-new

* small fix

* Fix imag, real

* fix variable tests

* fix multiple return variables.

* review comments

* Update doc/whats-new.rst

* Propagate attrs with DataArray unary ops

* More tests

* Small cleanup

* Review comments.

* Fix duplication

Co-authored-by: Maximilian Roos <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants