loc API gives KeyError: "not all values found in index" #3546

Open
roxyboy opened this issue Nov 19, 2019 · 4 comments

@roxyboy

roxyboy commented Nov 19, 2019

I am having trouble with xarray's .loc API when trying to select data that satisfy a condition. Here is a reproducible example:

import xarray as xr
ds = xr.tutorial.load_dataset('air_temperature')
da = ds.air
mask = 2.3e2*xr.ones_like(da[0,0])
da[0,0].loc[da[0,0]<mask]

but this gives the following error:

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-238-75b6b6f544d4> in <module>
      1 da = ds.air
      2 mask = 2.3e2*xr.ones_like(da[0,0])
----> 3 da[0,0].loc[da[0,0]<mask]

~/miniconda3/envs/ensemble/lib/python3.7/site-packages/xarray/core/dataarray.py in __getitem__(self, key)
    194             labels = indexing.expanded_indexer(key, self.data_array.ndim)
    195             key = dict(zip(self.data_array.dims, labels))
--> 196         return self.data_array.sel(**key)
    197 
    198     def __setitem__(self, key, value) -> None:

~/miniconda3/envs/ensemble/lib/python3.7/site-packages/xarray/core/dataarray.py in sel(self, indexers, method, tolerance, drop, **indexers_kwargs)
   1045             method=method,
   1046             tolerance=tolerance,
-> 1047             **indexers_kwargs
   1048         )
   1049         return self._from_temp_dataset(ds)

~/miniconda3/envs/ensemble/lib/python3.7/site-packages/xarray/core/dataset.py in sel(self, indexers, method, tolerance, drop, **indexers_kwargs)
   1998         indexers = either_dict_or_kwargs(indexers, indexers_kwargs, "sel")
   1999         pos_indexers, new_indexes = remap_label_indexers(
-> 2000             self, indexers=indexers, method=method, tolerance=tolerance
   2001         )
   2002         result = self.isel(indexers=pos_indexers, drop=drop)

~/miniconda3/envs/ensemble/lib/python3.7/site-packages/xarray/core/coordinates.py in remap_label_indexers(obj, indexers, method, tolerance, **indexers_kwargs)
    390 
    391     pos_indexers, new_indexes = indexing.remap_label_indexers(
--> 392         obj, v_indexers, method=method, tolerance=tolerance
    393     )
    394     # attach indexer's coordinate to pos_indexers

~/miniconda3/envs/ensemble/lib/python3.7/site-packages/xarray/core/indexing.py in remap_label_indexers(data_obj, indexers, method, tolerance)
    259             coords_dtype = data_obj.coords[dim].dtype
    260             label = maybe_cast_to_coords_dtype(label, coords_dtype)
--> 261             idxr, new_idx = convert_label_indexer(index, label, dim, method, tolerance)
    262             pos_indexers[dim] = idxr
    263             if new_idx is not None:

~/miniconda3/envs/ensemble/lib/python3.7/site-packages/xarray/core/indexing.py in convert_label_indexer(index, label, index_name, method, tolerance)
    191             indexer = get_indexer_nd(index, label, method, tolerance)
    192             if np.any(indexer < 0):
--> 193                 raise KeyError("not all values found in index %r" % index_name)
    194     return indexer, new_index
    195 

KeyError: "not all values found in index 'lat'"

Since the mask has the same dimensions and coordinates as da[0,0], this error does not seem to make sense. If I replace the coordinates with integer ranges in the following manner:

da = xr.DataArray(ds.air.data, dims=ds.air.dims, 
                  coords={'time':range(len(ds.time)),'lat':range(len(ds.lat)),'lon':range(len(ds.lon))})
mask = da[0,0] < 2.3e2*xr.ones_like(da[0,0])
da[0,0].loc[mask]

it works:

<xarray.DataArray (lon: 5)>
array([228.79999, 227.29999, 227.     , 227.5    , 228.79999],
      dtype=float32)
Coordinates:
    time     int64 0
    lat      int64 0
  * lon      (lon) int64 44 45 46 47 48

Am I missing something or is this possibly a bug? Thank you in advance for your help.

@mathause
Collaborator

I think .loc expects the actual lon values you want to select rather than a boolean array. To select with a boolean array you would do:

sel = da[0, 0] < mask
da[0, 0][sel]

If you want to use .loc you first need to get the longitude values to select by:

sel_lon = da[0, 0].lon[sel]
da[0, 0].loc[sel_lon]
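
The two approaches above can be compared on a small synthetic array, which avoids downloading the tutorial dataset. This is only a sketch: the array `row`, its values, and the 230.0 threshold are made up for illustration and stand in for `da[0, 0]` and `mask`.

```python
import numpy as np
import xarray as xr

# Hypothetical 1-D stand-in for da[0, 0]: values along a float "lon" coordinate.
row = xr.DataArray(
    np.array([228.8, 231.5, 227.0, 240.1], dtype="float32"),
    dims="lon",
    coords={"lon": [200.0, 202.5, 205.0, 207.5]},
)

# Boolean DataArray along "lon"; True where the value is below the threshold.
sel = row < 230.0

# Positional selection: index directly with the boolean mask.
by_position = row[sel]

# Label-based selection: look up the lon labels where the mask is True,
# then pass those labels (not the booleans) to .loc.
sel_lon = row.lon[sel]
by_label = row.loc[sel_lon]

# Both keep the entries at lon 200.0 and 205.0.
print(by_position.lon.values, by_label.lon.values)
```

Both routes return the same two entries; the only difference is whether the key is a mask over positions or a set of coordinate labels.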

@roxyboy
Author

roxyboy commented Nov 19, 2019

Thanks @mathause, your example works :) This behaviour, however, seems to be slightly different from the .loc API of pandas.DataFrame which can take boolean arrays for selection. Is there a reason for the discrepancy?

@fujiisoup
Member

> This behaviour, however, seems to be slightly different from the .loc API of pandas.DataFrame which can take boolean arrays for selection. Is there a reason for the discrepancy?

Hi, @roxyboy

This is simply because multidimensional boolean indexing is not yet implemented in xarray (#1887).
One-dimensional boolean indexing does work with .loc:

In [2]: da = xr.DataArray([0, 1, 2], dims=['x'])                                

In [3]: da.loc[da < 1]                                                          
Out[3]: 
<xarray.DataArray (x: 1)>
array([0])
Dimensions without coordinates: x

FYI, in xarray the .sel and .isel methods are probably more convenient than .loc, because they take dimension names and we don't need to remember the dimension order.
For my example above, I would write

da.isel(x=da < 1)

instead of da.loc[da < 1].
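
To illustrate the point about dimension order, here is a minimal sketch on a two-dimensional array. The array `grid`, its dimension names `y` and `x`, and the mask `keep` are hypothetical and chosen only for this example.

```python
import xarray as xr

# Hypothetical 2-D array with dimensions ("y", "x").
grid = xr.DataArray([[0, 1, 2], [3, 4, 5]], dims=["y", "x"])

# 1-D boolean mask along "x", built from the first row.
keep = grid.isel(y=0) < 2

# The dimension is named explicitly, so its axis position does not matter.
picked = grid.isel(x=keep)

# The same call works unchanged on a transposed copy of the array.
picked_t = grid.transpose("x", "y").isel(x=keep)

print(picked.values)    # columns 0 and 1 of each row
print(picked_t.dims)    # ('x', 'y')
```

With positional indexing or .loc, the mask would have to be placed in the slot that matches the axis order of each array; with .isel the keyword carries that information.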

@roxyboy
Author

roxyboy commented Nov 19, 2019

It is still somewhat unsatisfying that my initial example fails. Although da is a three-dimensional array, da[0,0] is one-dimensional, so da[0,0].loc should reduce to one-dimensional indexing, yet it still gives the KeyError...
