failed to join/concatenate datasets for model CESM2-FV2 #66
Thank you for using cmip6_preprocessing and bringing up this issue. I can reproduce this, but I am not sure yet why it is happening. I am currently having some issues with the Pangeo cloud, but I will follow up later.
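For reference, the snippets in the comments below assume a catalog object `cat`, zarr kwargs `z_kwargs`, and the `combined_preprocessing` function that are not defined in the thread itself. A minimal sketch of how they were presumably set up, with the search facets inferred from the group key `CMIP.NCAR.CESM2-FV2.historical.Omon.gn` in the traceback further down:

```python
# Sketch only: catalog setup assumed by the snippets in this thread.
import intake
import xarray as xr
from cmip6_preprocessing.preprocessing import combined_preprocessing

# Pangeo cloud CMIP6 intake-esm catalog
col = intake.open_esm_datastore("https://storage.googleapis.com/cmip6/pangeo-cmip6.json")

# Facets inferred from the failing group key CMIP.NCAR.CESM2-FV2.historical.Omon.gn
cat = col.search(
    source_id="CESM2-FV2",
    experiment_id="historical",
    table_id="Omon",
    variable_id="tos",
    grid_label="gn",
)

z_kwargs = {"consolidated": True, "decode_times": False}
```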
OK, I think I have a better idea of what is happening. If I do not concatenate along the `member_id` dimension:

```python
tos_dict_pp = cat.to_dataset_dict(
    zarr_kwargs=z_kwargs,
    preprocess=combined_preprocessing,
    aggregate=False,
)
```

and then try to combine the datasets manually:

```python
# try to manually concat preprocessed datasets
xr.concat([a for a in tos_dict_pp.values()], dim='member_id')
```

the error is more informative:
@andersy005 I think in general it would be nice to have a way to display these errors from within intake-esm. Is that possible currently?

The specific problem here is that the lon/lat values in this particular model contain very high values (which should be masked out)! If we load the model without preprocessing and look at the latitude field:

```python
tos_dict = cat.to_dataset_dict(
    zarr_kwargs=z_kwargs,
    preprocess=None,
    aggregate=False,
)
list(tos_dict.values())[0].lat.plot(vmax=200)
```

the land points have values of 1e37. In the newest version this should be fixed here.
EDIT: I actually tried it with the newest version and it does not work, so this is a bug.
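For illustration only (this is not the actual cmip6_preprocessing fix), a minimal sketch of how such out-of-range fill values in the coordinates could be masked before concatenating; the helper name `mask_bad_coords` is hypothetical:

```python
import numpy as np
import xarray as xr

def mask_bad_coords(ds: xr.Dataset) -> xr.Dataset:
    """Replace physically impossible lon/lat values (e.g. the 1e37 land fill
    seen above) with NaN. Assumes lon/lat are (possibly 2D) coordinates."""
    new_coords = {}
    if "lat" in ds.coords:
        new_coords["lat"] = ds["lat"].where(np.abs(ds["lat"]) <= 90)
    if "lon" in ds.coords:
        new_coords["lon"] = ds["lon"].where(np.abs(ds["lon"]) <= 360)
    return ds.assign_coords(new_coords)

# e.g. applied to each member before the manual concat shown earlier:
# masked = [mask_bad_coords(ds) for ds in tos_dict_pp.values()]
```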
@jbusecke I get:
That is an older version, but I just tested with the current version from GitHub and the bug is still there. I'll see how I can fix this, and once it is resolved I'll release a new version, which you can then upgrade to with:
Thank you!
We are already doing this. It's just that @sckw didn't post the first two-thirds of the traceback, which includes the underlying `ValueError`:

```
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
~/devel/intake/intake-esm/intake_esm/merge_util.py in join_new(dsets, dim_name, coord_value, varname, options, group_key)
55 concat_dim = xr.DataArray(coord_value, dims=(dim_name), name=dim_name)
---> 56 return xr.concat(dsets, dim=concat_dim, data_vars=varname, **options)
57 except Exception as exc:
~/opt/miniconda3/envs/intake-esm-dev/lib/python3.8/site-packages/xarray/core/concat.py in concat(objs, dim, data_vars, coords, compat, positions, fill_value, join, combine_attrs)
190 )
--> 191 return f(
192 objs, dim, data_vars, coords, compat, positions, fill_value, join, combine_attrs
~/opt/miniconda3/envs/intake-esm-dev/lib/python3.8/site-packages/xarray/core/concat.py in _dataset_concat(datasets, dim, data_vars, coords, compat, positions, fill_value, join, combine_attrs)
382 datasets = [ds.copy() for ds in datasets]
--> 383 datasets = align(
384 *datasets, join=join, copy=False, exclude=[dim], fill_value=fill_value
~/opt/miniconda3/envs/intake-esm-dev/lib/python3.8/site-packages/xarray/core/alignment.py in align(join, copy, indexes, exclude, fill_value, *objects)
339 else:
--> 340 new_obj = obj.reindex(copy=copy, fill_value=fill_value, **valid_indexers)
341 new_obj.encoding = obj.encoding
~/opt/miniconda3/envs/intake-esm-dev/lib/python3.8/site-packages/xarray/core/dataset.py in reindex(self, indexers, method, tolerance, copy, fill_value, **indexers_kwargs)
2545 """
-> 2546 return self._reindex(
2547 indexers,
~/opt/miniconda3/envs/intake-esm-dev/lib/python3.8/site-packages/xarray/core/dataset.py in _reindex(self, indexers, method, tolerance, copy, fill_value, sparse, **indexers_kwargs)
2574
-> 2575 variables, indexes = alignment.reindex_variables(
2576 self.variables,
~/opt/miniconda3/envs/intake-esm-dev/lib/python3.8/site-packages/xarray/core/alignment.py in reindex_variables(variables, sizes, indexes, indexers, method, tolerance, copy, fill_value, sparse)
549 if not index.is_unique:
--> 550 raise ValueError(
551 "cannot reindex or align along dimension %r because the "
ValueError: cannot reindex or align along dimension 'y' because the index has duplicate values
The above exception was the direct cause of the following exception:
AggregationError Traceback (most recent call last)
<ipython-input-4-84f1708f03d1> in <module>
4 print(cat.df['source_id'].unique())
5 z_kwargs = {'consolidated': True, 'decode_times':False}
----> 6 tos_dict = cat.to_dataset_dict(zarr_kwargs=z_kwargs, preprocess=combined_preprocessing)
~/devel/intake/intake-esm/intake_esm/core.py in to_dataset_dict(self, zarr_kwargs, cdf_kwargs, preprocess, storage_options, progressbar, aggregate)
925 ]
926 for i, task in enumerate(concurrent.futures.as_completed(future_tasks)):
--> 927 key, ds = task.result()
928 self._datasets[key] = ds
929 if self.progressbar:
~/opt/miniconda3/envs/intake-esm-dev/lib/python3.8/concurrent/futures/_base.py in result(self, timeout)
430 raise CancelledError()
431 elif self._state == FINISHED:
--> 432 return self.__get_result()
433
434 self._condition.wait(timeout)
~/opt/miniconda3/envs/intake-esm-dev/lib/python3.8/concurrent/futures/_base.py in __get_result(self)
386 def __get_result(self):
387 if self._exception:
--> 388 raise self._exception
389 else:
390 return self._result
~/opt/miniconda3/envs/intake-esm-dev/lib/python3.8/concurrent/futures/thread.py in run(self)
55
56 try:
---> 57 result = self.fn(*self.args, **self.kwargs)
58 except BaseException as exc:
59 self.future.set_exception(exc)
~/devel/intake/intake-esm/intake_esm/core.py in _load_source(key, source)
911
912 def _load_source(key, source):
--> 913 return key, source.to_dask()
914
915 sources = {key: source(**source_kwargs) for key, source in self.items()}
~/devel/intake/intake-esm/intake_esm/source.py in to_dask(self)
244 def to_dask(self):
245 """Return xarray object (which will have chunks)"""
--> 246 self._load_metadata()
247 return self._ds
248
~/opt/miniconda3/envs/intake-esm-dev/lib/python3.8/site-packages/intake/source/base.py in _load_metadata(self)
124 """load metadata only if needed"""
125 if self._schema is None:
--> 126 self._schema = self._get_schema()
127 self.datashape = self._schema.datashape
128 self.dtype = self._schema.dtype
~/devel/intake/intake-esm/intake_esm/source.py in _get_schema(self)
173
174 if self._ds is None:
--> 175 self._open_dataset()
176
177 metadata = {
~/devel/intake/intake-esm/intake_esm/source.py in _open_dataset(self)
230 n_agg = len(self.aggregation_columns)
231
--> 232 ds = _aggregate(
233 self.aggregation_dict,
234 self.aggregation_columns,
~/devel/intake/intake-esm/intake_esm/merge_util.py in _aggregate(aggregation_dict, agg_columns, n_agg, nd, mapper_dict, group_key)
238 return ds
239
--> 240 return apply_aggregation(nd)
241
242
~/devel/intake/intake-esm/intake_esm/merge_util.py in apply_aggregation(nd, agg_column, key, level)
194 agg_options = {}
195
--> 196 dsets = [
197 apply_aggregation(value, agg_column, key=key, level=level + 1)
198 for key, value in nd.items()
~/devel/intake/intake-esm/intake_esm/merge_util.py in <listcomp>(.0)
195
196 dsets = [
--> 197 apply_aggregation(value, agg_column, key=key, level=level + 1)
198 for key, value in nd.items()
199 ]
~/devel/intake/intake-esm/intake_esm/merge_util.py in apply_aggregation(nd, agg_column, key, level)
216 if agg_type == 'join_new':
217 varname = dsets[0].attrs['intake_esm_varname']
--> 218 ds = join_new(
219 dsets,
220 dim_name=agg_column,
~/devel/intake/intake-esm/intake_esm/merge_util.py in join_new(dsets, dim_name, coord_value, varname, options, group_key)
69 """
70
---> 71 raise AggregationError(message) from exc
72
73
AggregationError:
Failed to join/concatenate datasets in group with key=CMIP.NCAR.CESM2-FV2.historical.Omon.gn along a new dimension `member_id`.
*** Arguments passed to xarray.concat() ***:
- objs: a list of 3 datasets
- dim: <xarray.DataArray 'member_id' (member_id: 3)>
array(['r1i1p1f1', 'r2i1p1f1', 'r3i1p1f1'], dtype='<U8')
Dimensions without coordinates: member_id
- data_vars: ['tos']
- and kwargs: {'coords': 'minimal', 'compat': 'override'}
********************************************
```
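A quick way to confirm the duplicate-index diagnosis from the `ValueError` above, assuming `tos_dict_pp` is the dict of unaggregated, preprocessed datasets from the earlier comment:

```python
# Check whether the `y` index of each member dataset is unique before concatenating.
for key, ds in tos_dict_pp.items():
    y_index = ds.indexes["y"]
    print(key, "unique:", y_index.is_unique, "| duplicates:", int(y_index.duplicated().sum()))
```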
I use the following code to load CESM2-FV2, and I get the following error:
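Judging from the input cell shown in the traceback above, that code was presumably along these lines (a reconstruction based on the traceback, not the verbatim snippet):

```python
print(cat.df['source_id'].unique())
z_kwargs = {'consolidated': True, 'decode_times': False}
tos_dict = cat.to_dataset_dict(zarr_kwargs=z_kwargs, preprocess=combined_preprocessing)
# ...which raises the AggregationError reproduced in the comment above.
```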