Set _FillValue #181
@martindurant, I thought we wanted to set |
The code linked in the issue is what I was going on |
@rabernat to choose: when creating the zarr output metadata, fill_value overwrites _FillValue, or the reverse? |
as far as I understand #177 (comment), we should always ignore the HDF5 dataset fill value ( |
@keewis, so we should only use it when _FillValue is not available? And what if the dataset is not netCDF, but regular HDF? |
according to https://docs.unidata.ucar.edu/netcdf-c/current/attribute_conventions.html:
So that means you'd use the default fill value defined by the netcdf4 library if
For HDF5 datasets it seems the correct thing to do is what
Re-reading the comments in #177, I get the impression that @ajelenak and @rabernat actually disagree on what to do (I might very well be misunderstanding something, though): @ajelenak would prefer to expose both |
It's not a question of what I would like or not like. My goal was to communicate how Xarray handles fill_value with Zarr.
Creating a kerchunk Zarr is analogous to Xarray writing a Zarr store. So I recommend you follow the same pattern here. If |
@martindurant, as I understand it, the new plan is to modify the PR so that in the Kerchunk JSON we:
Correct? |
To follow @rabernat's recommendation in kerchunk and still keep supporting HDF datasets, I think we need a slight modification of that:
In any case, @martindurant: I think this PR would fix #105 as well |
I'm not sure this is a thing: all netCDF4 files are valid HDF5, but not vice versa; plus you can reasonably store netCDF datasets as parts of an HDF5 at some subpath. |
well, the point is that netcdf4 does not use or set the HDF5 fill value, as far as I can tell, so we need to figure out which fill value to use with a particular dataset. That "detection" does not have to happen in code, though, it could just as well be a parameter to |
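One way to read the parameter idea above, as a rough sketch only: the caller states whether the file should be treated as netCDF4, and the fill value is chosen accordingly. The function name `pick_fill_value` and the flag `treat_as_netcdf4` are hypothetical and not part of kerchunk. Returning `None` for a netCDF4 variable without `_FillValue` is an assumption made here for illustration; the netCDF library default fill value is deliberately left out of scope.

```python
def pick_fill_value(attrs, hdf5_fillvalue, treat_as_netcdf4):
    """Choose the value to record as the Zarr ``fill_value`` (hypothetical helper).

    attrs: mapping of the variable's attributes
    hdf5_fillvalue: fill value stored on the HDF5 dataset itself
    treat_as_netcdf4: caller-supplied flag, no detection happens here
    """
    # An explicit _FillValue attribute always wins.
    if "_FillValue" in attrs:
        return attrs["_FillValue"]
    # Assumption: for netCDF4 files the HDF5-level fill value is ignored.
    if treat_as_netcdf4:
        return None
    # Plain HDF5: fall back to the dataset's own fill value.
    return hdf5_fillvalue
```

For example, `pick_fill_value({}, 0.0, treat_as_netcdf4=False)` falls through to the HDF5 value, while the same call with `treat_as_netcdf4=True` yields `None`.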
@keewis I see your point -- if the HDF5 file is generated with NetCDF4, it will have
Is there a fast/definitive way to tell if an HDF5 file was generated by NetCDF4? |
How about this version, then? Would someone like to test? |
It is probably not definitive, but https://github.com/pedro-vicente/netcdf-detect uses some internal attributes that the netcdf library sets to detect that. It is written in C++, but it should be possible to do the same using |
@keewis, so basically, check for:
|
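The list of attributes in this comment did not survive, so the following is only a sketch of the general idea, not what was actually proposed. It assumes the hidden attributes that the netCDF-C library is known to write into HDF5 files (`_NCProperties` and `_nc3_strict` on the root group, `_Netcdf4Dimid` and `_Netcdf4Coordinates` on datasets). The function name `looks_like_netcdf4` is hypothetical. The check is a heuristic: files written by older library versions may lack `_NCProperties`, so a negative result is not conclusive.

```python
# Hidden attributes written by the netCDF-C library (assumed list, not exhaustive).
NETCDF4_ROOT_MARKERS = ("_NCProperties", "_nc3_strict")
NETCDF4_DATASET_MARKERS = ("_Netcdf4Dimid", "_Netcdf4Coordinates")


def looks_like_netcdf4(root_attrs, dataset_attrs=()):
    """Heuristic: True if any netCDF-4 marker attribute is present.

    root_attrs: mapping of root-group attributes
    dataset_attrs: iterable of per-dataset attribute mappings
    """
    # A marker on the root group is the cheapest thing to check.
    if any(name in root_attrs for name in NETCDF4_ROOT_MARKERS):
        return True
    # Otherwise look for per-dataset markers.
    return any(
        name in attrs
        for attrs in dataset_attrs
        for name in NETCDF4_DATASET_MARKERS
    )
```

The function only needs objects that support `in`, so plain dicts work for testing, and with `h5py` the `.attrs` of the file and of each dataset could be passed in directly.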
Fixes #177