Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Import CFA from cfdm #842

Merged
merged 60 commits into from
Feb 25, 2025
Merged

Import CFA from cfdm #842

merged 60 commits into from
Feb 25, 2025

Conversation

davidhassell
Copy link
Collaborator

Fixes #841

Remove core CFA functionality, importing it from cfdm instead.

Requires NCAS-CMS/cfdm#319 from cfdm.

Aims to not change the API, other than as required by the new cfdm.

@davidhassell davidhassell added enhancement New feature or request aggregation Rerlating to metadata-based field and domain aggregation netCDF read Relating to reading netCDF datasets netCDF write Relating to writing netCDF datasets labels Jan 29, 2025
@davidhassell davidhassell added this to the NEXT VERSION milestone Jan 29, 2025
@sadielbartholomew
Copy link
Member

Just resolved a trivial (changelog items) merge conflict on the branch, so that I can finalise my review without that muddying the waters.

Copy link
Member

@sadielbartholomew sadielbartholomew left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All good, with only minor comments and suggestions raised.

Further, one general issue I noticed - though it isn't clear if it relates to the PR duo moving CFA across to cfdm, plausibly it came through from the active storage PRs a while back - is that there's some environment issue with h5netcdf relating to reading for URLS, we require h5py>=3.10.0 which I would have thought would be acceptable for this but it seems to insist on using h5pyd such that I see this error in the test suite:

======================================================================
ERROR: test_read_url (__main__.read_writeTest.test_read_url)
Test reading urls.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/slb93/git-repos/cf-python/cf/test/test_read_write.py", line 866, in test_read_url
    f = cf.read(remote)
        ^^^^^^^^^^^^^^^
  File "/home/slb93/git-repos/cfdm/cfdm/decorators.py", line 171, in verbose_override_wrapper
    return method_with_verbose_kwarg(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/slb93/git-repos/cf-python/cf/read_write/read.py", line 777, in __new__
    raise DatasetTypeError(f"\n{error}")
cfdm.read_write.exceptions.DatasetTypeError: 
Can't interpret http:///psl.noaa.gov/thredds/dodsC/Datasets/cru/crutem5/Monthlies/air.mon.anom.nobs.nc as a netCDF dataset with any of the netCDF backends ('h5netcdf', 'netCDF4'):

h5netcdf:
ImportError: No module named 'h5pyd'. h5pyd is required for opening urls: http:///psl.noaa.gov/thredds/dodsC/Datasets/cru/crutem5/Monthlies/air.mon.anom.nobs.nc

netCDF4:
OSError: [Errno -68] NetCDF: I/O failure: 'http:///psl.noaa.gov/thredds/dodsC/Datasets/cru/crutem5/Monthlies/air.mon.anom.nobs.nc'
Can't interpret http:///psl.noaa.gov/thredds/dodsC/Datasets/cru/crutem5/Monthlies/air.mon.anom.nobs.nc as a PP or UM dataset

Do you see this? Any ideas? (Happy to push it to a separate issue, though.)

Comment on lines +2749 to +2752
def dirname(path, normalise=False, uri=None, isdir=False, sep=False):
return cfdm.dirname(
path, normalise=normalise, uri=uri, isdir=isdir, sep=sep
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you consider using functools.partial (which if you take my suggestion would naturally need to be imported in the module), like so? It is more concise and direct than wrapping (though I'm not sure if our docstring rewriter would work for it, maybe that was a reason not to use it?).

Suggested change
def dirname(path, normalise=False, uri=None, isdir=False, sep=False):
return cfdm.dirname(
path, normalise=normalise, uri=uri, isdir=isdir, sep=sep
)
dirname = partial(cfdm.dirname, normalise=False, uri=None, isdir=False, sep=False)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice idea, but I don't it will work in this context. The best we could do is dirname = partial(cfdm.dirname) (if we specify the kwargs as None we can't set them positionally), and in that case the signature and name is hidden (as functools.partial(<function dirname at 0x7f8a6c2c4e00>)).

Better still would be to delete it altogether :). It's not used any more! We could deprecate it with a "use cfdm.dirname instead message". One for another PR, perhaps.

Comment on lines +2698 to +2699
def abspath(path, uri=None):
return cfdm.abspath(path, uri=uri)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comment below RE possible use of functools.partial, in this case:

Suggested change
def abspath(path, uri=None):
return cfdm.abspath(path, uri=uri)
abspath = partial(cfdm.abspath, uri=None)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comment #842 (comment)

davidhassell and others added 3 commits February 24, 2025 07:37
Co-authored-by: Sadie L. Bartholomew <[email protected]>
Co-authored-by: Sadie L. Bartholomew <[email protected]>
Co-authored-by: Sadie L. Bartholomew <[email protected]>
@davidhassell
Copy link
Collaborator Author

here's some environment issue with h5netcdf relating to reading for URLS, we require h5py>=3.10.0

we bumped cfdm to h5py>=3.12.0, and should do so here, too (d800005). Does that solve the error you were finding?

@sadielbartholomew
Copy link
Member

we bumped cfdm to h5py>=3.12.0, and should do so here, too (d800005). Does that solve the error you were finding?

I was already using 3.12.0 so that's the error emerging from that. I tried updating to 3.13.0 they released last week to see if that fixed this, but it didn't. Do you see the same error? Shall we bump this to another issue?

@davidhassell
Copy link
Collaborator Author

Oh! I'm using 3.12.1, and it all works fine for me ...

>>> import cf
>>> cf.environment(paths=False)
Platform: Linux-5.15.0-131-generic-x86_64-with-glibc2.35
HDF5 library: 1.14.2
netcdf library: 4.9.4-development
udunits2 library: libudunits2.so.0
esmpy/ESMF: 8.7.0
Python: 3.12.8
dask: 2024.7.1
netCDF4: 1.7.2
h5netcdf: 1.5.0
h5py: 3.12.1
s3fs: 2024.12.0
psutil: 6.1.1
packaging: 24.1
numpy: 1.26.4
scipy: 1.15.1
matplotlib: 3.10.0
cftime: 1.6.4.post1
cfunits: 3.3.7
cfplot: not available
cfdm: 1.12.0.0
cf: 3.16.3

@sadielbartholomew
Copy link
Member

sadielbartholomew commented Feb 24, 2025

Thanks for the clarification. OK, so I've investigated the environment. Mine, when seeing the error, was (ignore the paths, this is from the test suite print-out):

Run date: 2025-02-24 12:56:56.118878
Platform: Linux-6.6.65-1-MANJARO-x86_64-with-glibc2.40 
HDF5 library: 1.14.2 
netcdf library: 4.9.3-development 
udunits2 library: /home/slb93/miniconda3/envs/cf-env-312/lib/libudunits2.so.0 
esmpy/ESMF: 8.4.2 /home/slb93/miniconda3/envs/cf-env-312/lib/python3.12/site-packages/esmpy/__init__.py
Python: 3.12.0 /home/slb93/miniconda3/envs/cf-env-312/bin/python
dask: 2024.7.1 /home/slb93/miniconda3/envs/cf-env-312/lib/python3.12/site-packages/dask/__init__.py
netCDF4: 1.7.1 /home/slb93/miniconda3/envs/cf-env-312/lib/python3.12/site-packages/netCDF4/__init__.py
h5netcdf: 1.3.0 /home/slb93/miniconda3/envs/cf-env-312/lib/python3.12/site-packages/h5netcdf/__init__.py
h5py: 3.12.0 /home/slb93/miniconda3/envs/cf-env-312/lib/python3.12/site-packages/h5py/__init__.py
s3fs: 2024.6.1 /home/slb93/miniconda3/envs/cf-env-312/lib/python3.12/site-packages/s3fs/__init__.py
psutil: 5.9.0 /home/slb93/miniconda3/envs/cf-env-312/lib/python3.12/site-packages/psutil/__init__.py
packaging: 24.1 /home/slb93/miniconda3/envs/cf-env-312/lib/python3.12/site-packages/packaging/__init__.py
numpy: 1.26.4 /home/slb93/miniconda3/envs/cf-env-312/lib/python3.12/site-packages/numpy/__init__.py
scipy: 1.14.0 /home/slb93/miniconda3/envs/cf-env-312/lib/python3.12/site-packages/scipy/__init__.py
matplotlib: 3.9.2 /home/slb93/miniconda3/envs/cf-env-312/lib/python3.12/site-packages/matplotlib/__init__.py
cftime: 1.6.4 /home/slb93/miniconda3/envs/cf-env-312/lib/python3.12/site-packages/cftime/__init__.py
cfunits: 3.3.7 /home/slb93/miniconda3/envs/cf-env-312/lib/python3.12/site-packages/cfunits/__init__.py
cfplot: 3.3.0 /home/slb93/git-repos/cf-plot/cfplot/__init__.py
cfdm: 1.12.0.0 /home/slb93/git-repos/cfdm/cfdm/__init__.py
cf: 3.16.3 /home/slb93/git-repos/cf-python/cf/__init__.py

and I bumped the main candidate, h5netcdf, to the same version as you but it didn't stop the error. However, when I bumped netcdf4 to 1.7.2 I no longer saw it - but that's bypassing the issue really since it would pass via the alternative netCDF4 option to the h5netcdf route, i.e. no longer hit this for the alternative backend case:

netCDF4:
OSError: [Errno -68] NetCDF: I/O failure: 'http:///psl.noaa.gov/thredds/dodsC/Datasets/cru/crutem5/Monthlies/air.mon.anom.nobs.nc'
Can't interpret http:///psl.noaa.gov/thredds/dodsC/Datasets/cru/crutem5/Monthlies/air.mon.anom.nobs.nc as a PP or UM dataset

since it looks like they fixed their own issue in that respect between versions (see Unidata/netcdf4-python#1335).

So, I think with your environment as quoted above, the h5-related issue is still there, but because it works with the netcdf4 route for the netCDF backend it is obscured. There is still an underlying issue - but let's bump it to its own GH Issue?

@davidhassell
Copy link
Collaborator Author

OK - thanks for tracking it down! We need to bump to netCDF4==1.7.2 for numpy2 (PR imminent), so could just leave it for that ...

Copy link
Member

@sadielbartholomew sadielbartholomew left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for addressing the minor feedback, which has all been covered now. Though I still have one test failure in test_docstring, it will disappear when this gets merged into main due to NCAS-CMS/cfdm#322.

Please merge when ready.

@davidhassell
Copy link
Collaborator Author

w00t.

@davidhassell davidhassell merged commit e3bc06c into NCAS-CMS:main Feb 25, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
aggregation Rerlating to metadata-based field and domain aggregation enhancement New feature or request netCDF read Relating to reading netCDF datasets netCDF write Relating to writing netCDF datasets
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Import CFA from cfdm
2 participants