Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for h5py >=3.0.0. #411

Merged
merged 2 commits into from
Nov 3, 2020
Merged

Add support for h5py >=3.0.0. #411

merged 2 commits into from
Nov 3, 2020

Conversation

bdice
Copy link
Member

@bdice bdice commented Nov 2, 2020

Description

This PR adds support for h5py major version 3 and should be merged before #409 (to show that the code functions both before and after the upgrade).

The h5py package changed its default handling of strings in version 3. See this page for info: https://docs.h5py.org/en/latest/strings.html

This PR adds a shim to detect string data, so that it can be returned as str and not bytes. I tested this locally with h5py 2.10.0 and 3.0.0, and it should work for older versions of h5py as well (that's why I used the older function check_dtype instead of check_string_dtype, which was added in 2.10).

Types of Changes

  • Documentation update
  • Bug fix
  • New feature
  • Breaking change1

1The change breaks (or has the potential to break) existing functionality.

Checklist:

If necessary:

  • I have updated the API documentation as part of the package doc-strings.
  • I have created a separate pull request to update the framework documentation on signac-docs and linked it here.
  • I have updated the changelog and added all related issue and pull request numbers for future reference (if applicable). See example below.

Example for a changelog entry: Fix issue with launching rockets to the moon (#101, #212).

@bdice bdice requested review from a team as code owners November 2, 2020 02:36
@bdice bdice requested review from mikemhenry and jennyfothergill and removed request for a team November 2, 2020 02:36
@codecov
Copy link

codecov bot commented Nov 2, 2020

Codecov Report

Merging #411 into master will decrease coverage by 0.04%.
The diff coverage is 66.66%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #411      +/-   ##
==========================================
- Coverage   76.40%   76.35%   -0.05%     
==========================================
  Files          45       45              
  Lines        7136     7139       +3     
==========================================
- Hits         5452     5451       -1     
- Misses       1684     1688       +4     
Impacted Files Coverage Δ
signac/core/h5store.py 90.75% <66.66%> (-0.26%) ⬇️
signac/contrib/project.py 90.92% <0.00%> (-0.41%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d8eeb2a...bf2ed76. Read the comment docs.

@jennyfothergill
Copy link
Contributor

jennyfothergill commented Nov 2, 2020

Hi!
Pytest is throwing an error for me. Here is a file with the error output:
pytest-error.txt
And just to make sure that I'm not doing something wrong, below are my install steps:
(in signac root dir)

gh pr checkout 411
conda env create -f environment.yml  
conda activate signac-dev
pip install -e .

environment.yml
I also tried changing the line in environment.yml to h5py>=3 and then conda gave me package not found, so I tried installing like this:

conda create -n signac-dev pip
conda activate signac-dev
pip install 'h5py>=3' pytest
pip install -e .

and I get the same error from pytest.
I'm on osx 10.14.6 (18G103)

@bdice
Copy link
Member Author

bdice commented Nov 2, 2020

@jennyfothergill

  1. Test failure: thanks for the details, that's helpful. I think this test failure is the same as H5Store pytest is failing on MacOS but passes on Linux #281. This seems to be testing a behavior that isn't very safe across different operating systems. We should probably consider removing this test since it only appears to work on Linux.

  2. h5py 3.0 isn't available on conda-forge yet, so adding it to the environment.yml won't work yet. You're an early adopter! 😄 Seems like you got it installed from PyPI and saw the same error as above.

Sounds like the takeaway is that this works except for the known issue in #281. I appreciate the extensive testing!

@jennyfothergill
Copy link
Contributor

@bdice Ah, OK I wasn't sure if that error was already known, but I should've checked the issues first! Haha, I don't know about extensive testing. :) I'm not sure how to check that strings are working correctly. (I assuming that I should test if strings work as statepoint keys/values?) I tried:

signac init myproject
signac job '{"N": 1.0, "type": "heck"}' --create

and it seemed to work fine with both versions of h5py. Is there a better way to test?

@bdice
Copy link
Member Author

bdice commented Nov 2, 2020

@jennyfothergill signac uses h5py for the H5Store class. This can be accessed with job.data, project.data, or as a standalone class signac.H5Store. This is useful for storing large arrays of values (like a big NumPy array) that are too large for a job document (stored as JSON, which is not an efficient way to store large amounts of binary data). See here for API examples: https://docs.signac.io/projects/core/en/latest/api.html#the-h5store

Copy link
Contributor

@jennyfothergill jennyfothergill left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I'm still not sure if I'm checking this correctly, but with h5py==2.10 and h5py>=3 both print <class 'str'>:

from signac import H5Store
with H5Store('file.h5') as h5s:
    h5s['foo'] = 'bar'
    print(type(h5s.foo))

Copy link
Collaborator

@mikemhenry mikemhenry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks Jenny for that thorough review!

@bdice bdice merged commit 7b0392b into master Nov 3, 2020
@bdice bdice deleted the fix/h5py-3-support branch November 3, 2020 18:14
@bdice bdice added this to the v1.5.1 milestone Dec 19, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants