Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix project data interface and add unittests for job and project data interfaces #278

Merged
merged 26 commits into from
Feb 19, 2020

Conversation

klywang
Copy link
Contributor

@klywang klywang commented Jan 27, 2020

Description

project.data used to return a new H5StoreManager each time it was called. This prevented users from reading arrays previously stored using the project.data interface.

Unit tests covered scalars and strings, but did not cover arrays or pandas dataframes.

Motivation and Context

Resolves issue #274. Arrays stored using project.data cannot be accessed using the project.data interface.
These interfaces were not tested comprehensively in unit tests.
This pull request is still in progress.

Types of Changes

  • Documentation update
  • Bug fix
  • New feature
  • Breaking change1

1The change breaks (or has the potential to break) existing functionality.

Checklist:

If necessary:

  • I have updated the API documentation as part of the package doc-strings.
  • I have created a separate pull request to update the framework documentation on signac-docs and linked it here.
  • I have updated the changelog and added all related issue and pull request numbers for future reference (if applicable). See example below.

Example for a changelog entry: Fix issue with launching rockets to the moon (#101, #212).

@klywang klywang requested review from a team as code owners January 27, 2020 20:35
@klywang klywang requested review from zhou-pj and jennyfothergill and removed request for a team January 27, 2020 20:35
@codecov
Copy link

codecov bot commented Jan 27, 2020

Codecov Report

Merging #278 into master will increase coverage by 0.00%.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##           master     #278   +/-   ##
=======================================
  Coverage   64.87%   64.88%           
=======================================
  Files          40       40           
  Lines        5623     5624    +1     
=======================================
+ Hits         3648     3649    +1     
  Misses       1975     1975           

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3debe37...df20250. Read the comment docs.

@klywang
Copy link
Contributor Author

klywang commented Jan 27, 2020

In the master branch. The following code properly returns arrays:

$ signac shell
>>> pr.data['a'] = np.zeros([10])
>>> with pr.data as store:
...     store.a[:]
... 
array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])

While

$ signac shell
>>> pr.data['a'] = np.zeros([10])
>>> with pr.data:
...     pr.data.a[:]
... 

returns an error. The current unit tests, which inherit from test_h5store, pass on the master branch since they test an equivelant of the first example.

If the goal is to add unit tests which can catch the bug reported in #274, would it be best to test both instances?

@bdice
Copy link
Member

bdice commented Jan 27, 2020

@klywang We may choose to merge #275 first, which replaces unittest with pytest. I think it would be easier to update this PR to use pytest after #275 is completed.

@bdice
Copy link
Member

bdice commented Jan 27, 2020

@klywang Yes, we should definitely test both cases! Both are valid and should work the same. Note that the as store: variant is using the same H5StoreManager object, which is why it works on master but the second example fails.

@bdice bdice added the bug Something isn't working label Jan 27, 2020
@bdice bdice added this to the v1.3.1 milestone Jan 27, 2020
@klywang
Copy link
Contributor Author

klywang commented Jan 27, 2020

@bdice Okay I can add tests for both instances after the testing framework has been updated.

@bdice bdice added the blocked Dependent on something else label Jan 27, 2020
Copy link
Contributor

@jennyfothergill jennyfothergill left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Running signac from this branch fixed issue #274 for me. I'm not sure what's happening with the ProjectStoreTests--will they be added later?

@bdice
Copy link
Member

bdice commented Jan 28, 2020

@jennyfothergill Good question! The test classes like ProjectStoreTest are actually getting all of their test logic from the parent classes. This is a case of diamond inheritance. I talked with @klywang about this in person, so I can re-cap it here. I apologize in advance for the fact that all the class names sound alike, which makes this hard to read.

The H5StoreManager itself was not the cause of the bug -- our implementation of that store manager in the Project class had a bug (but this bug wasn't in the Job class, for some reason). To prevent this from happening again, we want to test project.stores and project.data directly (also need to test job.stores and job.data in a similar way before this PR is finalized). That is, we want tests that do everything that the H5StoreManager tests do, except with project.stores (which is itself an H5StoreManager). We solve that by creating a new class:

class ProjectStoreTest(BaseProjectStoreTest, test_h5store.H5StoreTest):
    pass

Everything in this class (the setup and the tests) is defined by inheritance. It's not actually empty, as it appears. The BaseH5StoreTest contains logic shared by all H5Store related tests. The BaseProjectStoreTest inherits from BaseH5StoreTest and overrides its methods (e.g. a setUp function to create the project and methods for acquiring the data store, by returning project.data). However, BaseH5StoreTest and BaseProjectStoreTest don't actually have any tests. In test_h5store.py, the actual tests are defined in H5StoreTest (which inherits from BaseH5StoreTest) and other similar classes. Then in the ProjectStoreTest class, the setup logic comes from BaseProjectStoreTest and the actual tests come from test_h5store.H5StoreTest and similar classes.

In picture form:
image

FYI, this picture (and the class names in particular) may change when we shift to pytest.

@jennyfothergill
Copy link
Contributor

Oh! Wow, thank you so much for the great explanation! I think I understand.

@bdice bdice removed the blocked Dependent on something else label Jan 29, 2020
@bdice
Copy link
Member

bdice commented Jan 29, 2020

@klywang I updated this PR to match pytest. The tests appear to be running correctly with pytest. Next steps:

  • pull the latest version of this branch
  • add the second test instance you were mentioning above to test_h5store.py so it will be used by both sets of tests
  • implement the same kind of inherited test classes for the test_job.py

If you want to run the tests locally, you'll need to install pytest and pytest-subtests. Then you can run pytest test_project.py, etc.

Use pytest test_project.py -k "TestProjectStore" -v to run just the store-related tests.

@klywang
Copy link
Contributor Author

klywang commented Feb 9, 2020

This should be for review now.

Copy link
Member

@bdice bdice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reviewed this PR again -- it looks good to me. The only missing part was a changelog entry, so I added one. Thanks @klywang! This PR needs one more approval from @zhou-pj or @jennyfothergill to proceed.

Copy link
Contributor

@zhou-pj zhou-pj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me! Just have a few minor comments and clarification questions.

tests/test_project.py Outdated Show resolved Hide resolved
tests/test_project.py Outdated Show resolved Hide resolved
project_class = signac.Project

@pytest.fixture(autouse=True)
def setUp_base_h5Store(self, request):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a minor questions and not directly introduced by this PR, but I noticed a few functions with the @pytest.fixture that are using a mixture of camelCase and underscore for function name. Is there a reason for that?

tests/test_project.py Show resolved Hide resolved
@zhou-pj zhou-pj merged commit 6b28cbb into master Feb 19, 2020
@zhou-pj zhou-pj deleted the fix/project-data-interface branch February 19, 2020 00:54
@bdice bdice modified the milestones: v1.3.1, v1.4.0 Feb 27, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants