
Feature: Measuring the airspeed velocity of an unladen signac #629

Merged (13 commits, Nov 26, 2021)

Conversation

bdice (Member) commented Oct 10, 2021

Description

This PR uses asv (airspeed velocity) to measure performance of the signac package. It could replace our existing benchmark.py script. I haven't yet tried all the features of asv, but it does a lot of things we care about and is a very powerful / helpful tool:

  • isolated testing of each commit in a virtual environment (which it handles for you)
  • automated testing from git ranges: running asv run v1.3.0..master tests every commit from v1.3.0 to master (with the current commit's set of benchmarks, so it doesn't require any git trickery!)
  • publishing results into a static HTML site that can be hosted/shared: asv publish
  • local preview of the HTML site: asv preview
  • checking that the benchmark suite runs correctly during development: asv dev
  • profiling+visualization with snakeviz: e.g. asv profile 'benchmarks.ProjectBench.time_iterate_load_sp(.*)' --gui=snakeviz
  • direct comparison of two commits with asv compare
  • automatic detection of performance regressions 👀
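
Getting asv running is mostly a matter of adding an `asv.conf.json` at the repository root. A minimal sketch follows; the values are illustrative, not the actual configuration added in this PR:

```json
{
    "version": 1,
    "project": "signac",
    "project_url": "https://signac.io",
    "repo": "..",
    "branches": ["master"],
    "environment_type": "virtualenv",
    "benchmark_dir": "benchmarks",
    "env_dir": ".asv/env",
    "results_dir": ".asv/results",
    "html_dir": ".asv/html"
}
```

With a config like this in place, `asv run`, `asv publish`, and `asv preview` operate on the `benchmarks/` directory and write results under `.asv/`.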

Preview of asv results from v1.3.0 to master (this link may be removed in the future): https://glotzerlab.github.io/signac/

To-do:

Motivation and Context

I have always wanted to try asv and I think this is a great tool for understanding the project's longitudinal improvements. From a reporting/visualization/profiling perspective, asv seems like a massive step forward. Especially since releases 1.6.0 and 1.7.0 were very carefully tested for performance, it would be nice to add this tool to ensure that we keep high performance into signac 2.0 and beyond.

We can separately discuss whether to remove benchmark.py. For now, I'm happy to keep both. To get rid of benchmark.py, we'll need to decide how to incorporate its flags, like those controlling document sizes, number of state point keys, etc. I would recommend adding those as benchmark parameters, and letting users edit the benchmark file if they need to test particular configurations (e.g. large documents).
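As a sketch of how benchmark.py's flags could become asv parameters: asv benchmark classes declare `params`/`param_names`, and asv runs every combination as a separate benchmark series. The class and parameter names below are illustrative (not the actual contents of benchmarks.py), and plain JSON files stand in for signac's storage:

```python
import json
import tempfile
from pathlib import Path


class DocumentBench:
    # asv runs every combination of these parameter values and reports
    # each combination as its own benchmark series.
    params = ([10, 100], [1, 10])
    param_names = ["num_docs", "num_keys"]

    def setup(self, num_docs, num_keys):
        # Fresh workspace per benchmark invocation, so no state leaks
        # between parameter combinations.
        self._tmp = tempfile.TemporaryDirectory()
        self.root = Path(self._tmp.name)
        for i in range(num_docs):
            doc = {f"key_{k}": k for k in range(num_keys)}
            (self.root / f"doc_{i}.json").write_text(json.dumps(doc))

    def time_load_documents(self, num_docs, num_keys):
        # The timed body: read every document back from disk.
        for path in self.root.glob("doc_*.json"):
            json.loads(path.read_text())

    def teardown(self, num_docs, num_keys):
        self._tmp.cleanup()
```

Tuning a "large documents" configuration would then just mean editing the `params` lists and re-running `asv run`.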

Types of Changes

  • Documentation update
  • Bug fix
  • New feature
  • Breaking change [1]

[1] The change breaks (or has the potential to break) existing functionality.

Checklist:

If necessary:

  • I have updated the API documentation as part of the package doc-strings.
  • I have created a separate pull request to update the framework documentation on signac-docs and linked it here.
  • I have updated the changelog and added all related issue and pull request numbers for future reference (if applicable). See example below.

@bdice bdice requested review from a team as code owners October 10, 2021 18:28
@bdice bdice requested review from b-butler and SchoeniPhlippsn and removed request for a team October 10, 2021 18:28
@bdice bdice self-assigned this Oct 10, 2021
@bdice bdice added the enhancement New feature or request label Oct 10, 2021
codecov bot commented Oct 10, 2021

Codecov Report

Merging #629 (73625e9) into master (641b03b) will increase coverage by 0.39%.
The diff coverage is n/a.

❗ Current head 73625e9 differs from pull request most recent head 7b8a501. Consider uploading reports for the commit 7b8a501 to get more accurate results

@@            Coverage Diff             @@
##           master     #629      +/-   ##
==========================================
+ Coverage   77.92%   78.31%   +0.39%     
==========================================
  Files          65       65              
  Lines        7079     7079              
  Branches     1310     1310              
==========================================
+ Hits         5516     5544      +28     
+ Misses       1249     1227      -22     
+ Partials      314      308       -6     
Impacted Files Coverage Δ
signac/core/h5store.py 91.06% <ø> (ø)
signac/contrib/project.py 85.15% <0.00%> (+0.36%) ⬆️
signac/__main__.py 71.89% <0.00%> (+0.55%) ⬆️
signac/contrib/filesystems.py 51.00% <0.00%> (+1.00%) ⬆️
signac/contrib/linked_view.py 83.91% <0.00%> (+1.39%) ⬆️
...nac/synced_collections/backends/collection_json.py 100.00% <0.00%> (+1.48%) ⬆️
...ed_collections/buffers/file_buffered_collection.py 99.17% <0.00%> (+1.65%) ⬆️
signac/core/dict_manager.py 88.23% <0.00%> (+2.35%) ⬆️
signac/common/host.py 40.00% <0.00%> (+2.85%) ⬆️
signac/contrib/migration/__init__.py 82.45% <0.00%> (+3.50%) ⬆️
... and 1 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 641b03b...7b8a501.

@bdice bdice added this to the v1.8.0 milestone Oct 10, 2021
csadorf (Contributor) commented Oct 11, 2021

That's really cool! The only thing that bothers me is the lack of axis labels. 😄

I think we can get rid of the old benchmark tests as long as it is super easy to still run benchmarks on your computer and as part of the CI.

bdice (Member, Author) commented Oct 11, 2021

That's really cool! The only thing that bothers is me is the lack of axis labels. 😄

You can get axis labels if you click through into the full chart, just not on the overview. Example: https://glotzerlab.github.io/signac/#benchmarks.ProjectBench.time_iterate

The overview chart is called a sparkline: https://en.wikipedia.org/wiki/Sparkline 😄

b-butler (Member) left a comment

I love how GitHub colors the JSON comments red; it's like Christmas with the red and green stripes.

benchmarks/benchmarks.py
Comment on lines +95 to +97
def time_iterate_load_sp(self, *params):
for _ in range(10):
[job.sp() for job in self.project]
A Member commented:
Given lazy loading, it would be interesting to see single-pass speeds as well, unless this loads on every pass. (I am not too familiar with the loading that takes place here.)

bdice (Member, Author) commented Nov 8, 2021

I'd like to leave the scope of what we benchmark the same in this PR as we have currently implemented in benchmark.py. I am unsure how lazy loading / caching would work if we added another test that loads statepoints here. Ideally we want to have an empty cache when each test executes, but I can't remember if the setup/teardown are run for each test method or each test class. (It's been a few weeks since I worked on this.)
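
For reference, asv's documentation describes `setup()` and `teardown()` as running per benchmark method and per parameter combination, not once per class, so an empty cache per test can be arranged in `setup()`. A minimal sketch of that pattern (class and attribute names are hypothetical; the dict stands in for signac's state point cache):

```python
class LazyLoadBench:
    def setup(self):
        # asv calls setup() before every benchmark method invocation,
        # so each benchmark starts with a cold cache.
        self._cache = {}  # stands in for signac's state point cache

    def _load(self, key):
        # First access is a "cold" load; later accesses hit the cache.
        if key not in self._cache:
            self._cache[key] = {"value": key * 2}
        return self._cache[key]

    def time_first_pass(self):
        # Times the cold, single-pass case, since the cache was just reset.
        for key in range(100):
            self._load(key)
```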

csadorf (Contributor) commented Nov 2, 2021

@SchoeniPhlippsn Do you think you will be able to have a look at this soon?
@bdice Regardless of the missing review, this might be ready to move forward. Any way I can help?

bdice (Member, Author) commented Nov 2, 2021

@SchoeniPhlippsn Do you think you will be able to have a look at this soon?
@bdice Regardless of the missing review, this might be ready to move forward. Any way I can help?

Yes! The next action item is to move some additional options like document size, statepoint key/value sizes, etc. into the new benchmark script. I think we can ignore the ability to set those via command line and just edit the script (and rerun asv) if those values need to be set for a particular benchmark.

SchoeniPhlippsn left a comment

Sorry for the late response. I missed this request for review.
I don't have any more comments. So as soon as Brandon's review has been addressed, I think this is good to go.

bdice (Member, Author) commented Nov 8, 2021

@b-butler @csadorf What do you think we should do with benchmark.py? Currently we use benchmark.py for CI testing to ensure that we haven't introduced serious regressions, so I don't want to delete it. I would vote to keep the existing benchmark.py and its corresponding CI tests. The only downside is that we will have two ways to perform benchmarks, but otherwise we have to re-implement all the CI tooling for performance tests with asv in this PR.

raise TypeError("N must be an integer!")

temp_dir = TemporaryDirectory()
project = signac.init_project(f"benchmark-N={N}", root=temp_dir.name)
A Contributor commented:

Any reason to not use (a potentially extended/adapted variant of)

def TemporaryProject(name=None, cls=None, **kwargs):
and
def init_jobs(project, nested=False, listed=False, heterogeneous=False):
here?

bdice (Member, Author) replied:

The TemporaryProject yields a project when used as a context manager. The benchmark script requires something we can store (like self.project and self.temp_dir) in setup and then destroy during the teardown method. I'm not aware of a clean way to use that context manager here.
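
One stdlib pattern that can bridge a context manager across the split setup/teardown lifecycle is `contextlib.ExitStack`, stored on `self`. A sketch under that assumption, using `TemporaryDirectory` as a stand-in (the commented `TemporaryProject` line shows where the signac helper could slot in, but is untested here):

```python
import contextlib
import tempfile
from pathlib import Path


class ContextManagedBench:
    def setup(self):
        # ExitStack keeps entered context managers open until it is closed,
        # letting a `with`-style resource span setup() and teardown().
        self._stack = contextlib.ExitStack()
        self.tmp_dir = Path(
            self._stack.enter_context(tempfile.TemporaryDirectory())
        )
        # A signac TemporaryProject could hypothetically be entered the
        # same way:
        # self.project = self._stack.enter_context(TemporaryProject())

    def teardown(self):
        # Closing the stack exits every entered context manager.
        self._stack.close()
```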

The Contributor replied:

Well, even if you don't use TemporaryProject, you could still use the testing.init_jobs() function?

bdice (Member, Author) replied:

Hmm, I had forgotten about that function. It looks like it is currently designed for tests of projects with complex schemas, not tests of performance with varying data sizes. We would probably have to change that function to support the arguments like num_keys, num_doc_keys, data_size, data_std, etc., and that feels out of scope for this PR. I'll finalize and merge as-is, since you indicated this is a non-blocking issue.

csadorf (Contributor) commented Nov 8, 2021

@b-butler @csadorf What do you think we should do with benchmark.py? Currently we use benchmark.py for CI testing to ensure that we haven't introduced serious regressions, so I don't want to delete it. I would vote to keep the existing benchmark.py and its corresponding CI tests. The only downside is that we will have two ways to perform benchmarks, but otherwise we have to re-implement all the CI tooling for performance tests with asv in this PR.

I think it's fine to leave it for now, but add an in-code comment where appropriate about the purpose of each method and about the motivation to keep both.

b-butler (Member) left a comment

I am fine with this as is.

@bdice bdice requested a review from csadorf November 14, 2021 05:16
bdice (Member, Author) commented Nov 14, 2021

@csadorf Feel free to give this a final pass of review if you'd like. Otherwise it's ready to merge. I added docs here: https://signac--629.org.readthedocs.build/projects/core/en/629/support.html#benchmarking

csadorf (Contributor) left a comment

I have one more question that I would like to have considered, but it does not block approval.

Not sure how all of these http->https fixes made it into this PR, but I will allow it. 😆


@bdice
Copy link
Member Author

bdice commented Nov 26, 2021

Not sure how all of these http->https fixes made it into this PR, but I will allow it. 😆

My mistake, I meant to push that fix directly to master and didn't notice that it was on this branch.

@bdice bdice enabled auto-merge (squash) November 26, 2021 21:55
@bdice bdice merged commit d002db2 into master Nov 26, 2021
@bdice bdice deleted the feature/asv branch November 26, 2021 22:13
Labels: enhancement (New feature or request)

4 participants