Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CI: cache pip installs #1057

Merged
merged 4 commits into from
Feb 7, 2023
Merged

CI: cache pip installs #1057

merged 4 commits into from
Feb 7, 2023

Conversation

adamjstewart
Copy link
Collaborator

@adamjstewart adamjstewart commented Jan 28, 2023

@github-actions github-actions bot added the testing Continuous integration testing label Jan 28, 2023
@adamjstewart adamjstewart marked this pull request as ready for review January 30, 2023 18:33
@adamjstewart
Copy link
Collaborator Author

Unclear how useful this is since the cache is invalidated any time these files change, and most of our PRs are dependabot PRs at the moment. It's probably most useful for the minimum dependencies which get updated very infrequently.

@@ -19,6 +19,8 @@ jobs:
uses: actions/[email protected]
with:
python-version: '3.10'
cache: 'pip'
cache-dependency-path: 'requirements/style.txt'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have to do anything to skip the pip install after this?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is fun. It looks like cache only caches downloads, not installs. Compared to develop, avoiding these downloads shaves 25 secs off the "Install pip dependencies" step, but adds 35 secs to the "Set up python" step because it has to download and unpack the cache.

@calebrob6
Copy link
Member

calebrob6 commented Feb 1, 2023

I'm not convinced the caching is doing the correct thing. E.g.

  • if I look at the run of the ubuntu 3.8 tests on this PR I see that Python setup takes 29 seconds (it is downloading stuff from the cache), and pip install takes 2min 37 seconds
  • if I look at the same run from the EuroSAT PR, Python setup takes 0 seconds and pip install takes 2min 36 seconds

It looks like the pip installs in this PR are using cached wheels somtimes (but, e.g., missing torch and downloading almost 1GB 😬 Downloading torch-1.13.1-cp38-cp38-manylinux1_x86_64.whl (887.4 MB))

@adamjstewart
Copy link
Collaborator Author

You beat me to it. Looks like there is an issue open for this: actions/setup-python#330

@adamjstewart
Copy link
Collaborator Author

So this seems slower than not caching. The only other benefit of caching is that it may help avoid download issues when a runner has internet access issues, but that seems infrequent enough that I would rather have fast installs. I vote we close this.

@calebrob6
Copy link
Member

calebrob6 commented Feb 1, 2023

There should be a way to cache the install, else what is the point? (the only reason I looked so closely is because I think this is cool!)

@adamjstewart
Copy link
Collaborator Author

We could just use the cache action to cache everything: https://blog.allenai.org/python-caching-in-github-actions-e9452698e98d

@adamjstewart adamjstewart marked this pull request as draft February 2, 2023 22:04
@adamjstewart adamjstewart mentioned this pull request Feb 4, 2023
@adamjstewart
Copy link
Collaborator Author

See actions/setup-python#330 (comment) for a matrix of install times. This is starting to look pretty good! We could still improve this by only caching the site-packages directory, but this varies by OS so I didn't try it here. Hopefully setup-python will add this feature someday, the current PR is a lot more lines than it was before.

@adamjstewart
Copy link
Collaborator Author

Other ideas for improving test times:

  • Could also cache apt/brew/pacman installs, but these are fairly quick
  • Could combine all style tests into a single test, resulting test would be slower than all other tests in parallel but would reduce the number of workers
  • Could cache the run of pytest (this is a joke)
  • Could try to speed up tests

This last one is most worthwhile imo. Our tests only take 2 min on Linux but up to 9 min on macOS/Windows for some reason. The detection trainer tests are particularly slow. There's definitely a lot of room for improvement here.

@adamjstewart adamjstewart marked this pull request as ready for review February 5, 2023 00:24
@calebrob6 calebrob6 merged commit 9f8b245 into main Feb 7, 2023
@calebrob6 calebrob6 deleted the ci/cache branch February 7, 2023 21:25
@calebrob6
Copy link
Member

Definitely worthwhile to speed up tests -- do we have an open issue for this? If not can you make one?

@adamjstewart
Copy link
Collaborator Author

#1088 is for removing internet access and downloads which is the biggest low hanging fruit.

I'm about to open a series of PRs to hopefully speed up our tutorial tests. They only get run on release, which is why it was difficult to improve them (or bother caring about them).

@adamjstewart adamjstewart added this to the 0.4.1 milestone Feb 8, 2023
calebrob6 pushed a commit that referenced this pull request Apr 10, 2023
* CI: cache pip installs

* Update to new min deps filename

* Use cache action instead

* Cache releases too
yichiac pushed a commit to yichiac/torchgeo that referenced this pull request Apr 29, 2023
* CI: cache pip installs

* Update to new min deps filename

* Use cache action instead

* Cache releases too
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
testing Continuous integration testing
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants