-
-
Notifications
You must be signed in to change notification settings - Fork 652
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Build rust code only once per platform in a CI run #7047
Conversation
benjyw
commented
Jan 9, 2019
- Creates greater symmetry between Linux and OSX.
- Each platform has a bootstrap stage job that builds the native engine, a single-platform pants.pex and fs_util.
- The jobs in the test stage use the prebuilt pants.pex for the relevant platform (and/or the native_engine.so embedded in it).
- The Travis cache config has been refactored so that the bootstrap jobs only cache dirs related to building rust code, and test jobs only cache dirs related to running pants (e.g., the ivy2 cache).
- Removes the -k option from ci.sh. It was never used in practice (e.g., ccdf523 inverted the sense of the flag, but didn't change what it actually does, which seems like ample evidence that it isn't needed).
- Adds a -f option to release.sh, to build fs_util. Now a dry-run release no longer automatically builds it. This allows us to build it only in the bootstrap stage.
- Ensures that each shard has a separate Travis cache.
- The .travis.yml changes utilize a little-known but very useful feature of mustache, where if a partial is invoked on a line by itself then on substitution, every line in it receives the indentation of that line.
5fcd0f1
to
01cd87d
Compare
Reviewers: please ensure that the logic is sound, particularly around the rust builds. It probably makes sense to examine the CI run for this PR, to double-check that it does what we think it does. |
Note that there is probably some beneficial fine-tuning of the Travis caches that we can do in a followup change. Right now I erred on the side of caching less than before, because our caches were huge, and the benefit of those is questionable. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great, thanks! Particularly the cache splitting and shrinking. As you note, I suspect we'll want to add some more things back to the caches, but we should consider that on a case by case basis :)
(In particular, I've been experimenting with avoiding the bootstrap shard re-compiling rust when rust code hasn't changed; I've got it mostly working in #7044 - just one more kink to work out :))
.travis.yml
Outdated
-v "${HOME}:/travis/home" | ||
-v "${TRAVIS_BUILD_DIR}:/travis/workdir" | ||
travis_ci:latest | ||
sh -c "git clean -xfd && ./build-support/bin/ci.sh && ./build-support/bin/release.sh -f" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we git clean -xfd
here and on the next shard?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure. It was there from before. Probably unnecessary, since we work on a clean clone?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, let's drop it unless we can justify it; it can only confuse things...
env: | ||
- CACHE_NAME=macos10.12sanity | ||
script: | ||
- MODE=debug ./build-support/bin/travis-ci.sh -bm |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Follow up TODO: swap -b from "don't bootstrap" to "do bootstrap" so it's only present for the bootstrap shards (I'll happily do it, but just noting)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If and when you do, note that the bootstrapping, and ONLY the bootstrapping, should run before we set RUN_PANTS_FROM_PEX=1
. Any other logic must follow this line, or else that logic will likely end up bootstrapping again.
@@ -221,325 +259,359 @@ matrix: | |||
repo: pantsbuild/pants | |||
|
|||
# Deploy Pex Unstable to s3 | |||
- name: "Deploy Pants PEX Unstable" | |||
- name: "Deploy unstable multiplatform pants.pex" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Idle question for the future (definitely not for now): Would it make sense to merge the stable and unstable pex publish jobs? The difference between the two pexes is basically a handful of strings and hashes, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't actually know. I was out of the loop when this was originally implemented.
name: "Rust Clippy on Linux" | ||
name: "Linux Rust Clippy" | ||
env: | ||
- CACHE_NAME=linuxclippy |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This job should use the native_engine_cache_config
(and maybe share it with the linux bootstrap shard)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Per the travis documentation, sharing caches between jobs is a no-no. Will add the cache config though.
Is this job recompiling a lot of rust code that the bootstrap job already compiled? Would it make sense to move the clippy run to there?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's re-compiling all of it, but in a way which is different (both cheaper and more expensive) than the bootstrap job. It's doing compiler-integrated linting, which adds extra instrumentation (more expensive), but also doesn't actually do full codegen or linking (cheaper). It probably makes sense to merge it with the rust test running shard, actually...
name: "Rust Tests Linux" | ||
name: "Linux Rust tests" | ||
env: | ||
- CACHE_NAME=linuxrusttests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This job should use the native_engine_cache_config
(and maybe share it with the linux bootstrap shard)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See above. Again, is this recompiling unnecessarily? Is it better to move these tests into the bootstrap shard. Or - is it possible to propagate the bootstrapped rust build state via S3 (like we do for the pex) so that this job isn't starting from a clean rust workspace?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These tests take a non-trivial amount of time, so we probably don't want to merge them into the bootstrap shard...
The build state we'd want to share is basically the entire cache (~/.cache/pants/rust
+ src/rust/engine/target
), but it's probably noticeably simpler for us to just re-do the work in this shard than to do custom cross-shard caching...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How much time might it save to run these on that build state, rather than on clean state? I mean, the whole thing is ~12 minutes from clean right now... And clippy is about the same, AFAICT. And that's without the native_engine_cache_config. So you're probably right that it's not worth worrying about...
- name: "Rust + Platform-specific Tests OSX" | ||
os: osx | ||
- <<: *default_osx_test_config | ||
name: "OSX Rust + platform-specific tests" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should split this shard back into two; one for rust tests (ci.sh -e) which uses the rust cache config, and possibly shares its cache with the bootstrap shard, and one which runs platform-specific tests (ci.sh -z) and uses the run cache config.
The shards were merged because the cost of bootstrapping rust was high, but now that that's ameliorated, we should split them so that they can benefit from more appropriate caching.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On this note, should we split Self checks, lint, and JVM tests
into two shards? Self checks and lint
seem like one logical shard, and JVM tests
another logical shard. Will make it easier for us to fail more quickly if there's a linting error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense, but I'd rather do that in a separate change, since this one is already hefty.
Also, as above, we can't share Travis caches, but we can maybe share the built rust state via S3? I don't know enough about what that state is in order to do so correctly, but either way, let's address in a followup?
# TODO: Figure out why we have such large caches (2-7GB) and try to trim them. | ||
timeout: 500 | ||
directories: | ||
- ${HOME}/.cache/pants/rust/cargo |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could shave 100M off this dir by replacing this with:
- ${HOME}/.cache/pants/rust/cargo/.crates.toml
- ${HOME}/.cache/pants/rust/cargo/git
- ${HOME}/.cache/pants/rust/cargo/registry
- ${HOME}/.cache/pants/rust/cargo/bin/cargo-ensure-installed
- ${HOME}/.cache/pants/rust/cargo/bin/cargo-ensure-prefix
at the expense of:
- Hard coding some internals of cargo that may change
- Hard-coding a list of things we
cargo install
(so that if we add more in the future, we'd need to update this list, or they'd be re-compiled)
I'm not sure what the cost of 100M here is, so can't really comment on the trade-off
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think is safe to cache ${HOME}/.cache/pants/python_cache
, which has subfolders for interpreters
and for requirements
. This will allow us to avoid redownloading Python requirements every time, as we cache the wheels. Same with interpreters, as it will cache wheel
, setuptools
, and the python
executable.
The two situtations where the cache would be invalid are:
- modified
requirements.txt
. Ourcache
code will know that the cache is invalidated, though, so this is fine. - changing the name of the
venv
folder, as done in Allow Pants to run with Python 3 via./pants3
script #6959. Here, we can manually delete the cache, get the PR merged, and then the new venv folder names will be cached correctly.
Edit: after inspecting Travis, I think it's okay and possibly preferable to keep these out of the cache, as the only two shards that will benefit are the native engine builders.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense, but again - let's experiment with this in a followup change whose only purpose is further cache tweaking? This was a tough change, and I'm leery of perturbing it any further when it's already a win as-is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Crazy to see how much faster CI is now. Thanks Benjy! Cool how PEX makes this all possible.
# TODO: Figure out why we have such large caches (2-7GB) and try to trim them. | ||
timeout: 500 | ||
directories: | ||
- ${HOME}/.cache/pants/rust/cargo |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think is safe to cache ${HOME}/.cache/pants/python_cache
, which has subfolders for interpreters
and for requirements
. This will allow us to avoid redownloading Python requirements every time, as we cache the wheels. Same with interpreters, as it will cache wheel
, setuptools
, and the python
executable.
The two situtations where the cache would be invalid are:
- modified
requirements.txt
. Ourcache
code will know that the cache is invalidated, though, so this is fine. - changing the name of the
venv
folder, as done in Allow Pants to run with Python 3 via./pants3
script #6959. Here, we can manually delete the cache, get the PR merged, and then the new venv folder names will be cached correctly.
Edit: after inspecting Travis, I think it's okay and possibly preferable to keep these out of the cache, as the only two shards that will benefit are the native engine builders.
services: | ||
- docker | ||
env: | ||
- CACHE_NAME=linuxpexbuild |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: linux_pex_build
is easier on the eye
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The python interpreter was recently unsafe to cache because the call site makes a symlink. Practically fine if you use the system /usr/bin/python or something, but breaks for us since we bootstrap our interpreter from a virtualenv.
I haven't looked to see if that behavior has changed in the last couple months, but I ended up moving the interpreter cache under .pants.d. That allows me to continue caching all of XDG_CACHE_HOME/pants.
Since this is for Pants CI, caching is likely fine, but FYI
# We request the oldest image we can (corresponding to OSX 10.11) for maximum compatibility. | ||
# We use 10.11 as a minimum to avoid https://github.com/rust-lang/regex/issues/489. | ||
# See: https://docs.travis-ci.com/user/reference/osx/#OS-X-Version | ||
osx_image: xcode8 | ||
env: | ||
- CACHE_NAME=macospexbuild |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: macos_pex_build
- name: "Rust + Platform-specific Tests OSX" | ||
os: osx | ||
- <<: *default_osx_test_config | ||
name: "OSX Rust + platform-specific tests" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On this note, should we split Self checks, lint, and JVM tests
into two shards? Self checks and lint
seem like one logical shard, and JVM tests
another logical shard. Will make it easier for us to fail more quickly if there's a linting error.
I've handled the comments that are easily addressable here, but I'd prefer to punt on some of the more intricate ones (splitting jobs, tweaking the cache dirs, possibly propagating rust build state from job to job) to a followup change, as this one was already several days in the making, and I'd like to checkpoint the progress so far. |
Awesome, let's merge! :D Thanks! |
Agreed on breaking out the proposed changes in separate PRs. This is a good scope and looks good to merge! I would still prefer renaming all the cache names to use |
Shard 34 seems to have appeared, called just "Ruby", and I can't work out what it is from the config... Any ideas? |
Prior to #7047 and #7012, a significant portion of CI was spent on repeatedly bootstrapping Pants. So, we combined some jobs that did not have much logical connection in order to save CI time. In general, it is acceptable to combine unrelated jobs into the same shard when there are benefits, as it does still avoid some shard startup cost. Here, however, we get some benefits from splitting these two shards into four. * "Lint, self-checks, and JVM tests" -> "Lint and self checks" + "JVM tests" - Lint fails more quickly. This is one of the shards most likely to fail, so we want to surface the failure quickly. Meanwhile, we do not often modify JVM code, so JVM tests are far less likely to fail from the average PR. - Allows us to move JVM tests later in the CI run, as again they are less likely to fail from the average PR. * "OSX Rust + platform-specific tests" -> "OSX Rust tests" + "OSX platform specific tests" - Allows us to mark "OSX Rust tests" as no pex, which saves some CI time by not installing `aws-cli` and simplifies `.travis.yml` to not need `py2` and `py3` variants of the shard. - Allows us to extract a common `base_rust_tests` config between `linux_rust_tests` and `osx_rust_tests`. - Thanks to the extracted `base_rust_tests`, we now setup `native_engine_cache` for the `osx_rust_tests` shard, which we did not have before. This caches the rust bootstrap.
Thanks to the changes made in pantsbuild#7047 and pantsbuild#7012, we went from bootstrapping in nearly every shard to only bootstrapping in the first 4 shards. So, we flip the meaning to be a positive rather than negative flag.