-
Notifications
You must be signed in to change notification settings - Fork 4.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement Hermetic sandbox with support for hardlinks #13279
Implement Hermetic sandbox with support for hardlinks #13279
Conversation
Related to Issue Make the sandboxed file system more strict: #7313 |
Thanks for your pull request. It looks like this may be your first contribution to a Google open source project (if not, look below for help). Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). 📝 Please visit https://cla.developers.google.com/ to sign. Once you've signed (or fixed any issues), please reply here with What to do if you already signed the CLAIndividual signers
Corporate signers
ℹ️ Googlers: Go here for more info. |
d3ba024
to
0a9c8f7
Compare
Thanks for your pull request. It looks like this may be your first contribution to a Google open source project (if not, look below for help). Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). 📝 Please visit https://cla.developers.google.com/ to sign. Once you've signed (or fixed any issues), please reply here with What to do if you already signed the CLAIndividual signers
Corporate signers
ℹ️ Googlers: Go here for more info. |
It is not clear we had agreement on answer to #7313, so this PR might be premature. |
This comment has been minimized.
This comment has been minimized.
This is the results from a benchmark on how long one of our builds takes with respective sandbox. Sandbox type | Average build-time |
I will let philwo handle this review, since I'm not that familiar with sandboxes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not quite done with the C++/Bash parts yet, but here's a couple of notes.
Could you fix the tests? |
Thanks for your review comments. Regarding the test I thought I knew what to fix, turns out I was wrong. |
Click the |
I seem to have solved most of the tests, but how do I prevent "buildkite/bazel-bazel-github-presubmit/darwin-openjdk-8-shard-2" from running my hermetic "linux-sandbox" test. I assume linux-sandbox is not supported on that one, and therefore the tests will be failing |
The tests are passing now, I have dealt with @larsrc-google initial review comments. I would appreciate some more review comments @philwo |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here are a few fixes that make it easier for me to test further revisions, and a couple of clarification questions. All tests pass, though.
src/main/java/com/google/devtools/build/lib/sandbox/LinuxSandboxedSpawnRunner.java
Outdated
Show resolved
Hide resolved
src/main/java/com/google/devtools/build/lib/sandbox/HardlinkedSandboxedSpawn.java
Outdated
Show resolved
Hide resolved
|
||
# For the test to work we need to bind mount a couple of folders to | ||
# get access to bash, ls, python etc. Depending on linux distribution | ||
# these folders may vary. Mount all folders in the root directory '/' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Remove empty trailing whitespace. If your editor has an option to remove trailing whitespace, I suggest turning that on.
src/main/java/com/google/devtools/build/lib/sandbox/HardlinkedSandboxedSpawn.java
Outdated
Show resolved
Hide resolved
src/main/java/com/google/devtools/build/lib/sandbox/LinuxSandboxedSpawnRunner.java
Outdated
Show resolved
Hide resolved
src/main/java/com/google/devtools/build/lib/sandbox/AbstractSandboxSpawnRunner.java
Outdated
Show resolved
Hide resolved
src/main/java/com/google/devtools/build/lib/actions/FileContentsProxy.java
Outdated
Show resolved
Hide resolved
src/main/java/com/google/devtools/build/lib/actions/FileContentsProxy.java
Show resolved
Hide resolved
The test failing public void testUnreadableFileWithNoFastDigest() throws Exception expects that setting a new modified time should not affect this assert. p.setLastModifiedTime(10L); But I just added mtime to the hash/equal/fingerprint and prettyprintt so this is expected. What should I do about this test? |
@janakdr is probably the one who would know about the mtime handling. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@michajlo wrote that test, he may have a better idea. On its face, I don't see that the test is enforcing an important property.
/** | ||
* Visible for serialization / deserialization. Do not use this method, but call {@link #create} | ||
* instead. | ||
*/ | ||
public FileContentsProxy(long ctime, long nodeId) { | ||
this.ctime = ctime; | ||
this.mtime = ctime; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cleaned this class up in e4cc0b9: would you mind syncing?
That was a while ago... Skimming history I believe I was showing that mtime was significant to equality (the version I introduced was actually asserting not-equal). At some point it got flipped to assert equal when mtime was removed / deemed unnecessary? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general LGTM, added a few comments.
// Make sure that the sandbox_root path has no trailing slash. | ||
if (sandbox_root.back() == '/') { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens when the path has two trailing slashes, e.g. /dev/shm/mysandbox//
? (I'm fine with exiting with an error on invalid input like that, but it looks like this wouldn't catch it at all?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will fix
src/main/tools/linux-sandbox-pid1.cc
Outdated
if (handle < 0) { | ||
DIE("open"); | ||
} | ||
if (close(handle)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add < 0
to the condition?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure
src/main/tools/linux-sandbox-pid1.cc
Outdated
// Create a bind mount to /proc | ||
if (CreateTarget("proc", true) < 0) { | ||
DIE("CreateTarget"); | ||
} | ||
if (mount("/proc", "proc", NULL, MS_REC | MS_BIND, NULL) < 0) { | ||
DIE("mount"); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the reason that we no longer have to mount a new proc on top of the old one, because the old one still refers to our parent PID namespace, but can just use a bind mount in this case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could have a look, this code is taken from old linux-sandbox implementation, but maybe its not needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! I remember this being tricky, but in the end we found a way that worked. 🤔 A good way to check whether /proc is working correctly is that when you manually run linux-sandbox
e.g. with a shell and you look inside /proc, you shouldn't see any processes from outside the sandbox.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes it works
src/main/tools/linux-sandbox-pid1.cc
Outdated
@@ -349,8 +413,13 @@ static void SetupNetworking() { | |||
} | |||
|
|||
static void EnterSandbox() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
WDYT, should we rename this to EnterWorkingDirectory?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it makes sense, done
src/main/tools/linux-sandbox-pid1.cc
Outdated
|
||
static void CreateEmptyFile() { | ||
// This is used as the base for bind mounting. | ||
CreateTarget("tmp", true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we check the return value of CreateTarget for errors?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes
@philwo I was actually the one who suggested using the methods from blaze_util instead of adding yet another reimplementation. Seems like that's what a util class is for. Is this util particularly heavy? Should it maybe be split? |
@larsrc-google But the one method was already there and worked fine and the other two are single lines of code. I think this is taking "DRY" a bit too far. The sandbox binary is time-tested, performance and correctness critical code, I can't even tell for sure whether this PR will not break something subtle that we will only discover months down the line. This is nothing against this particular code, but more a general cautionary approach I keep around this code - for example, we still found bugs in the signal handling code of the existing code I think just last year, even though it "worked" fine for years already - until it finally didn't under some environmental circumstances. I would really like to keep it as simple as possible so you can read it in one go and reason about it and remove any abstractions or dependencies on "generic" implementations of functions. |
@philwo My original comment (https://github.com/bazelbuild/bazel/pull/13279/files/0a9c8f7dc80255e925189c5ac94099fca5a172f3#r608589967) was not intended to apply to existing code so much as to the new implementations being added. Adding new implementations where time-tested ones exist makes it more likely that something will break. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You seem to have a merge conflict pending.
src/main/java/com/google/devtools/build/lib/actions/FileContentsProxy.java
Show resolved
Hide resolved
Set<Path> writableDirs, | ||
TreeDeleter treeDeleter, | ||
@Nullable Path statisticsPath, | ||
Boolean sandboxDebug) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pass in a plain boolean instead of a Boolean object, since you're using it that way anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
hardLinkRecursive(source, target); | ||
} | ||
|
||
private void hardLinkRecursive(Path source, Path target) throws IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add JavaDoc. In particular explain the failure modes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
src/main/java/com/google/devtools/build/lib/sandbox/HardlinkedSandboxedSpawn.java
Show resolved
Hide resolved
throw new IllegalArgumentException(source + " is a subdirectory of " + target); | ||
} | ||
target.createDirectory(); | ||
Collection<Path> entries = source.getDirectoryEntries(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since you're only using the basename, you can avoid a bunch of string manipulation by using readdir()
instead of getDirectoryEntries()
, similar to WorkerExecRoot.cleanExisting()
. And if you use that, you could have the directory loop be main part of this function and use the file/symlink/dir info in Dirent
instead of the possibly expensive is*()
calls.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this seemed not so trivial, so I decided not to do anything about this.
I will take a look if you are really concerned by the current implementation of this.
" -h if set, create a chroot in the sandbox directory (-s) and only " | ||
" mount whats been specified with -M/-m for improved hermeticity\n" | ||
" -s The sandbox root directory where the chroot will be created, -W" | ||
" should be a folder inside this sandbox root directory\n" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add a check that this holds, probably at the end of ParseOptions? You depend on it further down for a substr
operation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added it in ParseCommandLine
} | ||
|
||
// Recursively creates the file or directory specified in "path" and its parent | ||
// directories. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add that this sets errno and returns -1 for errors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
src/main/tools/linux-sandbox-pid1.cc
Outdated
|
||
// Create the parent directory. | ||
if (CreateTarget(dirname(strdupa(path)), true) < 0) { | ||
DIE("CreateTarget"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding the name of the directory we failed to create would be helpful here (but easier if CreateTarget dies in all errors).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
src/main/tools/linux-sandbox-pid1.cc
Outdated
// certain filesystems (e.g. XFS). | ||
static void LinkFile(const char *path) { | ||
if (link("tmp/empty_file", path) < 0) { | ||
DIE("link"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add the file we were trying to link to in the message.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for all the commit author(s) or Co-authors. If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google. ℹ️ Googlers: Go here for more info. |
6673bd6
to
b82342c
Compare
We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for all the commit author(s) or Co-authors. If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google. ℹ️ Googlers: Go here for more info. |
We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for all the commit author(s) or Co-authors. If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google. ℹ️ Googlers: Go here for more info. |
80ba322
to
ba775b2
Compare
Adds linux-sandbox flag: --experimental_use_hermetic_linux_sandbox - Configure linux-sandbox to run in a chroot environment to prevent access to files not mentioned in the bazel rules unless they can be found via explicitly whitelisted directories using --sandbox_add_mount_pair create hardlinks instead of symlinks, and fallback to copying. In case of writes to input files, the build will be aborted.
ba775b2
to
e051b6c
Compare
Back from vacation now, I have acted on all your comments @philwo @larsrc-google. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you. This looks fine to me.
@larsrc-google WDYT?
I agree. I have imported it and all tests pass. There are a few style things to fix, I'll take care of that. |
if (CreateTarget("dev", true) < 0) { | ||
DIE("CreateTarget /dev"); | ||
} | ||
const char *devs[] = {"/dev/null", "/dev/random", "/dev/urandom", "/dev/zero", NULL}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The sandbox would be less hermetic if /dev/shm is bind mounted, since then actions could communicate with each other via /dev/shm.
Would it make sense to require an explicit --sandbox_add_mount_pair for use cases where /dev/shm is desired to be bind mounted?
Summary: This diff updates our Bazel version to 5.0.0. I've re-worked our patches on top of the upstream 5.0.0 tag. - `linux-sandbox` has changed a bunch (mostly in `linux-sandbox-pid1.cc`) so I had to rewrite a bunch of the patch. The main change is that upstream has added a bunch of logic from our own patch in order to support the new hermetic sandbox flag (see bazelbuild/bazel#13279). So I've cleaned things up, removed some of our code and instead called their new code. One change is that they hardlink an empty file outside of the sandbox rather than creating new files, which sounds ok. Note that we might be able to remove even more of our own patch in favor of their hermetic support but we can do that later. - The merkle tree computation moved from `RemoteSpawnCache` to the `RemoteExecutionService` We should be able to rewrite the patch fairly easily but they've also added an (in-process) cache for those trees (see bazelbuild/bazel#13879) so it might be helping with the slowness that we were seeing before. I'm inclined to not apply the patch to start with and we can add it back if things get much slower. The changes are on the `dbx-2022-02-25-5.0.0` branch in the bazel repo. Here's the list of our own commits on top of upstream: - [[ https://sourcegraph.pp.dropbox.com/bazel/-/commit/5a121a34b1a2a39530bf6cecc3892fc4509a1735?visible=2 | DBX: Helper scripts to build dbx bazel ]] - [[ https://sourcegraph.pp.dropbox.com/bazel/-/commit/c3707dea392806b81f2892d46ede5bf54ef02527?visible=1 | DBX: Point remotejdk URLs to magic mirror ]] - [[ https://sourcegraph.pp.dropbox.com/bazel/-/commit/dc5a85b9a1b710230f2c786fd2cede3adb29370d?visible=2 | DBX: Make sure that the java8 toolchain uses the right options ]] - [[ https://sourcegraph.pp.dropbox.com/bazel/-/commit/497532f9878b3b68582c12766bf034a4de6cc44a?visible=6 | DBX: rootfs patch for the linux-sandbox ]] Also see https://blog.bazel.build/2022/01/19/bazel-5.0.html DTOOLS-1748 Test Plan: Will run the main projects and CI and make sure that things still work. Ran `bzl tool //dropbox/devtools/bazel_metrics/benchmarks --target //services/metaserver edit-refresh` on both this diff and master. On 4.1.0 on master: ``` Running no-op reload 5 times... Finished running no-op reload! The results were: min: 3.01s avg: 3.08s p50: 3.08s max: 3.21s Running modify metaserver/static/js/modules/core/uri.ts 5 times... Finished running modify metaserver/static/js/modules/core/uri.ts! The results were: min: 5.30s avg: 5.78s p50: 5.77s max: 6.59s Running modify metaserver/static/css/legacy_browse.scss 5 times... Finished running modify metaserver/static/css/legacy_browse.scss! The results were: min: 4.46s avg: 4.83s p50: 4.69s max: 5.26s Running add file at metaserver/static/js/modules/core/devbox-benchmark-file-{}.ts 5 times... Finished running add file at metaserver/static/js/modules/core/devbox-benchmark-file-{}.ts! The results were: min: 25.69s avg: 26.21s p50: 26.22s max: 26.89s Running modify metaserver/static/error/maintenance.html 5 times... Finished running modify metaserver/static/error/maintenance.html! The results were: min: 4.75s avg: 4.85s p50: 4.75s max: 5.01s ``` On 5.0.0 ``` Running no-op reload 5 times... Finished running no-op reload! The results were: min: 3.48s avg: 3.69s p50: 3.48s max: 3.90s Running modify metaserver/static/js/modules/core/uri.ts 5 times... Finished running modify metaserver/static/js/modules/core/uri.ts! The results were: min: 5.54s avg: 6.34s p50: 5.54s max: 8.59s Running modify metaserver/static/css/legacy_browse.scss 5 times... Finished running modify metaserver/static/css/legacy_browse.scss! The results were: min: 4.34s avg: 4.75s p50: 5.05s max: 5.46s Running add file at metaserver/static/js/modules/core/devbox-benchmark-file-{}.ts 5 times... Finished running add file at metaserver/static/js/modules/core/devbox-benchmark-file-{}.ts! The results were: min: 25.55s avg: 25.96s p50: 25.64s max: 26.71s Running modify metaserver/static/error/maintenance.html 5 times... Finished running modify metaserver/static/error/maintenance.html! The results were: min: 4.79s avg: 5.33s p50: 5.15s max: 5.84s ``` GitOrigin-RevId: 0f466c5a3bde9ed1157ea936bb70826b58f2fbec
Adds linux-sandbox flag:
--experimental_use_hermetic_linux_sandbox - Configure linux-sandbox
to run in a chroot environment to prevent access to files not
mentioned in the bazel rules unless they can be found via
explicitly whitelisted directories using --sandbox_add_mount_pair
create hardlinks instead of symlinks, and fallback to copying.
In case of writes to input files, the build will be aborted.