[7.4.1] Files created on disk from remote cache or build are marked dirty on first rebuild #24763
Comments
Did you enable the local action cache (--use_action_cache)? Can you share a JSON trace profile for the slow first incremental build?
I created https://bazel-review.googlesource.com/c/bazel/+/268970 to not mark skyframe nodes as dirty. However, a preliminary benchmark shows the performance is somewhat worse, because we need to digest the outputs during invalidation. Can you try whether that patch helps?
--use_action_cache is on by default, right? I explicitly turned it on just in case. I've attached two rebuild profile examples. The first is the slowest case, where we don't use fusion and don't skip creating intermediate archives (.a files), so the final output binary is ~8 GB. It seems your patch didn't help very much in this case. The second is a more typical developer case, where we use fusion and don't combine the debug symbols together, so the final binary is ~500 MB. It seems your patch helped vastly in this case.
It is enabled by default. Thanks for the profiles! They really helped me understand the problem.
In this case the size and number of output files are large, so checking the action cache is a lot slower than comparing skyframe nodes.
Right, because my patch needs to digest the outputs, which costs less than checking the action cache but is still a lot. Let me try another idea to optimize this further.
@Bradshawz: I have updated the patch so that it does not compute digests and instead uses a contents proxy to compare local and remote metadata. Please try it again: https://bazel-review.googlesource.com/c/bazel/+/268970.
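For readers following the thread, here is a minimal, self-contained sketch of the idea behind a contents proxy, with hypothetical names rather than Bazel's actual FileContentsProxy API: capture cheap file-status attributes once when the output lands on disk, and compare those on the next build instead of re-digesting the file.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.Objects;

/** Illustrative stand-in for a contents proxy: cheap metadata captured when a file is written. */
record ContentsProxy(long modifiedTimeMillis, Object fileKey) {

  /** Stats the file once; the file's contents are never read or digested. */
  static ContentsProxy create(Path path) throws IOException {
    BasicFileAttributes attrs = Files.readAttributes(path, BasicFileAttributes.class);
    return new ContentsProxy(attrs.lastModifiedTime().toMillis(), attrs.fileKey());
  }

  /** The file is presumed unchanged if its cheap attributes still match. */
  boolean matches(ContentsProxy other) {
    return other != null
        && modifiedTimeMillis == other.modifiedTimeMillis
        && Objects.equals(fileKey, other.fileKey);
  }
}
```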
I don't know if it matters that I applied this on 7.4.1, but I'm getting similar results and similar-looking profiles to a completely unpatched build.
Maybe most outputs in the build graph are tree outputs? I think the patch doesn't handle tree outputs efficiently.
Bleh, that looks like it was on me; I didn't end up removing:
So it was skipping the new code. I'll edit this comment with the new numbers when I get them. Edit:
Cool! It's good to hear the patch fixed it.
I will continue polishing the patch to handle tree outputs before submitting it. I will also backport it to the 7.4 branch afterwards.
@bazel-io fork 7.5.0
... if the corresponding output is materialized afterwards and hasn't been changed. Failing to do so causes an incremental build to re-check the action cache for actions whose outputs were materialized during the last build, which can be very slow if the outputs are large. We achieve this by storing `FileContentsProxy` in the remote metadata after materialization. During the dirtiness check, we compare the remote metadata and the corresponding local metadata by their `digest`, or by their `proxy` if a `digest` is not available for the local metadata. A user reported that the wall time of their first incremental build improved from `14s` to less than `1s` in the common case, and from `115s` to less than `1s` in the worst case. #24763 PiperOrigin-RevId: 715734718 Change-Id: Id1a1a59d8b5f3e91a7ae05a73663ff37eee6d163
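As a rough illustration of the approach the commit message describes, the sketch below (hypothetical names, reusing the illustrative ContentsProxy from the earlier sketch) shows how remote metadata could record a proxy at materialization time so that the next build's dirtiness check can compare a single stat result instead of re-checking the action cache.

```java
import java.io.IOException;
import java.nio.file.Path;

/** Hypothetical remote metadata that gains a local contents proxy once the output is on disk. */
final class RemoteOutputMetadata {
  private final byte[] remoteDigest;        // digest known from the remote cache/executor
  private ContentsProxy materializedProxy;  // null until the file is actually downloaded

  RemoteOutputMetadata(byte[] remoteDigest) {
    this.remoteDigest = remoteDigest;
  }

  /** Call after the output has been downloaded and written to its output path. */
  void onMaterialized(Path outputPath) throws IOException {
    this.materializedProxy = ContentsProxy.create(outputPath);
  }

  /** On the next build, an untouched on-disk file matches the stored proxy without any digesting. */
  boolean matchesOnDiskFile(Path outputPath) throws IOException {
    return materializedProxy != null && materializedProxy.matches(ContentsProxy.create(outputPath));
  }
}
```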
@bazel-io fork 8.1.0
…ck (#24941): backport of the change above to the 7.5.0 release branch. Fixes #24920. PiperOrigin-RevId: 715734718 Change-Id: Id1a1a59d8b5f3e91a7ae05a73663ff37eee6d163
A fix for this issue has been included in Bazel 7.5.0 RC2. Please test out the release candidate and report any issues as soon as possible.
Tried it out, works great. Thanks!
…ck (#25126): backport of the change above to the 8.1.0 release branch. Fixes #24940. PiperOrigin-RevId: 715734718 Change-Id: Id1a1a59d8b5f3e91a7ae05a73663ff37eee6d163
Description of the bug:
We currently use Build without the Bytes (BwoB) with toplevel downloads and a remote build/remote cache, but do our linking locally using build --strategy=CppLink=local. Because of this, we end up with a large number of files in our output directory. On the first rebuild of any build, Bazel marks all of the downloaded files as changed outputs; this does not happen on the second rebuild. This can make the first rebuild take a very long time.
I tried to track down where the issue comes from, and it seems to be here:
bazel/src/main/java/com/google/devtools/build/lib/skyframe/FilesystemValueChecker.java
Line 515 in 9ba2ab4
On the first rebuild, fileMetadata.couldBeModifiedSince(lastKnownData) does not seem to successfully compare anything. It appears to first try comparing the FileArtifactValue digests, and if that fails, it compares the FileContentsProxy values. On the first rebuild, fileMetadata has a proxy but no digest, while lastKnownData has a digest but no proxy, so neither comparison succeeds and the file is marked as modified. On the second rebuild, fileMetadata still has no digest, but lastKnownData now has a proxy, so the file isn't marked as modified.
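To make the mismatch described above concrete, here is a simplified, hedged sketch of the comparison order the reporter describes (not the actual FileArtifactValue.couldBeModifiedSince implementation, and again reusing the illustrative ContentsProxy type): digests are compared only when both sides have one, proxies only when both sides have one, and anything else is conservatively treated as possibly modified, which is exactly the digest-only vs. proxy-only situation on the first rebuild.

```java
import java.util.Arrays;

/** Simplified model of the metadata compared during the output dirtiness check. */
final class OutputMetadata {
  final byte[] digest;        // null for a freshly stat'ed local file
  final ContentsProxy proxy;  // null for remote metadata that has no local stat information

  OutputMetadata(byte[] digest, ContentsProxy proxy) {
    this.digest = digest;
    this.proxy = proxy;
  }

  /**
   * First rebuild: this (proxy only) vs. lastKnownData (digest only) means neither branch can
   * compare, so we fall through to "possibly modified" and the file is marked dirty.
   */
  boolean couldBeModifiedSince(OutputMetadata lastKnownData) {
    if (digest != null && lastKnownData.digest != null) {
      return !Arrays.equals(digest, lastKnownData.digest);
    }
    if (proxy != null && lastKnownData.proxy != null) {
      return !proxy.matches(lastKnownData.proxy);
    }
    return true; // cannot prove the file is unchanged cheaply
  }
}
```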
Which category does this issue belong to?
Performance, Remote Execution
What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.
Using a build that either builds files remotely or fetches them from a remote cache and then downloads them locally:
1. bazel clean
2. bazel build //src:someTarget
3. bazel build //src:someTarget again. In the java log there should be something like:
241219 17:47:12.558:I 42 [com.google.devtools.build.lib.skyframe.SequencedSkyframeExecutor.detectModifiedOutputFiles] Found 1 modified files from last build
4. bazel build //src:someTarget a third time. The java log now shows:
241219 18:08:46.021:I 42 [com.google.devtools.build.lib.skyframe.SequencedSkyframeExecutor.detectModifiedOutputFiles] Found 0 modified files from last build
Which operating system are you running Bazel on?
Ubuntu 22.04
What is the output of bazel info release?
release 7.4.1
If bazel info release returns development version or (@non-git), tell us how you built Bazel.
No response
What's the output of git remote get-url origin; git rev-parse HEAD?
No response
If this is a regression, please try to identify the Bazel commit where the bug was introduced with bazelisk --bisect.
No response
Have you found anything relevant by searching the web?
I found #22367, but it seems like the opposite issue where stuff that wasn't downloaded was still marked as dirty.
Any other information, logs, or outputs that you want to share?
No response