-
Notifications
You must be signed in to change notification settings - Fork 4.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Replace most Bzlmod events with a Skyframe graph lookup #22058
Conversation
@Wyverald There is one remaining test failure since we can't get the values of the extensions that fail due to invalid imports. I see two options here:
What do you think? Do you see an alternative? |
I see! I quite like option 2, actually. It means that even with |
8adb93b
to
5eb3504
Compare
ModuleExtensionResolutionEvent
with a Skyframe graph lookup
.../java/com/google/devtools/build/lib/bazel/bzlmod/SingleExtensionUnvalidatedEvalFunction.java
Outdated
Show resolved
Hide resolved
928c399
to
0f1530f
Compare
57614ed
to
211e10b
Compare
I ended up including the limiting of event propagation in this PR, which means that it should be possible to rely on regular Skyframe events for the download events introduced in #21901. I also added a test for this. @Wyverald PTAL. @justinhorvitz Could you review the (very minor) Skyframe change? I wasn't sure how performance critical |
55504b2
to
a54bf17
Compare
src/main/java/com/google/devtools/build/lib/skyframe/SkyframeExecutor.java
Outdated
Show resolved
Hide resolved
src/main/java/com/google/devtools/build/lib/skyframe/SequencedSkyframeExecutor.java
Outdated
Show resolved
Hide resolved
src/main/java/com/google/devtools/build/lib/skyframe/SkyframeExecutor.java
Outdated
Show resolved
Hide resolved
src/main/java/com/google/devtools/build/lib/bazel/bzlmod/BazelFetchAllFunction.java
Outdated
Show resolved
Hide resolved
src/main/java/com/google/devtools/build/lib/bazel/bzlmod/SingleExtensionValue.java
Outdated
Show resolved
Hide resolved
src/main/java/com/google/devtools/build/lib/bazel/bzlmod/SingleExtensionFunction.java
Outdated
Show resolved
Hide resolved
src/main/java/com/google/devtools/build/lib/skyframe/BzlmodRepoRuleFunction.java
Outdated
Show resolved
Hide resolved
src/main/java/com/google/devtools/build/lib/skyframe/RepositoryMappingValue.java
Outdated
Show resolved
Hide resolved
private static boolean isBzlmodKey(SkyKey key) { | ||
// The return value of Class#getPackageName() and string literals are always interned. | ||
// https://github.com/openjdk/jdk/blob/3e185c70feef3febf75c58a5d4d394a4b772105f/src/java.base/share/classes/java/lang/Class.java#L1258 | ||
return key.getClass().getPackageName() == "com.google.devtools.build.lib.bazel.bzlmod" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this smells really fishy but I have to admit I don't really understand the shouldPropagate
discussion, so take this with a grain of salt; but having code logic couple so closely with code organization makes me very uncomfortable.
To confirm, is this needed for this PR? I was under the impression that the shouldPropagate
stuff was related to the events generated during registry access that are sent to the Skyframe handlers. Do we also need it for some other purpose?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this PR doesn't touch BazelModuleResolutionEvent
, the change to shouldPropagate
is needed to address the memory regression that retaining Bzlmod events caused (every Skyframe node comes with a ValueWithMetadata
wrapper). The future PRs will get rid of BazelModuleResolutionEvent
, but for that they need to introduce the download events, so that by itself wouldn't improve the situation.
I agree that the idea of checking the package is fishy, but I found it to be the least bad option I could come up with, others being:
- Enumerate all Bzlmod
SkyKey
classes in someSet
, but that would make membership testing slower and also require passing this Bazel/Bzlmod-specific logic into the executor. - Have Bzlmod
SkyKey
s implement a common interface, but pre-JDK 23 that could have surprising performance implications and we would still need to get this Bazel-specific type intoSkyframeExecutor
. - Do not use retained events in Bzlmod at all. We can get rid of
BazelModuleResolutionEvent
and try to build our own event mechanism for registry downloads. However, since we need to make sure that the events are correctly replayed on no-op builds with the lockfile removed, we would probably need to include them into theSkyValue
s as in my original approach for Store remote registry file hashes in the lockfile #21901. This would make it harder to decouple the registry download logic from the SkyFunctions using it.
No matter how brittle this looks, note that the relation defined by this check is lower and upper bounded by tests: If it matches too little or too many nodes, BazelModuleResolutionTest
will fail due to analysis nodes having Bzlmod events attached or the main repo mapping node not having Bzlmod events attached.
I also don't love this approach, so I'm happy to change it out for anything else you prefer.
Edit: 2a) Have Bzlmod SkyKey
s implement a common subclass could work without the risk of negative performance implications.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer most of the alternatives to the package string check. A Set<SkyFunctionName>
, marker interface, default boolean
method on SkyKey
. Final decision to @Wyverald since he has more stake in bzlmod.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the change to
shouldPropagate
is needed to address the memory regression that retaining Bzlmod events caused (every Skyframe node comes with aValueWithMetadata
wrapper).
Is that regression introduced in this very PR, or some existing regression? If the latter, I'd say it's not so urgent as to require fixing right away. We could address it in a separate change, and have this PR focus solely on replacing (some) events with graph lookups.
3. Do not use retained events in Bzlmod at all. We can get rid of
BazelModuleResolutionEvent
and try to build our own event mechanism for registry downloads. However, since we need to make sure that the events are correctly replayed on no-op builds with the lockfile removed, we would probably need to include them into theSkyValue
s as in my original approach for Store remote registry file hashes in the lockfile #21901.
I'm still in favor of this approach. It's minimally intrusive, and easier to follow. Skyframe is a big scary beast and the fact that we're just now discovering some big memory regression is testament to that; so I'd rather we avoid using event propagation if we don't need to. Key part "if we don't need to": to me it feels like propagating events across nodes is definitely valuable in some cases, but for just recording registry access it feels overkill.
Simple proposal: add subscribe
and unsubscribe
methods to Registry
. Before accessing the registry, subscribe; during, record accesses; after (and in finally
), unsubscribe. All this can be done within one SkyFunction.
This would make it harder to decouple the registry download logic from the SkyFunctions using it.
I don't quite understand this part: what "coupling" are you referring to? And anyway, pretty much only two SkyFunctions need access to the registry -- ModuleFileFunction and RepoSpecFunction; it's not like it's a widely applicable concern.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As the regression isn't new, I removed the Skyframe changes from this PR. @justinhorvitz You may still need to approve this to reset the "requesting changes" bit.
Skyframe is a big scary beast and the fact that we're just now discovering some big memory regression is testament to that; so I'd rather we avoid using event propagation if we don't need to.
I agree that Skyframe is scary, but at least it guarantees incremental correctness. I caused memory regressions in Bazel or Blaze in many other ways. Those in Blaze were all caught quickly thanks to the benchmarks - are there or could there be similar benchmarks for Bazel?
Simple proposal: add subscribe and unsubscribe methods to Registry. Before accessing the registry, subscribe; during, record accesses; after (and in finally), unsubscribe. All this can be done within one SkyFunction.
I'm not worried about how to get the information from the Registry
into the SkyFunction
using it, I'm more worried about us messing up incremental correctness of how this information then gets to the lockfile. But I agree that we can likely get this right:
- Add the collected hashes to
RepoSpecValue
(need to bring this back) andModuleFileValue
. - Traverse the Skyframe graph down from
RepositoryMappingValue.key(RepositoryMapping.MAIN)
and collect the hashes from these values.
I think that we should walk edges instead of just iterating through all nodes for this as we don't want to record hashes of unused modules in the lockfile as this could increase the chance of merge conflicts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Traverse the Skyframe graph down from
RepositoryMappingValue.key(RepositoryMapping.MAIN)
and collect the hashes from these values.
Or even just aggregate all the collected hashes in BazelDepGraphValue
? Saves us the trouble of even having to perform graph lookups at all.
59b5bcf
to
ace82a1
Compare
Actually, could you address the test failure? Looks legit. |
I pushed a fix and added a test I forgot to add earlier, PTAL. |
The CI failure is |
@bazel-io fork 7.2.0 |
@fmeum can you please add a Javadoc comment for |
@iancha1992 I pushed a new commit with the Javadoc. |
Retaining `Postable`s across the graph increases the memory usage of every dependent Skyframe node. `ModuleExtensionResolutionEvent` is replaced with a lookup of the "done" Skyframe nodes for extension evaluation in `BazelLockFileModule#afterCommand`. `RootModuleFileFixupEvent` is replaced with direct storage in `SingleExtensionEvalValue`. The validation of repos imported from an extension is moved into a separate `SkyFunction`. This simplifies `bazel mod tidy`, which no longer needs to swallow certain exceptions, and allows cache hits when only the (invalid) imports of an extension are modified even without the lockfile. The `BazelModuleResolutionEvent` is kept for now, but will become obsolete with the new lockfile format. Work towards bazelbuild#20369 Closes bazelbuild#22058. PiperOrigin-RevId: 628213907 Change-Id: I8ba19f5151a8183be5051c8d9280f93476db2272
Retaining `Postable`s across the graph increases the memory usage of every dependent Skyframe node. `ModuleExtensionResolutionEvent` is replaced with a lookup of the "done" Skyframe nodes for extension evaluation in `BazelLockFileModule#afterCommand`. `RootModuleFileFixupEvent` is replaced with direct storage in `SingleExtensionEvalValue`. The validation of repos imported from an extension is moved into a separate `SkyFunction`. This simplifies `bazel mod tidy`, which no longer needs to swallow certain exceptions, and allows cache hits when only the (invalid) imports of an extension are modified even without the lockfile. The `BazelModuleResolutionEvent` is kept for now, but will become obsolete with the new lockfile format. Work towards bazelbuild#20369 Closes bazelbuild#22058. PiperOrigin-RevId: 628213907 Change-Id: I8ba19f5151a8183be5051c8d9280f93476db2272
Retaining `Postable`s across the graph increases the memory usage of every dependent Skyframe node. `ModuleExtensionResolutionEvent` is replaced with a lookup of the "done" Skyframe nodes for extension evaluation in `BazelLockFileModule#afterCommand`. `RootModuleFileFixupEvent` is replaced with direct storage in `SingleExtensionEvalValue`. The validation of repos imported from an extension is moved into a separate `SkyFunction`. This simplifies `bazel mod tidy`, which no longer needs to swallow certain exceptions, and allows cache hits when only the (invalid) imports of an extension are modified even without the lockfile. The `BazelModuleResolutionEvent` is kept for now, but will become obsolete with the new lockfile format. Work towards #20369 Closes #22058. PiperOrigin-RevId: 628213907 Change-Id: I8ba19f5151a8183be5051c8d9280f93476db2272
Retaining `Postable`s across the graph increases the memory usage of every dependent Skyframe node. `ModuleExtensionResolutionEvent` is replaced with a lookup of the "done" Skyframe nodes for extension evaluation in `BazelLockFileModule#afterCommand`. `RootModuleFileFixupEvent` is replaced with direct storage in `SingleExtensionEvalValue`. The validation of repos imported from an extension is moved into a separate `SkyFunction`. This simplifies `bazel mod tidy`, which no longer needs to swallow certain exceptions, and allows cache hits when only the (invalid) imports of an extension are modified even without the lockfile. The `BazelModuleResolutionEvent` is kept for now, but will become obsolete with the new lockfile format. Work towards #20369 Closes #22058. PiperOrigin-RevId: 628213907 Change-Id: I8ba19f5151a8183be5051c8d9280f93476db2272 --------- Co-authored-by: Fabian Meumertzheim <[email protected]>
Retaining `Postable`s across the graph increases the memory usage of every dependent Skyframe node. `ModuleExtensionResolutionEvent` is replaced with a lookup of the "done" Skyframe nodes for extension evaluation in `BazelLockFileModule#afterCommand`. `RootModuleFileFixupEvent` is replaced with direct storage in `SingleExtensionEvalValue`. The validation of repos imported from an extension is moved into a separate `SkyFunction`. This simplifies `bazel mod tidy`, which no longer needs to swallow certain exceptions, and allows cache hits when only the (invalid) imports of an extension are modified even without the lockfile. The `BazelModuleResolutionEvent` is kept for now, but will become obsolete with the new lockfile format. Work towards bazelbuild#20369 Closes bazelbuild#22058. PiperOrigin-RevId: 628213907 Change-Id: I8ba19f5151a8183be5051c8d9280f93476db2272
Retaining
Postable
s across the graph increases the memory usage of every dependent Skyframe node.ModuleExtensionResolutionEvent
is replaced with a lookup of the "done" Skyframe nodes for extension evaluation inBazelLockFileModule#afterCommand
.RootModuleFileFixupEvent
is replaced with direct storage inSingleExtensionEvalValue
. The validation of repos imported from an extension is moved into a separateSkyFunction
. This simplifiesbazel mod tidy
, which no longer needs to swallow certain exceptions, and allows cache hits when only the (invalid) imports of an extension are modified even without the lockfile.The
BazelModuleResolutionEvent
is kept for now, but will become obsolete with the new lockfile format.Work towards #20369