Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remote: Use execRoot as input root and do NOT set working directory by default. #13339

Closed

Conversation

coeuvre
Copy link
Member

@coeuvre coeuvre commented Apr 13, 2021

When --experimental_sibling_external_layout is set, use the parent directory of execRoot as input root and set working directory to the base name of execRoot. Paths of output files are relative to working directory.

When --incompatible_remote_paths_relative_to_input_root is set, paths of output files are relative to input root.

Introduce RemotePathResolver which is used to convert local paths to remote paths and vice versa. We should prefer this class instead of using execRoot directly whenever possible in remote module. execRoot usages in RemoteCache are all replaced.

On Windows, shared action results cache for cl.exe across different workspaces causing header dependency checking to fail. This was initially fixed by #9172, but is broken after cl/356735700. The reason test didn't fail before this change is that two builds from different workspaces do not share the cache since input files are relative to the parent of exec root which contains workspace name. This change fixes that by adding workspace name as platform property for action running on Windows.

Fixes #13188.

@google-cla google-cla bot added the cla: yes label Apr 13, 2021
@coeuvre coeuvre force-pushed the fix-remote-working-directory branch from 7a8c577 to a68223c Compare April 13, 2021 07:57
@coeuvre coeuvre marked this pull request as ready for review April 13, 2021 10:29
@coeuvre coeuvre requested a review from lberki April 13, 2021 10:29
Copy link
Contributor

@lberki lberki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initial round. I'm not in very good shape today so I probably don't grok everything :(

// https://github.com/bazelbuild/bazel/issues/9172 for more details), but is broken again due
// to cl/356735700. We include workspace name here so action results from different workspaces
// on Windows won't be shared.
boolean isWindows = System.getProperty("os.name").startsWith("Windows");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the operating system Bazel runs on and that isn't necessarily the same as the remote operating system. In addition, this AFAIU needlessly decreases the cache hit rate for actions that do not invoke MSVC.

hat said, I don't know off the bat how to tell what the remote operating system is. I'm wondering if it's better to fix this by looking at the PARSE_SHOWINCLUDES feature in the C++ compile action. It should be possible to add the necessary property to the Spawn constructed in CppCompileAction.createSpawn(), which has the additional advantage that the hack stays in C++ code which requires it.

Copy link
Member Author

@coeuvre coeuvre Apr 14, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the operating system Bazel runs on and that isn't necessarily the same as the remote operating system.

This is used for remote cache only (which means local execution). For executing cl.exe remotely, RBE has the hack to remove the workspace name on the server side. And this requires other server implementation to do the same hack if they want to support compiling c++ with cl.exe.

I had a discussion with @meteorcloudy yesterday about a long term fix for this. It seems we can solve the issue at Bazel side, but requires lots of changes. I will share the details later.

In addition, this AFAIU needlessly decreases the cache hit rate for actions that do not invoke MSVC.

Note that, if we use parent of exec root as input root, the paths of input files include workspace name (input file paths are used to compute cache key) which implies the same thing: caches across different workspace won't be shared. So I won't concern too much about this.

I'm wondering if it's better to fix this by looking at the PARSE_SHOWINCLUDES feature in the C++ compile action.

Good point! I will have a look.

boolean isWindows = System.getProperty("os.name").startsWith("Windows");
if (isWindows) {
return ImmutableList
.of(Property.newBuilder().setName("workspace").setValue(execRoot.getBaseName()).build());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do I understand correctly that workspace is not interpreted by RBE and is just to differentiate the cache keys? In that case, call it something more descriptive, e.g. BAZEL_WORKSPACE_DISAMBIGUATION_FOR_MSVC_HACK.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exactly!

* {@code --incompatible_remote_output_paths_relative_to_input_root} is not set, otherwise,
* relative to input root.
*/
class SiblingExternalLayoutResolver implements RemotePathResolver {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: SiblingRepositoryLayout for consistency with the name of the flag?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -61,6 +62,13 @@ static DirectoryTree fromActionInputs(
throws IOException {
Map<PathFragment, DirectoryNode> tree = new HashMap<>();
int numFiles = buildFromActionInputs(inputs, metadataProvider, execRoot, digestUtil, tree);
// // Make sure working directory is exists
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

debugging code left in?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, removed.

@@ -130,16 +131,16 @@ public ActionResult downloadActionResult(
*/
public ActionResult upload(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is that the lifetime of a RemoteCache the a single command. Then why does one need to pass in a RemotePathResolver as opposed to making it a member of the class? (same for a number of other places below)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RemotePathResolver is a drop in replacement for the usages of execRoot inside remote module. The question then becomes why did we pass in execRoot instead of making it a member of RemoteCache. Probably due to the time when we construct RemoteCache (inside beforeCommand), execRoot is not available yet. I can change this in another PR but let's make this PR only focus on changes for output paths semantics.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...or else execRoot just ended up being there without any rhyme or reason. We'd have to take a look at the change history to tell, but it doesn't really matter, so meh.

I think the question is more important for RemotePathResolver because it's an object with potentially a lot of state, unlike Path, which is just a value object. That said, if you want to clean it up later, all the power to you.

@coeuvre
Copy link
Member Author

coeuvre commented Apr 14, 2021

Note that there is a pending change for REAPI about the semantics of output paths bazelbuild/remote-apis#191. We will update in a separate PR accordingly once it is merged.

@coeuvre coeuvre requested a review from oquenchil as a code owner April 14, 2021 07:45
@coeuvre coeuvre force-pushed the fix-remote-working-directory branch from 1c6fdb8 to 1a3609f Compare April 14, 2021 08:03
sortPlatformProperties(platformBuilder);
return platformBuilder.build();
}

@Nullable
public static Platform getPlatformProto(Spawn spawn, @Nullable RemoteOptions remoteOptions)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: getPlatformProto() only has four call sites, so this overload is probably unnecessary

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Changed to use value of DIFFERENTIATE_WORKSPACE_CACHE instead of execRoot.getBaseName().

@@ -130,16 +131,16 @@ public ActionResult downloadActionResult(
*/
public ActionResult upload(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...or else execRoot just ended up being there without any rhyme or reason. We'd have to take a look at the change history to tell, but it doesn't really matter, so meh.

I think the question is more important for RemotePathResolver because it's an object with potentially a lot of state, unlike Path, which is just a value object. That said, if you want to clean it up later, all the power to you.

@@ -764,10 +758,11 @@ private ActionResultMetadata parseActionResultMetadata(

ImmutableMap.Builder<Path, FileMetadata> files = ImmutableMap.builder();
for (OutputFile outputFile : actionResult.getOutputFilesList()) {
Path path = remotePathResolver.resolveLocalPath(outputFile.getPath());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

opt: call this localPath maybe? (also above). I don't think it's much more descriptive, but given that RemotePathResolver seems to make the local/remote path distinction important, it looks like an improvement.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

static List<Property> getExtraPlatformProperties(Spawn spawn, Path execRoot) {
if (Spawns.shouldDifferentiateWorkspaceCache(spawn)) {
return ImmutableList
.of(Property.newBuilder().setName("bazel-workspace").setValue(execRoot.getBaseName())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: make "bazel-workspace" also allude to the fact that it's only for splitting cache keys

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to "bazel-differentiate-workspace-cache".

*
* @param execPath a path fragment relative to {@code execRoot}.
*/
String resolveOutputPath(PathFragment execPath);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This overload seems to be unused

*
* @param outputPath the return value of {@link #resolveOutputPath(PathFragment)}.
*/
Path resolveLocalPath(String outputPath);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

opt: maybe it's better to call these methods localPathToOutputPath or outputPathToLocalPath because the verb "resolve" is quite vague.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

/**
* @return the {@code workingDirectory} for a remote action.
*/
@Nullable
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Document when this is null

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: in that case, consider making the "working directory == input root" case the empty string. Null kinda implies "not available / there isn't any", which is not what you mean here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

// dependency checking to fail. This was initially fixed by a hack (see
// https://github.com/bazelbuild/bazel/issues/9172 for more details), but is broken again due
// to cl/356735700. We require execution service to ignore caches from other workspace.
executionInfo.put(ExecutionRequirements.DIFFERENTIATE_WORKSPACE_CACHE, "");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This implementation is much nicer than the previous one, kudos :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! :)

getDotdFile().getExecPathString());
}

if (shouldParseShowIncludes()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: why not simply booleaan shouldParseShowIncludes = featureConfiguration.isEnabled(CppRuleClasses.PARSE_SHOWINCLUDES?

Every call site of shouldParseShowIncludes() is called from this single method, so it's a slightly better because then you have one less method and a teeny tiny bit less computation.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since featureConfiguration is null when running some unit tests. By using shouldParseShowIncludes, we open a chance for tests to mock this method.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...and by doing so, widen the interface of CppCompileAction. If that's the motivation to extract this into a separate method, let's not do that and provide a FeatureConfiguration to the tests, if need be.

@@ -145,7 +146,7 @@

// The action key of the Spawn returned by newSimpleSpawn().
private final String simpleActionId =
"b9a727771337fd8ce54821f4805e2d451c4739e92fec6f8ecdb18ff9d1983b27";
"eb45b20cc979d504f96b9efc9a08c48103c6f017afa09c0df5c70a5f92a98ea8";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mind adding test cases for the new functionality you added?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added.

Copy link
Contributor

@lberki lberki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I assume this is not ready for a second round of review, since the CI is red?)

@coeuvre
Copy link
Member Author

coeuvre commented Apr 16, 2021

(I was interrupted when adding more tests) Now it's ready to be reviewed!

coeuvre added 9 commits April 19, 2021 14:21
…y default.

When --experimental_sibling_external_layout is set, use the parent directory of execRoot as input root and set working directory to the base name of execRoot. Paths of output files are relative to working directory. When --incompatible_remote_paths_relative_to_input_root is set, paths of output files are relative to input root.

Introduce RemotePathResolver which is used to convert local paths to remote paths and vice versa. We should prefer this class instead of using execRoot directly whenever possible. execRoot usages in RemoteCache are all replaced.

Fixes bazelbuild#13188.
…y default.

When --experimental_sibling_external_layout is set, use the parent directory of execRoot as input root and set working directory to the base name of execRoot. Paths of output files are relative to working directory. When --incompatible_remote_paths_relative_to_input_root is set, paths of output files are relative to input root.

Introduce RemotePathResolver which is used to convert local paths to remote paths and vice versa. We should prefer this class instead of using execRoot directly whenever possible. execRoot usages in RemoteCache are all replaced.

On Windows, shared action results cache for cl.exe across different workspaces causing header dependency checking to fail. This was initially fixed by bazelbuild#9172, but is broken after cl/356735700. The reason test didn't fail is two builds from different workspaces do not share the cache since input files are relative to the parent of exec root which contains workspace name. This change fixes that by adding workspace name as platform property for action running on Windows.

Fixes bazelbuild#13188.
@coeuvre coeuvre force-pushed the fix-remote-working-directory branch from f75edcb to c315e01 Compare April 19, 2021 06:22
@bazel-io bazel-io closed this in b863dbf Apr 19, 2021
coeuvre added a commit to coeuvre/bazel that referenced this pull request Jul 15, 2021
…y default.

When --experimental_sibling_external_layout is set, use the parent directory of execRoot as input root and set working directory to the base name of execRoot. Paths of output files are relative to working directory.

When --incompatible_remote_paths_relative_to_input_root is set, paths of output files are relative to input root.

Introduce RemotePathResolver which is used to convert local paths to remote paths and vice versa. We should prefer this class instead of using execRoot directly whenever possible in remote module. execRoot usages in RemoteCache are all replaced.

On Windows, shared action results cache for cl.exe across different workspaces causing header dependency checking to fail. This was initially fixed by bazelbuild#9172, but is broken after bazelbuild@24c980b. The reason test didn't fail before this change is that two builds from different workspaces do not share the cache since input files are relative to the parent of exec root which contains workspace name. This change fixes that by adding workspace name as platform property for action running on Windows.

Fixes bazelbuild#13188.

Closes bazelbuild#13339.

PiperOrigin-RevId: 369168230
coeuvre added a commit to coeuvre/bazel that referenced this pull request Jul 15, 2021
…y default.

When --experimental_sibling_external_layout is set, use the parent directory of execRoot as input root and set working directory to the base name of execRoot. Paths of output files are relative to working directory.

When --incompatible_remote_paths_relative_to_input_root is set, paths of output files are relative to input root.

Introduce RemotePathResolver which is used to convert local paths to remote paths and vice versa. We should prefer this class instead of using execRoot directly whenever possible in remote module. execRoot usages in RemoteCache are all replaced.

On Windows, shared action results cache for cl.exe across different workspaces causing header dependency checking to fail. This was initially fixed by bazelbuild#9172, but is broken after bazelbuild@24c980b. The reason test didn't fail before this change is that two builds from different workspaces do not share the cache since input files are relative to the parent of exec root which contains workspace name. This change fixes that by adding workspace name as platform property for action running on Windows.

Fixes bazelbuild#13188.

Closes bazelbuild#13339.

PiperOrigin-RevId: 369168230
coeuvre added a commit to coeuvre/bazel that referenced this pull request Jul 16, 2021
…y default.

When --experimental_sibling_external_layout is set, use the parent directory of execRoot as input root and set working directory to the base name of execRoot. Paths of output files are relative to working directory.

When --incompatible_remote_paths_relative_to_input_root is set, paths of output files are relative to input root.

Introduce RemotePathResolver which is used to convert local paths to remote paths and vice versa. We should prefer this class instead of using execRoot directly whenever possible in remote module. execRoot usages in RemoteCache are all replaced.

On Windows, shared action results cache for cl.exe across different workspaces causing header dependency checking to fail. This was initially fixed by bazelbuild#9172, but is broken after bazelbuild@24c980b. The reason test didn't fail before this change is that two builds from different workspaces do not share the cache since input files are relative to the parent of exec root which contains workspace name. This change fixes that by adding workspace name as platform property for action running on Windows.

Fixes bazelbuild#13188.

Closes bazelbuild#13339.

PiperOrigin-RevId: 369168230
mou-hao added a commit to mou-hao/bb-remote-execution that referenced this pull request Oct 25, 2021
Bazel sets the "bazel-differentiate-workspace-cache" property when
building CPP on Windows. This is a hack to have different hash keys for
different workspaces (because header files sharing on Windows causing
some problems). It is not intended to mean anything to the remote
executor.

More about the property: bazelbuild/bazel#13339
@EdSchouten
Copy link
Contributor

It’s a shame that this feature was added through a platform property, because it requires remote execution services to jump through extra hoops.

Are you aware of the existence of Action’s ‘salt’ field? That would have allowed you to influence Action digests in a way that’s oblivious to remote execution services.

@EdSchouten EdSchouten reopened this Oct 25, 2021
@coeuvre
Copy link
Member Author

coeuvre commented Oct 25, 2021

Thanks for the reference. I can fix this.

@bazel-io bazel-io closed this in 8c2c78c Nov 17, 2021
coeuvre added a commit to coeuvre/bazel that referenced this pull request Nov 24, 2021
meteorcloudy pushed a commit that referenced this pull request Nov 24, 2021
@alexeagle
Copy link
Contributor

@coeuvre it seems like we're missing the incompatible-change tracking bug to flip the default for incompatible_remote_output_paths_relative_to_input_root to true. @meteorcloudy how do those get created?

@meteorcloudy
Copy link
Member

For incompatible changes, we should still follow https://bazel.build/contribute/breaking-changes

@coeuvre
Copy link
Member Author

coeuvre commented Mar 28, 2022

Created #15131.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
5 participants