Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pass toolchain info to the cabal_wrapper at the call site #1493

Merged
merged 7 commits into from
Feb 18, 2021

Conversation

facundominguez
Copy link
Member

@facundominguez facundominguez commented Feb 12, 2021

This should simplify configuration for cross-compilation as bazel doesn't try anymore to provide a haskell toolchain to build cabal_wrapper "for" the execution platform.

Depends on #1491

@aherrmann
Copy link
Member

@siggisim We're occasionally encountering build event log upload timeouts of the form:

ERROR: The Build Event Protocol upload failed: All retry attempts failed. DEADLINE_EXCEEDED: DEADLINE_EXCEEDED: deadline exceeded after 14999958197ns DEADLINE_EXCEEDED: DEADLINE_EXCEEDED: deadline exceeded after 14999958197ns

As on this PR. These end up failing the build. A rerun typically fixes those.

Have you seen these types of errors before? I'm wondering if there's something we can do in our configuration to extend the timeout or otherwise avoid these failures?

Copy link
Member

@aherrmann aherrmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! Looks great.

One comment about runfiles.

Comment on lines 74 to 76
path = r.Rlocation(toolchain_info["workspace"] + "/" + exe)
if not os.path.isfile(path) and is_windows:
path = r.Rlocation(toolchain_info["workspace"] + "/" + exe + ".exe")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

r.Rlocation is for discovering runfiles. However, the tools are now no longer runfiles of the cabal wrapper, but instead parameters and build inputs during the cabal build actions. I.e. the paths provided in the parameters should valid relative to the execroot as is. We'll need to make them absolute, since cabal wrapper later changes directory into the directory containing the cabal file.

It's possible that not all of the required hs and cc toolchain components are inputs to the cabal build action yet. You can use bazel aquery on some cabal target and inspect the inputs to the HaskellLibrary|Binary action.

@aherrmann
Copy link
Member

@facundominguez Unrelated to the changes themselves: We have dpulls configured on rules_haskell. You can use it to stack PRs. E.g. this one depends on #1491, so you can add a comment Depends on #1491 to the PR description and change the base branch to fd/json-cabal-wrapper to stack it. This will hide commits from #1491 on this PR, i.e. make review easier, and dpulls will automatically rebase and change base branch once #1491 is merged.

@facundominguez facundominguez changed the base branch from master to fd/json-cabal-wrapper February 12, 2021 16:27
@siggisim
Copy link

@siggisim We're occasionally encountering build event log upload timeouts of the form:

ERROR: The Build Event Protocol upload failed: All retry attempts failed. DEADLINE_EXCEEDED: DEADLINE_EXCEEDED: deadline exceeded after 14999958197ns DEADLINE_EXCEEDED: DEADLINE_EXCEEDED: deadline exceeded after 14999958197ns

As on this PR. These end up failing the build. A rerun typically fixes those.

Have you seen these types of errors before? I'm wondering if there's something we can do in our configuration to extend the timeout or otherwise avoid these failures?

The interesting thing here is that the invocation completes on our end in 3.41 minutes:
https://app.buildbuddy.io/invocation/b3c03544-66de-4797-ade5-a96a8075147f

But bazel seems to hang for over an hour on the github runner.

This reminds me of this bazel gRPC hang issue that would cause builds to hang indefinitely and ultimately get killed:
bazelbuild/bazel#11782

There were a lot of fixes for this that went in to bazel 4.0.0 and a couple more that went in after. I'm not 100% sure that would fix the issue, but an upgrade to 4.0.0 might be worth it if it's not too painful.

The only 15s timeout I've been able to find in bazel or our code is this one, but not sure that's what we're hitting here:
https://docs.bazel.build/versions/master/command-line-reference.html#flag--local_termination_grace_seconds

If the bazel upgrade doesn't fix the issue or is infeasible, let me know and I can dig in further.

@aherrmann
Copy link
Member

There were a lot of fixes for this that went in to bazel 4.0.0 and a couple more that went in after. I'm not 100% sure that would fix the issue, but an upgrade to 4.0.0 might be worth it if it's not too painful.

@siggisim Thank you for looking into this! That's a good point. I'll let you know if the issue still occurs on 4.0.0.

Base automatically changed from fd/json-cabal-wrapper to master February 12, 2021 18:58
@dpulls
Copy link

dpulls bot commented Feb 12, 2021

🎉 All dependencies have been resolved !

@facundominguez facundominguez force-pushed the fd/runtime-toolchain-info branch 17 times, most recently from 13779d6 to 994894f Compare February 16, 2021 17:47
@facundominguez facundominguez force-pushed the fd/runtime-toolchain-info branch 17 times, most recently from f592beb to 664f4f7 Compare February 17, 2021 19:36
@facundominguez facundominguez force-pushed the fd/runtime-toolchain-info branch from cdf4fe8 to cad9476 Compare February 17, 2021 23:59
Copy link
Member

@aherrmann aherrmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for pushing this through!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants