BEP events future is stuck when running with experimental_remote_grpc_log option #13312

Closed
shirchen opened this issue Apr 7, 2021 · 2 comments
Labels: team-Remote-Exec, type: bug, untriaged

Comments

shirchen commented Apr 7, 2021

Description of the problem / feature request:

While debugging the following timeout when uploading BEP events:

[21:11:50] ERROR: Unable to write all BEP events to file due to 'java.io.IOException: io.grpc.StatusRuntimeException: DEADLINE_EXCEEDED: deadline exceeded after 9.999906540s. [remote_addr=buildfarm-internal...]'

I tried running the build with the --experimental_remote_grpc_log option. However, the build event future appears to get stuck and the build hangs indefinitely:

"bep-local-writer" #42 daemon prio=5 os_prio=0 cpu=8.90ms elapsed=799.74s tid=0x00007f670437e000 nid=0x27a waiting on condition  [0x00007f68581c4000]
   java.lang.Thread.State: WAITING (parking)
        at jdk.internal.misc.Unsafe.park([email protected]/Native Method)
        - parking to wait for  <0x00000001274b3028> (a com.google.common.util.concurrent.AbstractTransformFuture$TransformFuture)
        at java.util.concurrent.locks.LockSupport.park([email protected]/Unknown Source)
        at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:535)
        at com.google.common.util.concurrent.FluentFuture$TrustedFuture.get(FluentFuture.java:88)
        at com.google.devtools.build.lib.buildeventstream.transports.FileTransport$SequentialWriter.run(FileTransport.java:135)

Explicitly setting --bes_timeout didn't help; the future remains stuck.
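For readers unfamiliar with this kind of hang: the thread dump shows the writer thread parked in an untimed `get()` (it reaches `LockSupport.park`, not a timed park), so it waits for as long as the future stays incomplete. The sketch below is only an illustration of that general behaviour using the JDK's `CompletableFuture`. It is not Bazel's `FileTransport` code and not the fix from any Bazel change; the class name `StuckFutureSketch` and the helper `awaitWithDeadline` are hypothetical.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical illustration; none of these names come from Bazel.
class StuckFutureSketch {

    /**
     * Waits for the future for at most timeoutMillis.
     * Returns true if it completed in time, false if the deadline passed.
     */
    static boolean awaitWithDeadline(Future<?> future, long timeoutMillis)
            throws InterruptedException, ExecutionException {
        try {
            future.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // A future that nobody ever completes, standing in for a stalled upload.
        CompletableFuture<Void> neverCompleted = new CompletableFuture<>();

        // Calling neverCompleted.get() here would park this thread indefinitely,
        // which matches the WAITING (parking) state in the dump above.
        boolean done = awaitWithDeadline(neverCompleted, 200);
        System.out.println("completed within deadline: " + done);
    }
}
```

With a bounded wait the caller regains control after the deadline and can report an error instead of hanging; whether that is the right behaviour for a BEP file writer is a separate design question.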

Feature requests: what underlying problem are you trying to solve with this feature?

BES event upload should not get stuck.

Bugs: what's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

Possibly by building with the following options:

[13:24:24]   'build' options: --remote_upload_local_results=true --show_progress_rate_limit=5 --terminal_columns=143 --disk_cache= --remote_max_connections=0 --remote_cache=grpc://buildfarm-internal.... --remote_retries=3 --remote_timeout=10s --remote_local_fallback=true --modify_execution_info=Cpp.*=+no-remote,GoLink.*=+no-remote --experimental_remote_grpc_log=/go-code/build/remote.log

What operating system are you running Bazel on?

Linux

What's the output of bazel info release?

3.7.0

Any other information, logs, or outputs that you want to share?

Attaching jstack and jvm.out
jstack.log
jvm.log

Member

coeuvre commented Apr 9, 2021

This should be fixed by #12416, which is included in 4.0.0.

coeuvre added the team-Remote-Exec, type: bug, and untriaged labels on Apr 9, 2021
Member

coeuvre commented May 12, 2021

Closing since this is fixed. Feel free to reopen if you still have this issue with 4.0.0.

coeuvre closed this as completed on May 12, 2021