Windows: failed to delete output files before executing action (.dll) #10363

Open
ozio85 opened this issue Dec 4, 2019 · 10 comments
Labels
area-Windows Windows-specific issues and feature requests P2 We'll consider working on this in future. (Assignee optional) team-OSS Issues for the Bazel OSS team: installation, release process, Bazel packaging, website type: bug

Comments

@ozio85

ozio85 commented Dec 4, 2019

Description of the problem / feature request:

When compiling on Windows, Bazel sometimes cannot delete .dll files.
The typical error message is: "xx.dll: failed to delete output files before executing action"

I am using remote_cache with remote_download_minimal, and it seems that the items that are actually left in the local cache sometimes cannot be deleted.

You can delete the .dlls manually just fine, so a workaround for now is to search the local cache for .dlls and delete them before server-side compilation.

I have previously seen problems when Defender or anti-virus software temporarily locks files. However, this failure is permanent: Bazel cannot delete the files, but the user can.

Feature requests: what underlying problem are you trying to solve with this feature?

  1. Get a better error message when this error occurs (permission, read error etc).
  2. Create a more stable way to delete files on Windows (retry? unlocking?)
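
To make the two requests above concrete, here is a minimal Python sketch of a retrying delete that also prints the underlying OS error on each failed attempt. This is not Bazel's deletion logic; the function name `delete_with_retry` and its parameters are hypothetical and only illustrate the idea of retrying with a growing delay.

```python
import os
import stat
import time


def delete_with_retry(path, attempts=5, initial_delay=0.1):
    """Hypothetical helper: delete `path`, retrying on PermissionError.

    Returns True if the file is gone afterwards, False if every attempt failed.
    """
    delay = initial_delay
    for attempt in range(attempts):
        try:
            os.remove(path)
            return True
        except FileNotFoundError:
            # Someone else already removed it; the goal is reached.
            return True
        except PermissionError as exc:
            # Report the concrete OS error instead of a generic failure message.
            print(f"attempt {attempt + 1}/{attempts} failed for {path}: {exc}")
            try:
                # Clear a possible read-only attribute before the next attempt.
                os.chmod(path, stat.S_IREAD | stat.S_IWRITE)
            except OSError:
                pass
            if attempt + 1 < attempts:
                time.sleep(delay)
                delay *= 2  # exponential backoff between attempts
    return False
```

A retry like this can only help with short-lived locks (for example a scanner that releases the file after a moment); it would not help if the lock is held for a long time.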

Bugs: what's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

It is sporadic, but probably connected to remote_cache

What operating system are you running Bazel on?

Windows 10

What's the output of bazel info release?

1.2.0

Have you found anything relevant by searching the web?

Nothing

Any other information, logs, or outputs that you want to share?

No.

@dslomov dslomov added area-Windows Windows-specific issues and feature requests untriaged labels Dec 4, 2019
@laszlocsomor
Contributor

We saw such bugs before; we refined the file deletion logic to deal with ever more situations, alas it's still incomplete apparently.

Do you know anything else about those files? Are they marked read-only?

@buchgr -- do you think those files might be held open?

@ozio85
Author

ozio85 commented Feb 7, 2020

Well, on the server this is quite frequent. It affects all binaries (.exe and .dll), and my best guess is that anti-virus software or Windows Defender is scanning the file and therefore locking it.

When I log in to the server, there is no problem removing the binary, and nothing is locking it.

@philwo philwo added the team-OSS Issues for the Bazel OSS team: installation, release process, Bazel packaging, website label Jun 15, 2020
@ozio85
Author

ozio85 commented Oct 25, 2020

I have some more info: most likely it is Windows Defender that locks the files, and the lock can last from a few minutes up to 30+ minutes (so the locking is not permanent, it is only very long).

Only specific files are affected, so far: .dll, .exe, .bin, .a and maybe .exp

Most of the time the locking period is short, and then it only affects repeated jobs on the server side.
However, when the locking period is long, local users are affected as well.

"Unlocker" can unlock the files in all cases, so the current solution is to walk through the output tree, and delete any locked binaries before the build (which takes quite some time on Windows.)

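The pre-build cleanup described above could look roughly like the following Python sketch. It is an illustration only: `purge_binaries` and `output_root` are hypothetical names, the extension list is taken from the comment above, and a plain `os.remove` does not unlock anything, so files that are still locked simply end up in the `failed` list.

```python
import os

# Extensions reported as affected in this thread.
LOCK_PRONE_EXTENSIONS = (".dll", ".exe", ".bin", ".a", ".exp")


def purge_binaries(output_root, extensions=LOCK_PRONE_EXTENSIONS):
    """Hypothetical helper: delete matching files under `output_root`.

    Returns (deleted, failed), where `failed` holds (path, error text) pairs
    so the caller can see which files were still locked and why.
    """
    suffixes = tuple(ext.lower() for ext in extensions)
    deleted, failed = [], []
    for dirpath, _dirnames, filenames in os.walk(output_root):
        for name in filenames:
            if not name.lower().endswith(suffixes):
                continue
            path = os.path.join(dirpath, name)
            try:
                os.remove(path)
                deleted.append(path)
            except OSError as exc:
                failed.append((path, str(exc)))
    return deleted, failed
```

The output tree to pass in is an assumption on my side; the directory printed by `bazel info output_base` would be one candidate. Walking a large tree is slow on Windows, as noted above.
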
@philwo philwo added P2 We'll consider working on this in future. (Assignee optional) type: bug and removed untriaged labels Nov 26, 2020
@hfeky

hfeky commented Jun 9, 2021

Is there any update on this issue? I'm still facing the same issue on Bazel 4.1.0 on Windows 10.

@Zemnmez

Zemnmez commented Aug 1, 2021

Having this issue sometimes. It'd be great if it printed out the path for the file so I could use some other way to delete it.

Edit: I found that bazel clean will list the file, but I still have trouble deleting it normally, even with sudo. Running sudo rm -rf /home/$USER/.cache/bazel seems to delete the files on WSL, but it also nukes Bazel's cache completely, requiring it to rebuild and reacquire everything.

@bdleitner

I'm seeing this a lot when building android targets with bazel mobile-install. It's incredibly irritating because it doesn't tell me which files to remove and in order to continue I have to bazel clean, which then forces me to rebuild everything (and re-sync IntelliJ) even though I suspect only some outputs related to the android_binary are the problem. It would be nice if there was a way to bazel clean a single target.

@fmeum
Collaborator

fmeum commented Jan 24, 2022

I just encountered this bug on a Windows machine in GitHub Actions while building a jar file that does not contain any native code.

@fmeum
Collaborator

fmeum commented Apr 21, 2023

I debugged another instance in which this happened and traced it back to an attempted rebuild of the rules_kotlin worker while it was still running due to a previous build request.

@larsrc-google Is the case of using non-prebuilt workers fully supported? It looks like all worker processes would need to be shut down before the worker dependencies can be rebuilt - at least on Windows, where deleting a file that is held open by a process isn't possible.
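
For readers who have not run into the Windows behavior mentioned above, this small Python sketch shows it in isolation. It has nothing to do with Bazel's worker code; `can_delete_while_open` is a hypothetical helper that keeps a file handle open (as a running worker would for its binary) and then tries to delete the file.

```python
import os
import tempfile


def can_delete_while_open(directory):
    """Hypothetical helper: try to delete a file while a handle to it is open.

    Returns True if the delete succeeded, False if the OS refused it.
    """
    path = os.path.join(directory, "held_open.bin")
    with open(path, "wb") as handle:
        handle.write(b"stand-in for a worker binary")
        handle.flush()
        try:
            os.remove(path)
            return True
        except PermissionError:
            # Expected on Windows: the open handle blocks the delete.
            return False


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print("deleted while open:", can_delete_while_open(tmp))
```

On Linux and macOS the delete succeeds and the file disappears once the handle is closed; on Windows the delete is refused while the handle is open, which matches the failure mode described here.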

@larsrc-google
Contributor

That does make sense. That is a Windows-specific problem, and I don't have a good immediate solution.

@fmeum
Collaborator

fmeum commented Apr 21, 2023

@larsrc-google A workaround that shouldn't be too costly could be to restart all workers on Windows when the analysis cache is discarded. What do you think?

copybara-service bot pushed a commit that referenced this issue Apr 23, 2023
`SkyframeActionExecutor#toActionExecutionException` claimed to combine the user-provided message and the exception's message when reporting an error, but did not.

This is fixed so that errors can be diagnosed directly from the build logs, without having to look into `java.log`.

Work towards #10363

Closes #18169.

PiperOrigin-RevId: 526195991
Change-Id: I978a6d739c37384121acccccf95e8dcb80ac5d25
meteorcloudy pushed a commit that referenced this issue Apr 24, 2023
Co-authored-by: Fabian Meumertzheim <[email protected]>
fweikert pushed a commit to fweikert/bazel that referenced this issue May 25, 2023
jasonschroeder-sfdc added a commit to jasonschroeder-sfdc/bazel-buildfarm that referenced this issue Oct 26, 2023
Trying to get more info on the Lombok stamping issue on Windows CI.
See also bazelbuild/bazel#10363 and
bazelbuild/bazel#18185
luxe pushed a commit to buildfarm/buildfarm that referenced this issue Oct 29, 2023
amishra-u pushed a commit to amishra-u/bazel-buildfarm that referenced this issue Oct 30, 2023
Trying to get more info on the Lombok stamping issue on Windows CI.
See also bazelbuild/bazel#10363 and
bazelbuild/bazel#18185

Rename instance types (buildfarm#1514)

feat: Implement CAS lease extension

Run formatter

Remove ExecutorPool, Instead make fire and forget call to worker:fmb
amishra-u pushed a commit to amishra-u/bazel-buildfarm that referenced this issue Nov 8, 2023
Don't get transitive grpc dependencies, use the ones from our `maven_install(...)`

chore(deps): bump protobuf runtime to 3.19.1

chore(deps) add transitive dependencies

feat: add Proto reflection service to shard worker

To aid connection troubleshooting

Bug: Fix Blocked thread in WriteStreamObserver Caused by CASFile Write (buildfarm#1486)

* Add unit test
* Signal Write on complete

Pin the Java toolchain to `remotejdk_17` (buildfarm#1509)

Closes buildfarm#1508

Cleanups:
- remove the unused `ubuntu-bionic` base image
- replace `ubuntu-jammy:jammy-java11-gcc` with `ubuntu-mantic:mantic-java17-gcc`
- replace `amazoncorretto:19` with `ubuntu-mantic:mantic-java17-gcc`
- swap inverted log file names in a file

docs: add markdown language specifiers for code blocks

Support OutputPaths in OutputDirectory

Specifying any number of OutputPaths will ignore OutputFiles (consistent
with uploads). Where an OutputPath specifies an output directory, the
action must be able to create the directory itself.

Permit Absolute Symlink Targets with configuration

Partial specification of the absolute symlink response per REAPI.
Remaining work will be in output identification.

chore: update bazel to 6.4.0 (buildfarm#1513)

Trying to get more info on the Lombok stamping issue on Windows CI.
See also bazelbuild/bazel#10363 and
bazelbuild/bazel#18185

Rename instance types (buildfarm#1514)

Create SymlinkNode outputs during upload (buildfarm#1515)

Default disabled, available with createSymlinkOutputs option in Worker
config.

feat: Implement CAS lease extension (buildfarm#1455)

Problem

    Enabling the findMissingBlobsViaBackplane flag in BuildfarmServer eliminates the need for the BuildfarmWorker's fmb API call. This BuildfarmWorker:fmb call was also responsible for tracking CAS entry access. As a result, our CAS cache eviction strategy shifted from LRU to FIFO.
    When the findMissingBlobsViaBackplane flag is enabled, the buildfarm relies on the backplane as the definitive source for CAS availability. Since we don't update CAS expiry on each access, the backplane will independently expire CAS entries based on the specified cas_expire duration, even if they are actively being read.

Solution

Updated bfServer:fmb call to perform non-blocking fmb calls to workers, allowing these workers to record access for the relevant CAS entries.

Extended expiry duration for available CAS entries in the backplane on each fmb call.

With these changes, we can utilize Bazel's experimental_remote_cache_lease_extension and experimental_remote_cache_ttl flags for incremental builds.

Closes buildfarm#1428

Bump org.json:json from 20230227 to 20231013 in /admin/main (buildfarm#1516)

Bumps [org.json:json](https://github.com/douglascrockford/JSON-java) from 20230227 to 20231013.
- [Release notes](https://github.com/douglascrockford/JSON-java/releases)
- [Changelog](https://github.com/stleary/JSON-java/blob/master/docs/RELEASES.md)
- [Commits](https://github.com/douglascrockford/JSON-java/commits)

---
updated-dependencies:
- dependency-name: org.json:json
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

Re-add missing graceful shutdown functionality (buildfarm#1520)

Technically correct to unwrap EE on lock failure

Bump rules_oss_audit and patch for py3.11

Prevent healthStatusManager NPE on start failure

Consistent check for publicName presence

Read through external with query THROUGH=true

Specifying a correlated invocation id with a uri containing a
THROUGH=true query param will cause the CFC to read a blob through an
external input stream, populating locally along the way. This permits
client-based replication of blobs, and can enable N+1 replication and
traffic balancing for reads.

Add --port option to worker

Option to run the worker with a cmdline specification for its gRPC
server port.

Restore worker --root cmdline specification

Root cmdline specification has been broken since the config change of
v2.

Make bf-executor small blob names consistent

Remove the size identification for small blobs when uploading with
bf-executor.

feat: Hot CAS Entries - Update read counts in Redis
amishra-u added a commit to amishra-u/bazel-buildfarm that referenced this issue Nov 23, 2023
author Anshuman Mishra <[email protected]> 1696277984 -0700
committer Anshuman Mishra <[email protected]> 1700707781 -0800

parent 90439ca
author Anshuman Mishra <[email protected]> 1696277984 -0700
committer Anshuman Mishra <[email protected]> 1700707757 -0800

parent 90439ca
author Anshuman Mishra <[email protected]> 1696277984 -0700
committer Anshuman Mishra <[email protected]> 1700707511 -0800

feat: Hot CAS Entries - Implement CAS access metrics recorder

Log on write errors

Use integer ids for Sqlite bidirectional index

The cost in size for a single table bidirectional index is vast compared
to the use of 3nf integer keys. Experimental estimates offer a decrease
in file size of 90%.

Update graceful shutdown functionality to better handle worker terminations (buildfarm#1462)

Manipulate worker set directly in RSB

Avoid dependency on subscriber to update state changes when removing
workers. This prevents an NPE which will occur invariably when workers
are allowably configured with subscribeToBackplane: false.

Remove publishTtlMetric option

The individual metric controls for ttl are not necessary either for
performance or feature support. Use Files' attributes acquisition
mechanism for modified time.

Config-compatible behavior for publishTtlMetric

Correct logging advisements for current Java

Java logging definitions must now match java.util.logging.config.file,
update these specifications in our README.md

Rename GracefulShutdownTest

Remove WebController

Interrupt+Join operationQueuer/dispatchMonitor

Use interrupt to halt the OperationQueuer.
Join on both operationQueuer and dispatchMonitor before instance stop
return.

Present operationNames by stage

Include Match and Report Result stages in output
Record the active operationName occupying slots in each of the stages
and present them with WorkerProfile
Avoid several unnecessary casts with interfaces for operation slot
stages.

Remove subscribeToBackplane, adjust failsafe op

A shard server is impractical without operation subscription, partition
subscription confirmation between servers and workers.
The failsafe execution is configuration that is likely not desired on
workers. This change removes the failsafe behavior from workers via
backplane config, and relegates the setting of failsafe boolean to
server config. If the option is restored for workers, it can be added to
worker configs so that configs may continue to be shared between workers
and servers and retain independent addressability.

Removing AWS/GCP Metrics and Admin controls

Internally driven metrics and scaling controls have low, if any, usage
rates. Prometheus has largely succeeded independent publication of
metrics, and externally driven scaling is the norm. These modules have
been incomplete between cloud providers, and for the functional side of
AWS, bind us to springboot. Removing them for the sake of reduced
dependencies and complexity.

Remove unused setOnCancelHandler

Remove this unused OperationQueue feature which provides no invocations
on any use.

Update BWoB docs for ensureOutputsPresent

Improve unit test

Disable Bzlmod explicitly in .bazelrc

Log write errors with worker address

Revert "Use integer ids for Sqlite bidirectional index"

This reverts commit f651cdb.

Common String.format for PipelineStage

Cleanup matched logic in SWC listener

Continue the loop while we have *not* matched successfully and avoid a
confusing inversion in getMatched()

Refactor SWC matcher and clarify Nullable

Distinguish the valid/unique/propagating methods of entry listening.

Interrupt matchStage to induce prepare shutdown

The only signal to a waiting match that will halt its current listen
loop for a valid unique operation is an interrupt.

Specify example config with grpc target

Distinguish target param with GRPC type storage from FILESYSTEM
definition

Remove SpringBoot usage

Reinstate prior usage of LoggingMain for safe shutdown, with added
release mechanism for interrupted processes. All invoked shutdowns are
graceful, with vastly improved shutdown speed for empty workers waiting
for pipeline stages.

Enable graceful shutdown for server (buildfarm#1490)

refactor: code cleanup

Tiny code cleanup

Log paths created on putDirectory

Will include operation root and inform directory cache effectiveness.

Permit regex realInputDirectories

Selecting realInputDirectories by regex permits flexible patterns that
can yield drastic improvements in directory reuse for specialized
deployments. runfiles in particular are hazardous expansions of
nearly-execroot in the case of bazel.

Care must be taken to match directories exclusively.
The entire input tree is traversed for matches against expanded paths
under the root, to allow for nested selection.
Each match thus costs the number of input directories.
Counterintuitively, OutputFiles are augmented to avoid the recursive
check for OutputDirectories which only applies to actual reported
results, resulting in a path match when creating the exec root.
Regex style is java.util.Pattern, and must match the full input
directory.

Log execPath rather than the cache dir path

This will include the path to the missed directory and the operation
which required it.

Shore up OutputDirectory for silence on duplicates

Prevent adding duplicate realInputDirectories matches

Trigger realInputDirectories to have empty files

Ensure that the last leg of the execution presents a directory, rather
than the parent, per OutputDirectory's stamping.

Switch to positive check for linkInputDirectories

docs(configuration): document --prometheus_port CLI argument

docs(configuration): readability and typos

style(configuration.md): table formatting

feat: support --redis_uri command line option

Support a `--redis_uri` command line option for start-up.

docs(configuration): document the --redis_uri command line options

also fixed some spelling typos.

Example should use `container_image` instead of `java_image`

chore: bump rules_jvm_external

Bumping 4.2 -> 5.3

chore: bump rules_cc

Bump from 0.0.6 -> 0.0.9

Implement local resources for workers (buildfarm#1282)

Suppress unused warning

Bump bazel version, otherwise some tests fail with System::setSecurityManager

Revert bazel upgrade

New line at end of file

feat: Hot CAS Entries - Update read counts in Redis

feat: Hot CAS Entries - Final Integration

build: override grpc dependencies with our dependencies

Don't get transitive grpc dependencies, use the ones from our `maven_install(...)`

Configured output size operation failure

Permit installations to control the failure process for operations which
produce outputs larger than the maxEntrySizeBytes. A default value
of false retains the existing behavior which appears transient and
blacklists the executed action key. When enabled, the action will fail
under an invalid violation that indicates user error.

Restore abbrev port as -p

Update zstd-jni for latest version

There have been a few releases of it by now and this pulls the latest. For
buildfarm, notable changes included performance enhancements during
decompression.

See:
https://github.com/facebook/zstd/releases/tag/v1.5.5

Attempt to resolve windows stamping

Bug: Fix workerSet update logic for RemoteCasWriter

Detail storage requirements

Update for further docs related to storage+type functionality
Remove outdated Operation Queue worker definitions

Fix worker execution env title

Add storage example descriptions

Check for context cancelled before responding to error (buildfarm#1526)

When a write fails because the write was already cancelled before due to something like deadline exceeded, we get an unknown error. The exception comes from here and when it gets to errorResponse(), it only checks if status code is cancelled. In this case the status code is unknown, so we need to check if context is cancelled to prevent responseObserver from being invoked

The code change adds checking if context is cancelled and a unit test testing when the exception has context cancelled.

chore(deps): bump com.google.errorprone:error-prone

Release notes: https://github.com/google/error-prone/releases/tag/v2.22.0

Write logs and cleanup

Run formatter

Fix main merge

remove cleanup

Minor updates

Worker name execution properties matching

 updates

 updates

 updates

 updates

 updates

Update ShardWorkerContext.java

Update ShardWorkerContext.java

Release resources when not keeping an operation (buildfarm#1535)

Update queues.md

Refer to new camelized DMS fields.
Expand predefined dynamic execution property name matches.

Implement custom label header support for Grpc metrics interceptor (buildfarm#1530)

Add an option to provide a list of custom label headers to add to metrics.

Specify direct guava dependency usage (buildfarm#1538)

Testing with bazel HEAD using jdk21 compilation has revealed new
direct dependencies on guava.

Update lombok dependency for jdk21 (buildfarm#1540)

Annotations under lombok were fixed for jdk21 in 1.18.28, update to
current.

Reorganize DequeueMatchEvaluator (buildfarm#1537)

Remove acceptEverything DequeueMatchSetting
Place worker name in workerProvisions
Only enable allowUnmatched effects on key mismatch
Only acquire resources after asserting compatibility
Update documentation to match changes

Upgrade com_google_protobuf for jvm compatibility (buildfarm#1539)

Correct deprecated AccessController usage warning
Requires a newer bazel than 6.4.0 for macos to choose unix toolchain with C++ std=c++14 specification for protobuf->absl dependency.

Create buildfarm-worker-base-build-and-deploy.yml (buildfarm#1534)

Create a github workflow to build base buildfarm worker image.

Add base image generation scripts (buildfarm#1532)

Fix buildfarm-worker-base-build-and-deploy.yml (buildfarm#1541)

Add public buildfarm image generation actions (buildfarm#1542)

Update base image building action (buildfarm#1544)

Add release image generation action (buildfarm#1545)

Limit workflow to canonical repository (buildfarm#1547)

Check for "cores" exec property as min-cores match (buildfarm#1548)

The execution platform property "cores" is detailed in documentation as
specifying "min-cores" and "max-cores". Match this definition and
prevent "cores" from being evaluated as a strict match with the worker
provision properties (with likely rejection).

Consider output_* as relative to WD (buildfarm#1550)

Per the REAPI spec:

`The paths are relative to the working directory of the action
execution.`

Prefix the WorkingDirectory to paths used as OutputDirectory parameters,
and verify that these are present in the layout of the directory for
use.

Implement Persistent Workers as an execution path (buildfarm#1260)

Followup to buildfarm#1195

Add a new execution pathway in worker/Executor.java to use persistent workers via PersistentExecutor, like DockerExecutor.

Mostly unchanged from the form we used to experiment back at Twitter, but now with tests.

Co-authored-by: Shane Delmore [email protected]

Locate Output Paths relative to WorkingDirectory (buildfarm#1553)

* Locate Output Paths relative to WorkingDirectory

Required as a corollary to OutputDirectory changes to consider outputs
as relative to working directory.

* Windows builds emit relativize paths with native separators

Remove incorrect external resolve of WD on upload (buildfarm#1554)

Previous patch included a change in actionRoot parameter, expecting it
to prefer the working directory rooted path to discover outputs. Might
want to reapply this later, but for now leave the resolution in
uploadOutputs.

empty
9 participants