-
Notifications
You must be signed in to change notification settings - Fork 4.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement extra resources for test actions #13996
Conversation
Picking up #11963. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally code looks good!
But I mentioned some changes about constructors and namings.
/** How many extra resources an action requires for execution. */ | ||
public static final ParseableRequirement RESOURCES = | ||
ParseableRequirement.create( | ||
"resources:<str>:<int>", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why we are using int
instead of float
? On ResourceSet
we are using ImmutableMap<String, Float> extraResourceUsage
for this purposes. Also resources could be float, e.g. cpu
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is just to match the cpu:<int>
code above. Looks like there is some inconsistency around what can be a float and what can be an int:
- Execution requirements:
src/main/java/com/google/devtools/build/lib/actions/ExecutionRequirements.java
parsesint
. - Test tags:
src/main/java/com/google/devtools/build/lib/analysis/test/TestTargetProperties.java
parsesFloat
. - Command line flags:
src/main/java/com/google/devtools/build/lib/util/ResourceConverter.java
parsesint
or aFloat
in the expression multiplier:
/** Description of the accepted inputs to the converter. */
public static final String FLAG_SYNTAX =
"an integer, or a keyword (\"auto\", \"HOST_CPUS\", \"HOST_RAM\"), optionally followed by "
+ "an operation ([-|*]<float>) eg. \"auto\", \"HOST_CPUS*.5\"";
In its current form, this change is matching the existing code: integers in execution requirements, floats in test tags, integer on the command line, and floats in the resource manager code. But I'm open to suggestions if we want to use slightly different conventions than what is used for CPU resources?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
floats feel like they'd be more flexible, I can make the change here in #16785.
} | ||
|
||
if (value < 1) { | ||
return "can't be zero or negative"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why some of values could not be zero?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is also just to match the cpu:<int>
parsing logic above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unlike with CPUs (that are required to orchestrate tests) it's likely GPU or embedded target based tests will be in test suites with unit tests that don't require newly defined resources. This should be changed to accept values of 0.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I agree that we could accept zero values
src/main/java/com/google/devtools/build/lib/actions/ExecutionRequirements.java
Show resolved
Hide resolved
@@ -43,12 +45,24 @@ | |||
/** The number of CPUs, or fractions thereof. */ | |||
private final double cpuUsage; | |||
|
|||
/** Map of extra resources mapping name of the resource to a value. */ | |||
private final ImmutableMap<String, Float> extraResourceUsage; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please mention what type of resources it could be?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
private static final ResourceSet MEDIUM_RESOURCES = ResourceSet.create(100, 1, 1); | ||
private static final ResourceSet LARGE_RESOURCES = ResourceSet.create(300, 1, 1); | ||
private static final ResourceSet ENORMOUS_RESOURCES = ResourceSet.create(800, 1, 1); | ||
private static final ResourceSet SMALL_RESOURCES = ResourceSet.create(20, 1, null, 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could ignore null
field here. We have the constructor with only 3 fields
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
@@ -422,6 +477,6 @@ synchronized int getWaitCount() { | |||
|
|||
@VisibleForTesting | |||
synchronized boolean isAvailable(double ram, double cpu, int localTestCount) { | |||
return areResourcesAvailable(ResourceSet.create(ram, cpu, localTestCount)); | |||
return areResourcesAvailable(ResourceSet.create(ram, cpu, null, localTestCount)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could ignore null
here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
this(memoryMb, cpuUsage, null, localTestCount); | ||
} | ||
|
||
private ResourceSet(double memoryMb, double cpuUsage, ImmutableMap<String, Float> extraResourceUsage, int localTestCount) { | ||
this.memoryMb = memoryMb; | ||
this.cpuUsage = cpuUsage; | ||
if (extraResourceUsage == null) { | ||
this.extraResourceUsage = ImmutableMap.of(); | ||
} else { | ||
this.extraResourceUsage = extraResourceUsage; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For me this checks on null looks overcomplicated and it's better to avoid nulls in code. So I suggest mark extraResourceUsage
in second constructor as @NonNull
and use ImmutableMap.of()
in the first constructor.
Also it's better to use 3 param constructor on other code where it's possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That seems to work with a couple of caveats:
- I had to add new dependencies in
- Using
@NonNull
decorator inResourceSet.create
resulted in//src/test/shell/bazel:bazel_bootstrap_distfile_tar_test
test failures witherror: No generator for: (@io.reactivex.rxjava3.annotations.NonNull :: com.google.common.collect.ImmutableMap<java.lang.String,java.lang.Float>)
(see CI job here). I removed@NonNull
from that function and only kept in the constructor - do you think that's OK?
double memoryMb, double cpuUsage, int localTestCount) { | ||
if (memoryMb == 0 && cpuUsage == 0 && localTestCount == 0) { | ||
double memoryMb, double cpuUsage, ImmutableMap<String, Float> extraResourceUsage, int localTestCount) { | ||
if (memoryMb == 0 && cpuUsage == 0 && (extraResourceUsage == null || extraResourceUsage.size() == 0) && localTestCount == 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove check on null.
help = | ||
"Set the number of extra resources available to Bazel. " | ||
+ "Takes in a string-float pair. Can be used multiple times to specify multiple " | ||
+ "types of extra resources. Bazel will limit concurrently running test actions " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You mentioned test actions here. If this param only for tests, then it's better to add test
in its name. E.g. local_test_extra_resources
. This is also applies to name of this field in classes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I guess the implementation should not be restricted to tests, although I have not verified that it works for non-test actions as well. Let me investigate this a bit more. Would be nice to match the behavior and naming of --local_cpu_resources
and --local_ram_resources
which also apply to both test and build actions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, this resources are implied not only for test actions. Could you please then remove test
from description or use text about tests as an example.
for (Map.Entry<String, Float> resource : resources.getExtraResourceUsage().entrySet()) { | ||
String key = (String)resource.getKey(); | ||
float used = (float)usedExtraResources.getOrDefault(key, 0f); | ||
float requested = resource.getValue(); | ||
float available = (float)availableResources.getExtraResourceUsage().getOrDefault(key, 0f); | ||
float epsilon = 0.0001f; // Account for possible rounding errors. | ||
if (requested != 0.0 && used != 0.0 && requested + used > available + epsilon) { | ||
extraResourcesIsAvailable = false; | ||
break; | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it's better to move this code in function to make code more readable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the review @wilwell! Replied to some of the comments, will address the rest of them in the next patch I push.
src/main/java/com/google/devtools/build/lib/actions/ExecutionRequirements.java
Show resolved
Hide resolved
/** How many extra resources an action requires for execution. */ | ||
public static final ParseableRequirement RESOURCES = | ||
ParseableRequirement.create( | ||
"resources:<str>:<int>", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is just to match the cpu:<int>
code above. Looks like there is some inconsistency around what can be a float and what can be an int:
- Execution requirements:
src/main/java/com/google/devtools/build/lib/actions/ExecutionRequirements.java
parsesint
. - Test tags:
src/main/java/com/google/devtools/build/lib/analysis/test/TestTargetProperties.java
parsesFloat
. - Command line flags:
src/main/java/com/google/devtools/build/lib/util/ResourceConverter.java
parsesint
or aFloat
in the expression multiplier:
/** Description of the accepted inputs to the converter. */
public static final String FLAG_SYNTAX =
"an integer, or a keyword (\"auto\", \"HOST_CPUS\", \"HOST_RAM\"), optionally followed by "
+ "an operation ([-|*]<float>) eg. \"auto\", \"HOST_CPUS*.5\"";
In its current form, this change is matching the existing code: integers in execution requirements, floats in test tags, integer on the command line, and floats in the resource manager code. But I'm open to suggestions if we want to use slightly different conventions than what is used for CPU resources?
} | ||
|
||
if (value < 1) { | ||
return "can't be zero or negative"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is also just to match the cpu:<int>
parsing logic above.
help = | ||
"Set the number of extra resources available to Bazel. " | ||
+ "Takes in a string-float pair. Can be used multiple times to specify multiple " | ||
+ "types of extra resources. Bazel will limit concurrently running test actions " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I guess the implementation should not be restricted to tests, although I have not verified that it works for non-test actions as well. Let me investigate this a bit more. Would be nice to match the behavior and naming of --local_cpu_resources
and --local_ram_resources
which also apply to both test and build actions?
736c7a1
to
4478fca
Compare
Add support for user-specified resource types in the resource manager. This generalizes the CPU, RAM and "test count" resource support for other resource types such as the number of GPUs, available GPU memory, the number of embedded devices connected to the host, etc. The available amount of extra resources can be specified using the new --local_extra_resources=<resourcename>=<amount> command line flag, which is analoguous to the existing --local_cpu_resources and --local_memory_resources flags. Tests can then declare the amount of extra resources they need by using a "resources:<resourcename>:<amount>" tag.
4478fca
to
195fcd2
Compare
@scele thanks for your effort in getting this ready! I would be very excited to see this land; I see a lot of broad value in having full support for declaring resources needed for tests. Do you plan to pick this back up and greening it up? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please sync the changes from the repo. These files have a lot of changes during last month.
@@ -137,6 +141,10 @@ public static ResourceManager instance() { | |||
// definition in the ResourceSet class. | |||
private double usedRam; | |||
|
|||
// Used amount of extra resources. Corresponds to the extra resource | |||
// definition in the ResourceSet class. | |||
private Map<String, Float> usedExtraResources; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please sync the changes from the repository. This file had significant changes during last month.
help = | ||
"Set the number of extra resources available to Bazel. " | ||
+ "Takes in a string-float pair. Can be used multiple times to specify multiple " | ||
+ "types of extra resources. Bazel will limit concurrently running test actions " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, this resources are implied not only for test actions. Could you please then remove test
from description or use text about tests as an example.
@@ -124,6 +124,7 @@ java_library( | |||
"//third_party:flogger", | |||
"//third_party:guava", | |||
"//third_party:jsr305", | |||
"//third_party:rxjava3", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need this library?
} | ||
|
||
if (value < 1) { | ||
return "can't be zero or negative"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I agree that we could accept zero values
src/main/java/com/google/devtools/build/lib/actions/ExecutionRequirements.java
Show resolved
Hide resolved
return ImmutableMap.copyOf(extraResources); | ||
} | ||
|
||
public ResourceSet getLocalResourceUsage(Label label, boolean usingLocalTestJobs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please reaarange the methods to make the diff between two version more clear?
Hello @scele, Can you please resolve conflicts and above comments for the requested changes. Thank you! |
We are marking the above PR as stale because it has not had any recent activity from many days. It will be closed in 7 days if there is no further activity occurs. Thank you. |
Yeah, sorry, I'm on vacation now and can't really commit to any finite date to push this to the finish line. So closing this PR for now. |
Addressed comments on bazelbuild#13996 Signed-off-by: Drew Macrae <[email protected]>
Addressed comments on bazelbuild#13996 Fixed issues in tests and built and tested with lowRISC/opentitan#16436 Signed-off-by: Drew Macrae <[email protected]>
Addressed comments on bazelbuild#13996 Fixed issues in tests and built and tested with lowRISC/opentitan#16436 Signed-off-by: Drew Macrae <[email protected]>
Addressed comments on bazelbuild#13996 Fixed issues in tests and built and tested with lowRISC/opentitan#16436 Signed-off-by: Drew Macrae <[email protected]>
Addressed comments on bazelbuild#13996 Fixed issues in tests and built and tested with lowRISC/opentitan#16436 Signed-off-by: Drew Macrae <[email protected]>
Addressed comments on bazelbuild#13996 Fixed issues in tests and built and tested with lowRISC/opentitan#16436 Signed-off-by: Drew Macrae <[email protected]>
Addressed comments on bazelbuild#13996 Fixed issues in tests and built and tested with lowRISC/opentitan#16436 Signed-off-by: Drew Macrae <[email protected]>
Addressed comments on bazelbuild#13996 Fixed issues in tests and built and tested with lowRISC/opentitan#16436 Signed-off-by: Drew Macrae <[email protected]>
This recreates a [closed PR](#13996) to implement extra resources which we're hoping to use in lowRISC/opentitan#16436 Fixes:#16817 Closes #16785. PiperOrigin-RevId: 498557024 Change-Id: I60d8f8f4a4a02748147cabb4cd60a2a9b95a2c68
This recreates a [closed PR](bazelbuild#13996) to implement extra resources which we're hoping to use in lowRISC/opentitan#16436 Fixes:bazelbuild#16817 Closes bazelbuild#16785. PiperOrigin-RevId: 498557024 Change-Id: I60d8f8f4a4a02748147cabb4cd60a2a9b95a2c68
This recreates a [closed PR](#13996) to implement extra resources which we're hoping to use in lowRISC/opentitan#16436 Fixes:#16817 Closes #16785. PiperOrigin-RevId: 498557024 Change-Id: I60d8f8f4a4a02748147cabb4cd60a2a9b95a2c68 Co-authored-by: kshyanashree <[email protected]>
This recreates a [closed PR](#13996) to implement extra resources which we're hoping to use in lowRISC/opentitan#16436 Fixes:#16817 Closes #16785. PiperOrigin-RevId: 498557024 Change-Id: I60d8f8f4a4a02748147cabb4cd60a2a9b95a2c68
Add support for user-specified resource types in the resource manager.
This generalizes the CPU, RAM and "test count" resource support for
other resource types such as the number of GPUs, available GPU memory,
the number of embedded devices connected to the host, etc.
The available amount of extra resources can be specified using the new
--local_extra_resources=<resourcename>=<amount>
command line flag, whichis analoguous to the existing
--local_cpu_resources
and--local_memory_resources
flags.Tests can then declare the amount of extra resources they need by using
a
resources:<resourcename>:<amount>
tag.