-
Notifications
You must be signed in to change notification settings - Fork 83
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adaptive Load FakeStepController doom update #492
Adaptive Load FakeStepController doom update #492
Conversation
update from master
merge from upstream
merge from upstream
Signed-off-by: eric846 <[email protected]>
Signed-off-by: eric846 <[email protected]>
Signed-off-by: eric846 <[email protected]>
Signed-off-by: eric846 <[email protected]>
Signed-off-by: eric846 <[email protected]>
Signed-off-by: eric846 <[email protected]>
Signed-off-by: eric846 <[email protected]>
Signed-off-by: eric846 <[email protected]>
Signed-off-by: eric846 <[email protected]>
Signed-off-by: eric846 <[email protected]>
Signed-off-by: eric846 <[email protected]>
Signed-off-by: eric846 <[email protected]>
Signed-off-by: eric846 <[email protected]>
…double Signed-off-by: eric846 <[email protected]>
…variable_setter Signed-off-by: eric846 <[email protected]>
Signed-off-by: eric846 <[email protected]>
Signed-off-by: eric846 <[email protected]>
Signed-off-by: eric846 <[email protected]>
Signed-off-by: eric846 <[email protected]>
Signed-off-by: eric846 <[email protected]>
Signed-off-by: eric846 <[email protected]>
Signed-off-by: eric846 <[email protected]>
merge from upstream
…ent.Output turns out not to include the status Signed-off-by: eric846 <[email protected]>
…plugin names, log thresholds only once per session Signed-off-by: eric846 <[email protected]>
Signed-off-by: eric846 <[email protected]>
Signed-off-by: eric846 <[email protected]>
Signed-off-by: eric846 <[email protected]>
Signed-off-by: eric846 <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly one big question that I want more clarity on.
nighthawk::client::CommandLineOptions command_line_options_template) | ||
: is_converged_{false}, is_doomed_{false}, fixed_rps_value_{config.fixed_rps_value()}, | ||
: input_setting_failure_countdown_{config.artificial_input_setting_failure_countdown()}, | ||
config_{std::move(config)}, is_converged_{false}, is_doomed_{false}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not a complaint, but what is std::move accomplishing here, given that this isn't a smart pointer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Funny story, this was actually required by clang-tidy. If the function implementation is making a copy of a const reference parameter, clang-tidy tells you to change the parameter to be pass by value so it's obvious to the caller that a copy is happening. The argument gets copied at the call site into an unnamed temporary value. Then in the constructor we can actually move this temporary value into the field without incurring the cost of a second copy.
int32 fixed_rps_value = 1; | ||
// Artificial error that the plugin factory should return during validation. Optional. | ||
google.rpc.Status artificial_validation_failure = 2; | ||
// Artificial error that should be returned from GetCurrentCommandLineOptions(). Optional. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we enhance this comment to indicate the relation to the countdown?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
int32 fixed_rps_value = 1; | ||
// Artificial error that the plugin factory should return during validation. Optional. | ||
google.rpc.Status artificial_validation_failure = 2; | ||
// Artificial error that should be returned from GetCurrentCommandLineOptions(). Optional. | ||
google.rpc.Status artificial_input_setting_failure = 3; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I'm having trouble understanding the full use case here. If this is 3, then we are going to succeed 3 times, then fail on the 4th attempt. Makes sense, but I'm not sure I understand why that's useful. What is the test that you're supporting by creating it in this way?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
GetCurrentCommandLine()
can return an error status that should be handled cleanly by the main controller loop. The controller calls GetCurrentCommandLine()
repeatedly during the adjusting stage until the step controller says it's converged, and then in the testing stage we call GetCurrentCommandLine()
one last time, reusing the last converged value.
In order to test handling of these errors, we need the FakeStepController
to return successfully from GetCurrentCommandLine()
during the adjusting stage, but then start returning errors just in time for the testing stage. We don't have any way to update the FakeStepController
during the run, so it has to somehow be programmed up front to behave differently at different times.
An alternative would be for magic values in UpdateAndRecompute()
to trigger GetCurrentCommandLine()
error behavior. We already use magic values to control convergence and doom. But there's only so much information we can encode in metric score double
s without having it get out of hand. Currently the UpdateAndRecompute()
behavior is: zero scores=non-converged non-doomed, any positive score=converged, any negative score=doomed.
This trick wouldn't be necessary if the step controller were aware of what stage it was operating in.
@mum4k Can you give this a review and assign back to me when done? |
@eric846 please address / respond to the comments from @dubious90 as that can help me by providing additional context for the review. |
Signed-off-by: eric846 <[email protected]>
…date Signed-off-by: eric846 <[email protected]>
Signed-off-by: eric846 <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a const nit, LGTM otherwise. Left a question that popped up over at #466 (comment) as that's out of scope here.
test/adaptive_load/fake_plugins/fake_step_controller/fake_step_controller_test.cc
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good with just a few nits.
* | ||
* @param fixed_rps_value Value for RPS to set in the FakeStepControllerConfig proto until the | ||
* countdown reaches zero. | ||
* @param artificial_input_setting_failure An error status. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we expand this comment, explaining what is the meaning of the error status, i.e. what it is used for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -22,7 +22,7 @@ class FakeStepController : public StepController { | |||
* @param config FakeStepControllerConfig proto for setting the fixed RPS value. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(pre-existing, optional) We should probably mention in the class comment that this class isn't thread-safe. At least I am assuming it isn't meant to be.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
Signed-off-by: eric846 <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for resolving the const nit. LGTM for my part of the review.
Updates FakeStepController:
FakeStepController::UpdateAndRecompute()
receives aBenchmarkResult
with a negative metric score. PreviouslyBenchmarkResult
could contain an error status, and we used the error status to trigger simulated doom. Now errors do not get passed toStepController::UpdateAndRecompute()
, so we need a new mechanism.InputVariableSetter
failure inGetCurrentCommandLineOptions()
. The artificial failure is specified in the config proto.GetCurrentCommandLineOptions()
returns successfully untilUpdateAndRecompute()
has been called a configured number of times, then starts returning the simulated input variable setter failure mentioned above. This simple autonomous behavior is needed to test error handling in the testing stage without triggering an early exit in the adjusting stage; the adaptive load controller can't be paused in the middle of the test to reconfigure theFakeStepController
to change its behavior.The behavior of
FakeStepController
after anUpdateAndRecompute()
with aBenchmarkResult
containing the given metric scores:Part 3 of splitting PR #483.