Adaptive Load metrics evaluator library #495
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,87 @@ | ||
#include "envoy/config/core/v3/base.pb.h" | ||
|
||
#include "nighthawk/adaptive_load/metrics_plugin.h" | ||
|
||
#include "external/envoy/source/common/common/logger.h" | ||
#include "external/envoy/source/common/common/statusor.h" | ||
#include "external/envoy/source/common/protobuf/protobuf.h" | ||
|
||
#include "api/adaptive_load/adaptive_load.pb.h" | ||
#include "api/adaptive_load/benchmark_result.pb.h" | ||
#include "api/adaptive_load/metric_spec.pb.h" | ||
#include "api/client/options.pb.h" | ||
#include "api/client/output.pb.h" | ||
#include "api/client/service.pb.h" | ||
|
||
#include "absl/container/flat_hash_map.h" | ||
#include "absl/status/status.h" | ||
#include "absl/strings/str_join.h" | ||
|
||
namespace Nighthawk { | ||
|
||
/** | ||
* An interface with utilities for translating between metrics definitions, thresholds, scores, | ||
Review comment: This comment seems to be focused on what it works with, rather than what it actually does? It's possible this will be obvious after the usage is included, but can we try to clarify its purpose if not?
Review comment: This should be clearer now. |
||
* MetricsPlugins, and Nighthawk Service results. | ||
*/ | ||
class MetricsEvaluator { | ||
public: | ||
virtual ~MetricsEvaluator() = default; | ||
|
||
/** | ||
* Given a MetricSpec, obtains a single metric value from the MetricsPlugin and optionally scores | ||
* it according to a threshold and scoring function. | ||
Review comment: I think this would be easier to follow if we said according to a ThresholdSpec. Then one would look for the threshold spec in the parameters, and see that it is a threshold with a scoring function.
Review comment: Done |
||
* | ||
* @param metric_spec The metric spec identifying the metric by name and plugin name. | ||
* @param metrics_plugin An already activated MetricsPlugin used by the metric_spec. | ||
Review comment: The word activated here is confusing to me. I'm not sure it means: Two questions:
Review comment: Dropped that wording, it should be clearer now. |
||
* @param threshold_spec A proto describing the threshold and scoring function. Nullptr if the | ||
* metric is informational only. | ||
|
||
* | ||
* @return StatusOr<MetricEvaluation> A proto containing the metric value (and its score if a | ||
* threshold was specified), or an error status if the metric could not be obtained from the | ||
* MetricsPlugin. | ||
*/ | ||
virtual absl::StatusOr<nighthawk::adaptive_load::MetricEvaluation> | ||
EvaluateMetric(const nighthawk::adaptive_load::MetricSpec& metric_spec, | ||
MetricsPlugin& metrics_plugin, | ||
const nighthawk::adaptive_load::ThresholdSpec* threshold_spec) const PURE; | ||
|
||
/** | ||
* Extracts metric descriptors and corresponding thresholds from a top-level adaptive load session | ||
* spec to an ordered list and a map. Allows for uniform treatment of scored and informational | ||
* metrics. | ||
* | ||
* @param spec The adaptive load session spec. | ||
* @param metric_specs A list to store extracted MetricSpecs in order of definition. | ||
* @param threshold_spec_from_metric_spec A map to store each MetricSpec and its threshold if it | ||
* had one, or nullptr if it was an informational metric. | ||
*/ | ||
virtual void | ||
ExtractMetricSpecs(const nighthawk::adaptive_load::AdaptiveLoadSessionSpec& spec, | ||
std::vector<const nighthawk::adaptive_load::MetricSpec*>& metric_specs, | ||
Review comment: I wonder if instead of having two output args, a single one consisting of a vector of
Review comment: Alternatively, could we not just return the map, and retrieve the keys specifically if we need that? I also don't like having two separate data structures that are supposed to be in sync with each other implicitly.
Review comment: Done
Review comment: The reason for the two separate structures is that I couldn't find a standard map that preserves the order of insertion.
Review comment: Updated both data structures to be |
||
absl::flat_hash_map<const nighthawk::adaptive_load::MetricSpec*, | ||
const nighthawk::adaptive_load::ThresholdSpec*>& | ||
threshold_spec_from_metric_spec) const PURE; | ||
|
||
/** | ||
* Analyzes a Nighthawk Service benchmark against configured MetricThresholds. Queries | ||
* outside MetricsPlugins for current metric values, and/or uses "nighthawk.builtin" plugin to | ||
Review comment: These feel like implementation details to me. Is nighthawk.builtin a MetricsPlugin? If so, I don't think this sentence ("Queries outside MetricsPlugins") is necessary. The documentation for nighthawk.builtin should rest within that plugin itself, and this would imply we only have one internal MetricsPlugin, which I don't think is a good assumption to make for the future.
Review comment: Dropping this discussion of the builtin plugin. Mentioned the nighthawk.builtin in some of the parameter descriptions. |
||
* extract stats and counters from the latest Nighthawk Service output. The Nighthawk benchmark is | ||
* assumed to have finished recently so values from MetricsPlugins will be relevant. | ||
Review comment: If I understand this, I think it might be better phrased as the following. (Please feel free to edit or push back):
Review comment: Done. |
||
* | ||
* @param nighthawk_response Proto returned from Nighthawk Service describing the latest single | ||
* benchmark session. | ||
* @param spec Top-level proto defining the adaptive load session. | ||
* @param name_to_custom_metrics_plugin_map Common map from plugin names to MetricsPlugins, loaded | ||
Review comment: The second clause of this is describing what the caller does, right? I don't think we should be documenting the calling code from this function. (What if it's called by something else in the future, especially since this is an interface, rather than the impl file?) If there are constraints around this, it might be better to describe it from this function's perspective: But if that's a constraint here, that sounds like an architectural oddity I'd have questions about.
Review comment: I see, dropped the information about how the caller operates and added a description of the true constraints. |
||
* and initialized once at the beginning of the session and passed to all calls of this method. | ||
* | ||
* @return StatusOr<BenchmarkResult> A proto containing all metric scores for this Nighthawk | ||
* Service benchmark session, or an error propagated from MetricsPlugins. | ||
*/ | ||
virtual absl::StatusOr<nighthawk::adaptive_load::BenchmarkResult> | ||
AnalyzeNighthawkBenchmark(const nighthawk::client::ExecutionResponse& nighthawk_response, | ||
const nighthawk::adaptive_load::AdaptiveLoadSessionSpec& spec, | ||
const absl::flat_hash_map<std::string, MetricsPluginPtr>& | ||
name_to_custom_metrics_plugin_map) const PURE; | ||
}; | ||
|
||
} // namespace Nighthawk |
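The review thread on ExtractMetricSpecs raises the idea of returning one ordered structure instead of a vector and a map that must stay in sync. A minimal sketch of that idea follows. It is not the code that landed in the PR: the structs are hypothetical stand-ins for the nighthawk::adaptive_load protos, and `ExtractOrderedMetricSpecs` is a made-up free function used only for illustration.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the proto messages; only the fields needed here.
struct MetricSpec {
  std::string metrics_plugin_name;
  std::string metric_name;
};
struct ThresholdSpec {
  double weight;
};
struct MetricSpecWithThreshold {
  MetricSpec metric_spec;
  ThresholdSpec threshold_spec;
};
struct SessionSpec {
  std::vector<MetricSpecWithThreshold> metric_thresholds;
  std::vector<MetricSpec> informational_metric_specs;
};

using MetricAndThreshold = std::pair<const MetricSpec*, const ThresholdSpec*>;

// Returns every metric paired with its threshold, in order of definition.
// Informational metrics carry a nullptr threshold. The returned pointers
// refer into `spec`, so `spec` must outlive the result.
std::vector<MetricAndThreshold> ExtractOrderedMetricSpecs(const SessionSpec& spec) {
  std::vector<MetricAndThreshold> result;
  result.reserve(spec.metric_thresholds.size() + spec.informational_metric_specs.size());
  for (const MetricSpecWithThreshold& metric_threshold : spec.metric_thresholds) {
    result.emplace_back(&metric_threshold.metric_spec, &metric_threshold.threshold_spec);
  }
  for (const MetricSpec& metric_spec : spec.informational_metric_specs) {
    result.emplace_back(&metric_spec, nullptr);
  }
  return result;
}
```

A single vector of pairs keeps insertion order without a separate key list, at the cost of linear lookup by MetricSpec, which is acceptable when the caller only iterates.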
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,110 @@ | ||
#include "adaptive_load/metrics_evaluator_impl.h" | ||
|
||
#include "adaptive_load/metrics_plugin_impl.h" | ||
#include "adaptive_load/plugin_loader.h" | ||
|
||
namespace Nighthawk { | ||
|
||
absl::StatusOr<nighthawk::adaptive_load::MetricEvaluation> MetricsEvaluatorImpl::EvaluateMetric( | ||
const nighthawk::adaptive_load::MetricSpec& metric_spec, MetricsPlugin& metrics_plugin, | ||
const nighthawk::adaptive_load::ThresholdSpec* threshold_spec) const { | ||
nighthawk::adaptive_load::MetricEvaluation evaluation; | ||
evaluation.set_metric_id( | ||
absl::StrCat(metric_spec.metrics_plugin_name(), "/", metric_spec.metric_name())); | ||
const absl::StatusOr<double> metric_value_or = | ||
metrics_plugin.GetMetricByName(metric_spec.metric_name()); | ||
if (!metric_value_or.ok()) { | ||
return absl::Status(static_cast<absl::StatusCode>(metric_value_or.status().code()), | ||
absl::StrCat("Error calling MetricsPlugin '", | ||
metric_spec.metrics_plugin_name(), "': ", | ||
metric_value_or.status().message())); | ||
} | ||
const double metric_value = metric_value_or.value(); | ||
evaluation.set_metric_value(metric_value); | ||
if (threshold_spec == nullptr) { | ||
// Informational metric. | ||
evaluation.set_weight(0.0); | ||
} else { | ||
evaluation.set_weight(threshold_spec->weight().value()); | ||
absl::StatusOr<ScoringFunctionPtr> scoring_function_or = | ||
LoadScoringFunctionPlugin(threshold_spec->scoring_function()); | ||
RELEASE_ASSERT(scoring_function_or.ok(), | ||
absl::StrCat("ScoringFunction plugin loading error should have been caught " | ||
"during input validation: ", | ||
scoring_function_or.status().message())); | ||
ScoringFunctionPtr scoring_function = std::move(scoring_function_or.value()); | ||
evaluation.set_threshold_score(scoring_function->EvaluateMetric(metric_value)); | ||
} | ||
return evaluation; | ||
} | ||
|
||
void MetricsEvaluatorImpl::ExtractMetricSpecs( | ||
const nighthawk::adaptive_load::AdaptiveLoadSessionSpec& spec, | ||
std::vector<const nighthawk::adaptive_load::MetricSpec*>& metric_specs, | ||
absl::flat_hash_map<const nighthawk::adaptive_load::MetricSpec*, | ||
const nighthawk::adaptive_load::ThresholdSpec*>& | ||
threshold_spec_from_metric_spec) const { | ||
for (const nighthawk::adaptive_load::MetricSpecWithThreshold& metric_threshold : | ||
spec.metric_thresholds()) { | ||
metric_specs.push_back(&metric_threshold.metric_spec()); | ||
threshold_spec_from_metric_spec[&metric_threshold.metric_spec()] = | ||
&metric_threshold.threshold_spec(); | ||
} | ||
for (const nighthawk::adaptive_load::MetricSpec& metric_spec : | ||
spec.informational_metric_specs()) { | ||
metric_specs.push_back(&metric_spec); | ||
threshold_spec_from_metric_spec[&metric_spec] = nullptr; | ||
} | ||
} | ||
|
||
absl::StatusOr<nighthawk::adaptive_load::BenchmarkResult> | ||
MetricsEvaluatorImpl::AnalyzeNighthawkBenchmark( | ||
const nighthawk::client::ExecutionResponse& nighthawk_response, | ||
const nighthawk::adaptive_load::AdaptiveLoadSessionSpec& spec, | ||
const absl::flat_hash_map<std::string, MetricsPluginPtr>& name_to_custom_metrics_plugin_map) | ||
const { | ||
if (nighthawk_response.error_detail().code() != static_cast<int>(absl::StatusCode::kOk)) { | ||
|
||
return absl::Status(static_cast<absl::StatusCode>(nighthawk_response.error_detail().code()), | ||
nighthawk_response.error_detail().message()); | ||
} | ||
|
||
nighthawk::adaptive_load::BenchmarkResult benchmark_result; | ||
*benchmark_result.mutable_nighthawk_service_output() = nighthawk_response.output(); | ||
|
||
// A map containing all available MetricsPlugins: preloaded custom plugins shared across all | ||
// benchmarks, and a freshly instantiated builtin plugin for this benchmark only. | ||
absl::flat_hash_map<std::string, MetricsPlugin*> name_to_plugin_map; | ||
for (const auto& name_plugin_pair : name_to_custom_metrics_plugin_map) { | ||
name_to_plugin_map[name_plugin_pair.first] = name_plugin_pair.second.get(); | ||
} | ||
auto builtin_plugin = | ||
std::make_unique<NighthawkStatsEmulatedMetricsPlugin>(nighthawk_response.output()); | ||
name_to_plugin_map["nighthawk.builtin"] = builtin_plugin.get(); | ||
|
||
// MetricSpecs in original order of definition. | ||
std::vector<const nighthawk::adaptive_load::MetricSpec*> metric_specs; | ||
// Pointer to the corresponding ThresholdSpec, or nullptr for informational metrics. | ||
absl::flat_hash_map<const nighthawk::adaptive_load::MetricSpec*, | ||
const nighthawk::adaptive_load::ThresholdSpec*> | ||
threshold_spec_from_metric_spec; | ||
ExtractMetricSpecs(spec, metric_specs, threshold_spec_from_metric_spec); | ||
|
||
std::vector<std::string> errors; | ||
for (const nighthawk::adaptive_load::MetricSpec* metric_spec : metric_specs) { | ||
absl::StatusOr<nighthawk::adaptive_load::MetricEvaluation> evaluation_or = | ||
EvaluateMetric(*metric_spec, *name_to_plugin_map[metric_spec->metrics_plugin_name()], | ||
threshold_spec_from_metric_spec[metric_spec]); | ||
if (!evaluation_or.ok()) { | ||
errors.emplace_back(absl::StrCat("Error evaluating metric: ", evaluation_or.status().code(), | ||
": ", evaluation_or.status().message())); | ||
continue; | ||
} | ||
*benchmark_result.mutable_metric_evaluations()->Add() = evaluation_or.value(); | ||
} | ||
if (!errors.empty()) { | ||
return absl::InternalError(absl::StrJoin(errors, "\n")); | ||
} | ||
return benchmark_result; | ||
} | ||
|
||
} // namespace Nighthawk |
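In AnalyzeNighthawkBenchmark, `name_to_plugin_map[metric_spec->metrics_plugin_name()]` uses `operator[]`, which default-inserts a null pointer when the name is absent and would then be dereferenced. The PR presumably relies on earlier input validation to rule this out (an assumption, not confirmed by the diff). The sketch below shows a defensive lookup using `find()`, with `std::unordered_map` standing in for `absl::flat_hash_map`; `FakePlugin` and `CollectMissingPlugins` are hypothetical names.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a MetricsPlugin instance.
struct FakePlugin {
  double value;
};

// Checks each requested plugin name with find() rather than operator[], so an
// unknown name is reported instead of silently inserting a null entry.
// Returns newline-joined error messages, or an empty string if every name
// resolves to a non-null plugin.
std::string CollectMissingPlugins(const std::unordered_map<std::string, FakePlugin*>& plugins,
                                  const std::vector<std::string>& requested_names) {
  std::string errors;
  for (const std::string& name : requested_names) {
    const auto it = plugins.find(name);
    if (it == plugins.end() || it->second == nullptr) {
      if (!errors.empty()) {
        errors += "\n";
      }
      errors += "MetricsPlugin not found: '" + name + "'";
    }
  }
  return errors;
}
```

Because the map is taken by const reference, the lookup cannot grow it, which is the property `operator[]` lacks.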
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,25 @@ | ||
#include "nighthawk/adaptive_load/metrics_evaluator.h" | ||
|
||
namespace Nighthawk { | ||
|
||
class MetricsEvaluatorImpl : public MetricsEvaluator { | ||
public: | ||
absl::StatusOr<nighthawk::adaptive_load::MetricEvaluation> | ||
EvaluateMetric(const nighthawk::adaptive_load::MetricSpec& metric_spec, | ||
MetricsPlugin& metrics_plugin, | ||
const nighthawk::adaptive_load::ThresholdSpec* threshold_spec) const override; | ||
|
||
void ExtractMetricSpecs(const nighthawk::adaptive_load::AdaptiveLoadSessionSpec& spec, | ||
std::vector<const nighthawk::adaptive_load::MetricSpec*>& metric_specs, | ||
absl::flat_hash_map<const nighthawk::adaptive_load::MetricSpec*, | ||
const nighthawk::adaptive_load::ThresholdSpec*>& | ||
threshold_spec_from_metric_spec) const override; | ||
|
||
absl::StatusOr<nighthawk::adaptive_load::BenchmarkResult> | ||
AnalyzeNighthawkBenchmark(const nighthawk::client::ExecutionResponse& nighthawk_response, | ||
const nighthawk::adaptive_load::AdaptiveLoadSessionSpec& spec, | ||
const absl::flat_hash_map<std::string, MetricsPluginPtr>& | ||
name_to_custom_metrics_plugin_map) const override; | ||
}; | ||
|
||
} // namespace Nighthawk |
Review comment: Can we add a usage example here? I have other confusions about the API here, but if you feel like your example will resolve them, feel free to ignore other points.
Review comment: I added a better explanation of what this library does. Only one of the methods is expected to be called directly.
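To make the EvaluateMetric flow from the diff easier to follow outside the Nighthawk build, here is a simplified sketch of the same branching: an informational metric gets weight 0 and no score, while a metric with a threshold gets the threshold's weight and a score from its scoring function. The types are hypothetical stand-ins for the protos and plugins, and `Evaluate` is an illustrative name rather than the library API.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Hypothetical stand-in for nighthawk::adaptive_load::MetricEvaluation.
struct MetricEvaluation {
  std::string metric_id;
  double metric_value;
  double weight;
  bool has_threshold_score;
  double threshold_score;
};

// Hypothetical stand-in for a ThresholdSpec with its scoring function already loaded.
struct Threshold {
  double weight;
  std::function<double(double)> scoring_function;
};

// Mirrors the branching in the diff: a null threshold marks an informational metric.
MetricEvaluation Evaluate(const std::string& plugin_name, const std::string& metric_name,
                          double metric_value, const Threshold* threshold) {
  MetricEvaluation evaluation{};
  evaluation.metric_id = plugin_name + "/" + metric_name;
  evaluation.metric_value = metric_value;
  if (threshold == nullptr) {
    // Informational metric: recorded but not scored.
    evaluation.weight = 0.0;
  } else {
    assert(threshold->scoring_function);  // Must hold a callable.
    evaluation.weight = threshold->weight;
    evaluation.has_threshold_score = true;
    evaluation.threshold_score = threshold->scoring_function(metric_value);
  }
  return evaluation;
}
```

A caller would pass `nullptr` for metrics it only wants reported and a populated `Threshold` for metrics that should influence the overall score.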