Adaptive Load metrics evaluator library #495

merged 61 commits into from
Sep 10, 2020
Changes from 48 commits
8ea442d
Merge pull request #5 from envoyproxy/master
eric846 Jun 1, 2020
5ac755a
Merge pull request #6 from envoyproxy/master
eric846 Jun 28, 2020
b8c25a5
Merge pull request #7 from envoyproxy/master
eric846 Jul 7, 2020
1c19c68
initial commit
eric846 Jul 9, 2020
7050686
fix comments
eric846 Jul 9, 2020
0776563
fix format
eric846 Jul 9, 2020
16fd8f6
rename adaptive_rps to adaptive_load
eric846 Jul 10, 2020
c383010
add field_selector in example
eric846 Jul 10, 2020
6e1a483
fix example comment
eric846 Jul 10, 2020
4ef1140
fix format
eric846 Jul 10, 2020
4111bf4
add support for fault injection headers
eric846 Jul 10, 2020
871a959
replace linear and binary search with exponential search
eric846 Jul 10, 2020
1fd77c1
add InputVariableSetter mechanism
eric846 Jul 11, 2020
edc36b2
add input variable setter to build file
eric846 Jul 11, 2020
4d0364e
fix syntax errors
eric846 Jul 11, 2020
aed6d94
rename samples/adaptive_rps
eric846 Jul 11, 2020
d9ae87d
improve comments, change step controller initial value from int64 to …
eric846 Jul 12, 2020
a05a6f5
add proto validation rules, fix comments, make rps the default input_…
eric846 Jul 13, 2020
8cd4d21
fix comment wording
eric846 Jul 13, 2020
d814a96
simplify protos, add defaults, specify required or optional
eric846 Jul 14, 2020
5f5a885
add missing newline
eric846 Jul 14, 2020
7e20a78
Kick CI
eric846 Jul 14, 2020
9048267
simplify protos
eric846 Jul 15, 2020
306c0ec
fix format
eric846 Jul 15, 2020
d33f543
fix some optional field comments and rules
eric846 Jul 15, 2020
442cca9
Merge pull request #10 from envoyproxy/master
eric846 Jul 16, 2020
677b783
add Nighthawk status field in BenchmarkResult as nested nighthawk.cli…
eric846 Jul 19, 2020
cefb366
switch to standard Envoy plugin config proto, add prefix to internal …
eric846 Jul 22, 2020
f3684df
Merge remote-tracking branch 'upstream/master' into adaptive-rps-protos2
eric846 Jul 22, 2020
5463051
create headers
eric846 Jul 22, 2020
46e0e25
fix format
eric846 Jul 22, 2020
f634642
use docstring format
eric846 Jul 22, 2020
3c39faa
fix typos in comments
eric846 Jul 23, 2020
b9c8f2b
split build target, get rid of ostream, change InputValueSetter to us…
eric846 Jul 24, 2020
5fc4db4
remove nested namespace, remove redundant _include in target names
eric846 Jul 26, 2020
64e7852
merge from upstream
eric846 Jul 29, 2020
12807f1
Merge remote-tracking branch 'upstream/master' into adaptive-rps-headers
eric846 Jul 29, 2020
e8e960f
merge from upstream
eric846 Aug 27, 2020
7a5cc6d
initial commit: MetricsEvaluator library
eric846 Aug 27, 2020
2090763
add class comment
eric846 Aug 27, 2020
c8dee61
fix format
eric846 Aug 27, 2020
b1e8ea8
fix comments
eric846 Aug 27, 2020
f0595f7
remove unused includes, try to fix strange clang-tidy-only compilatio…
eric846 Aug 27, 2020
6306b4e
Merge remote-tracking branch 'upstream/master' into master2
eric846 Aug 27, 2020
1ece783
Merge remote-tracking branch 'upstream/master' into master2
eric846 Aug 28, 2020
ee1bf99
Merge branch 'master2' into adaptive-rps-metric-evaluation
eric846 Aug 28, 2020
d61db72
fix clang-tidy
eric846 Aug 28, 2020
283965f
fix clang-tidy: move some includes to impl
eric846 Aug 28, 2020
ea9f562
change ExtractMetricSpecs output parameters to returned pair
eric846 Aug 29, 2020
70705e9
Merge remote-tracking branch 'upstream/master' into master2
eric846 Aug 31, 2020
e576bc1
Merge remote-tracking branch 'upstream/master' into master2
eric846 Sep 1, 2020
1fca528
Merge remote-tracking branch 'upstream/master' into master2
eric846 Sep 3, 2020
f663975
rename unit tests, fix compile error
eric846 Sep 3, 2020
f367120
make ExtractMetricSpecs return const containers
eric846 Sep 3, 2020
93f42dc
Merge branch 'master2' into adaptive-rps-metric-evaluation
eric846 Sep 3, 2020
d2e502f
remove bazelrc
eric846 Sep 3, 2020
ed32856
Merge remote-tracking branch 'upstream/master' into master2
eric846 Sep 3, 2020
eecf00d
Merge remote-tracking branch 'upstream/master' into master2
eric846 Sep 8, 2020
788fa07
Merge branch 'master2' into adaptive-rps-metric-evaluation
eric846 Sep 8, 2020
34b81da
return vector of pairs from ExtractMetricSpecs
eric846 Sep 8, 2020
edc82db
improve comments
eric846 Sep 9, 2020
15 changes: 15 additions & 0 deletions include/nighthawk/adaptive_load/BUILD
@@ -52,6 +52,21 @@ envoy_basic_cc_library(
],
)

envoy_basic_cc_library(
name = "metrics_evaluator",
hdrs = [
"metrics_evaluator.h",
],
include_prefix = "nighthawk/adaptive_load",
deps = [
":metrics_plugin",
"//api/adaptive_load:adaptive_load_proto_cc_proto",
"@envoy//include/envoy/common:base_includes",
"@envoy//include/envoy/config:typed_config_interface",
"@envoy//source/common/common:statusor_lib_with_external_headers",
],
)

envoy_basic_cc_library(
name = "metrics_plugin",
hdrs = [
87 changes: 87 additions & 0 deletions include/nighthawk/adaptive_load/metrics_evaluator.h
@@ -0,0 +1,87 @@
#include "envoy/config/core/v3/base.pb.h"

#include "nighthawk/adaptive_load/metrics_plugin.h"

#include "external/envoy/source/common/common/logger.h"
#include "external/envoy/source/common/common/statusor.h"
#include "external/envoy/source/common/protobuf/protobuf.h"

#include "api/adaptive_load/adaptive_load.pb.h"
#include "api/adaptive_load/benchmark_result.pb.h"
#include "api/adaptive_load/metric_spec.pb.h"
#include "api/client/options.pb.h"
#include "api/client/output.pb.h"
#include "api/client/service.pb.h"

#include "absl/container/flat_hash_map.h"
#include "absl/status/status.h"
#include "absl/strings/str_join.h"

namespace Nighthawk {

/**
* An interface with utilities for translating between metrics definitions, thresholds, scores,
Contributor

Can we add a usage example here? I have other confusions about the API here, but if you feel like your example will resolve them, feel free to ignore other points.

Contributor Author

I added a better explanation of what this library does. Only one of the methods is expected to be called directly.

Contributor

This comment seems to focus on what the class works with rather than what it actually does. It's possible this will be obvious once the usage example is included, but if not, can we try to clarify its purpose?

Contributor Author

This should be clearer now.

* MetricsPlugins, and Nighthawk Service results.
*/
class MetricsEvaluator {
public:
virtual ~MetricsEvaluator() = default;

/**
* Given a MetricSpec, obtains a single metric value from the MetricsPlugin and optionally scores
* it according to a threshold and scoring function.
Contributor

I think this would be easier to follow if we said according to a ThresholdSpec. Then one would look for the threshold spec in the parameters, and see that it is a threshold with a scoring function.

Contributor Author

Done

*
* @param metric_spec The metric spec identifying the metric by name and plugin name.
* @param metrics_plugin An already activated MetricsPlugin used by the metric_spec.
Contributor

The word "activated" here is confusing to me; I'm not sure what it means. Two questions:

  1. Would someone using this file usually already know what this means, making this less of a concern?
  2. Is the fact that the plugin would already need to be "activated" more or less obvious if you know what it means, making this over-specified?

Contributor Author

Dropped that wording, it should be clearer now.

* @param threshold_spec A proto describing the threshold and scoring function. Nullptr if the
* metric is informational only.
*
* @return StatusOr<MetricEvaluation> A proto containing the metric value (and its score if a
* threshold was specified), or an error status if the metric could not be obtained from the
* MetricsPlugin.
*/
virtual absl::StatusOr<nighthawk::adaptive_load::MetricEvaluation>
EvaluateMetric(const nighthawk::adaptive_load::MetricSpec& metric_spec,
MetricsPlugin& metrics_plugin,
const nighthawk::adaptive_load::ThresholdSpec* threshold_spec) const PURE;

/**
* Extracts metric descriptors and corresponding thresholds from a top-level adaptive load session
* spec to an ordered list and a map. Allows for uniform treatment of scored and informational
* metrics.
*
* @param spec The adaptive load session spec.
* @param metric_specs A list to store extracted MetricSpecs in order of definition.
* @param threshold_spec_from_metric_spec A map to store each MetricSpec and its threshold if it
* had one, or nullptr if it was an informational metric.
*/
virtual void
ExtractMetricSpecs(const nighthawk::adaptive_load::AdaptiveLoadSessionSpec& spec,
std::vector<const nighthawk::adaptive_load::MetricSpec*>& metric_specs,
Member

I wonder if instead of having two output args, a single one consisting of a vector of std::pair<>s with a mandatory metric + optional threshold could simplify this interface?

Contributor

Alternatively, could we not just return the map, and retrieve the keys specifically if we need that? I also don't like having two separate data structures that are supposed to be in sync with each other implicitly.

Contributor Author

Done

Contributor Author

The reason for the two separate structures is that I couldn't find a standard map that preserves the order of insertion.

Contributor Author (eric846, Sep 3, 2020)

Updated both data structures to be const so there is no way for them to get out of sync.
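
A minimal sketch of the single-return-value shape suggested in this thread, which a later commit in this PR's list names as "return vector of pairs from ExtractMetricSpecs". The structs below are simplified, hypothetical stand-ins for the generated protos, not the real `nighthawk::adaptive_load` types, and the free function is an illustration rather than the merged signature. A vector of pairs keeps the order of definition and ties each metric to its optional threshold in one structure, so nothing has to be kept in sync.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the generated proto messages; illustration only.
struct MetricSpec {
  std::string metric_name;
};
struct ThresholdSpec {
  double weight;
};
struct MetricSpecWithThreshold {
  MetricSpec metric_spec;
  ThresholdSpec threshold_spec;
};
struct AdaptiveLoadSessionSpec {
  std::vector<MetricSpecWithThreshold> metric_thresholds;
  std::vector<MetricSpec> informational_metric_specs;
};

// A metric plus its threshold; the threshold is nullptr for informational metrics.
using MetricAndThreshold = std::pair<const MetricSpec*, const ThresholdSpec*>;

// Returns scored metrics first, then informational metrics, each group in
// order of definition. The pointers refer into `spec`, which must outlive the result.
std::vector<MetricAndThreshold> ExtractMetricSpecs(const AdaptiveLoadSessionSpec& spec) {
  std::vector<MetricAndThreshold> result;
  result.reserve(spec.metric_thresholds.size() + spec.informational_metric_specs.size());
  for (const MetricSpecWithThreshold& metric_threshold : spec.metric_thresholds) {
    result.emplace_back(&metric_threshold.metric_spec, &metric_threshold.threshold_spec);
  }
  for (const MetricSpec& metric_spec : spec.informational_metric_specs) {
    result.emplace_back(&metric_spec, nullptr);
  }
  return result;
}
```

A caller can then iterate once over the result and treat a null threshold as "informational only", without a separate map lookup.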

absl::flat_hash_map<const nighthawk::adaptive_load::MetricSpec*,
const nighthawk::adaptive_load::ThresholdSpec*>&
threshold_spec_from_metric_spec) const PURE;

/**
* Analyzes a Nighthawk Service benchmark against configured MetricThresholds. Queries
* outside MetricsPlugins for current metric values, and/or uses "nighthawk.builtin" plugin to
Contributor

These feel like implementation details to me.

Is nighthawk.builtin a MetricsPlugin? If so, I don't think this sentence ("Queries outside MetricsPlugins") is necessary. The documentation for nighthawk.builtin should rest within that plugin itself, and this would imply we only have one internal MetricsPlugin, which I don't think is a good assumption to make for the future.

Contributor Author

Dropping this discussion of the builtin plugin. Mentioned the nighthawk.builtin in some of the parameter descriptions.

* extract stats and counters from the latest Nighthawk Service output. The Nighthawk benchmark is
* assumed to have finished recently so values from MetricsPlugins will be relevant.
Contributor

If I understand this, I think it might be better phrased as the following. (Please feel free to edit or push back):
Assumes that the values from MetricsPlugins correspond timewise with the nighthawk benchmark.

Contributor Author

Done.

*
* @param nighthawk_response Proto returned from Nighthawk Service describing the latest single
* benchmark session.
* @param spec Top-level proto defining the adaptive load session.
* @param name_to_custom_metrics_plugin_map Common map from plugin names to MetricsPlugins, loaded
Contributor

The second clause of this describes what the caller does, right? I don't think we should document the calling code from this function. (What if it's called by something else in the future, especially since this is an interface rather than the impl file?)

If there are constraints around this, it might be better to describe it from this function's perspective:
"Should not change between calls within one adaptive load session."

But if that's a constraint here, that sounds like an architectural oddity I'd have questions about.

Contributor Author

I see. Dropped the information about how the caller operates and added a description of the true constraints.

* and initialized once at the beginning of the session and passed to all calls of this method.
*
* @return StatusOr<BenchmarkResult> A proto containing all metric scores for this Nighthawk
* Service benchmark session, or an error propagated from MetricsPlugins.
*/
virtual absl::StatusOr<nighthawk::adaptive_load::BenchmarkResult>
AnalyzeNighthawkBenchmark(const nighthawk::client::ExecutionResponse& nighthawk_response,
const nighthawk::adaptive_load::AdaptiveLoadSessionSpec& spec,
const absl::flat_hash_map<std::string, MetricsPluginPtr>&
name_to_custom_metrics_plugin_map) const PURE;
};

} // namespace Nighthawk
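
The review thread on the class comment asks for a usage example. The block below is a minimal sketch of the documented `EvaluateMetric` contract, not the real API: every type is a hypothetical, simplified stand-in (a `std::map` for a MetricsPlugin, a `std::function` for a ScoringFunction plugin, `std::optional` in place of `absl::StatusOr`). It only illustrates the two cases the doc comment describes: a scored metric when a threshold is supplied, and an informational metric with weight 0 when the threshold is null.

```cpp
#include <functional>
#include <map>
#include <optional>
#include <string>

// Hypothetical, simplified stand-ins; not the real Nighthawk types.
struct MetricSpec {
  std::string metrics_plugin_name;
  std::string metric_name;
};
struct ThresholdSpec {
  double weight;
  std::function<double(double)> scoring_function;  // Stand-in for a ScoringFunction plugin.
};
struct MetricEvaluation {
  std::string metric_id;
  double metric_value = 0.0;
  double weight = 0.0;  // Stays 0 for informational metrics.
  double threshold_score = 0.0;
};
// Stand-in for a MetricsPlugin: metric name -> current value.
using FakeMetricsPlugin = std::map<std::string, double>;

// Looks up the metric value, then scores it only if a threshold was supplied.
// std::nullopt stands in for the error status of absl::StatusOr.
std::optional<MetricEvaluation> EvaluateMetric(const MetricSpec& metric_spec,
                                               const FakeMetricsPlugin& metrics_plugin,
                                               const ThresholdSpec* threshold_spec) {
  const auto it = metrics_plugin.find(metric_spec.metric_name);
  if (it == metrics_plugin.end()) {
    return std::nullopt;
  }
  MetricEvaluation evaluation;
  evaluation.metric_id = metric_spec.metrics_plugin_name + "/" + metric_spec.metric_name;
  evaluation.metric_value = it->second;
  if (threshold_spec != nullptr) {
    evaluation.weight = threshold_spec->weight;
    evaluation.threshold_score = threshold_spec->scoring_function(it->second);
  }
  return evaluation;
}
```

In this sketch a caller passes `&threshold` for a scored metric and `nullptr` for an informational one, and checks `has_value()` before reading the evaluation.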
21 changes: 21 additions & 0 deletions source/adaptive_load/BUILD
@@ -42,6 +42,27 @@ envoy_cc_library(
],
)

envoy_cc_library(
name = "metrics_evaluator_impl",
srcs = [
"metrics_evaluator_impl.cc",
],
hdrs = [
"metrics_evaluator_impl.h",
],
repository = "@envoy",
visibility = ["//visibility:public"],
deps = [
":metrics_plugin_impl",
":plugin_loader",
"//api/adaptive_load:adaptive_load_proto_cc_proto",
"//api/client:base_cc_proto",
"//include/nighthawk/adaptive_load:adaptive_load_controller",
"//include/nighthawk/adaptive_load:metrics_evaluator",
"//include/nighthawk/adaptive_load:scoring_function",
],
)

envoy_cc_library(
name = "metrics_plugin_impl",
srcs = [
110 changes: 110 additions & 0 deletions source/adaptive_load/metrics_evaluator_impl.cc
Original file line number Diff line number Diff line change
@@ -0,0 +1,110 @@
#include "adaptive_load/metrics_evaluator_impl.h"

#include "adaptive_load/metrics_plugin_impl.h"
#include "adaptive_load/plugin_loader.h"

namespace Nighthawk {

absl::StatusOr<nighthawk::adaptive_load::MetricEvaluation> MetricsEvaluatorImpl::EvaluateMetric(
const nighthawk::adaptive_load::MetricSpec& metric_spec, MetricsPlugin& metrics_plugin,
const nighthawk::adaptive_load::ThresholdSpec* threshold_spec) const {
nighthawk::adaptive_load::MetricEvaluation evaluation;
evaluation.set_metric_id(
absl::StrCat(metric_spec.metrics_plugin_name(), "/", metric_spec.metric_name()));
const absl::StatusOr<double> metric_value_or =
metrics_plugin.GetMetricByName(metric_spec.metric_name());
if (!metric_value_or.ok()) {
return absl::Status(static_cast<absl::StatusCode>(metric_value_or.status().code()),
absl::StrCat("Error calling MetricsPlugin '",
metric_spec.metrics_plugin_name(), "': ",
metric_value_or.status().message()));
}
const double metric_value = metric_value_or.value();
evaluation.set_metric_value(metric_value);
if (threshold_spec == nullptr) {
// Informational metric.
evaluation.set_weight(0.0);
} else {
evaluation.set_weight(threshold_spec->weight().value());
absl::StatusOr<ScoringFunctionPtr> scoring_function_or =
LoadScoringFunctionPlugin(threshold_spec->scoring_function());
RELEASE_ASSERT(scoring_function_or.ok(),
absl::StrCat("ScoringFunction plugin loading error should have been caught "
"during input validation: ",
scoring_function_or.status().message()));
ScoringFunctionPtr scoring_function = std::move(scoring_function_or.value());
evaluation.set_threshold_score(scoring_function->EvaluateMetric(metric_value));
}
return evaluation;
}

void MetricsEvaluatorImpl::ExtractMetricSpecs(
const nighthawk::adaptive_load::AdaptiveLoadSessionSpec& spec,
std::vector<const nighthawk::adaptive_load::MetricSpec*>& metric_specs,
absl::flat_hash_map<const nighthawk::adaptive_load::MetricSpec*,
const nighthawk::adaptive_load::ThresholdSpec*>&
threshold_spec_from_metric_spec) const {
for (const nighthawk::adaptive_load::MetricSpecWithThreshold& metric_threshold :
spec.metric_thresholds()) {
metric_specs.push_back(&metric_threshold.metric_spec());
threshold_spec_from_metric_spec[&metric_threshold.metric_spec()] =
&metric_threshold.threshold_spec();
}
for (const nighthawk::adaptive_load::MetricSpec& metric_spec :
spec.informational_metric_specs()) {
metric_specs.push_back(&metric_spec);
threshold_spec_from_metric_spec[&metric_spec] = nullptr;
}
}

absl::StatusOr<nighthawk::adaptive_load::BenchmarkResult>
MetricsEvaluatorImpl::AnalyzeNighthawkBenchmark(
const nighthawk::client::ExecutionResponse& nighthawk_response,
const nighthawk::adaptive_load::AdaptiveLoadSessionSpec& spec,
const absl::flat_hash_map<std::string, MetricsPluginPtr>& name_to_custom_metrics_plugin_map)
const {
if (nighthawk_response.error_detail().code() != static_cast<int>(absl::StatusCode::kOk)) {
return absl::Status(static_cast<absl::StatusCode>(nighthawk_response.error_detail().code()),
nighthawk_response.error_detail().message());
}

nighthawk::adaptive_load::BenchmarkResult benchmark_result;
*benchmark_result.mutable_nighthawk_service_output() = nighthawk_response.output();

// A map containing all available MetricsPlugins: preloaded custom plugins shared across all
// benchmarks, and a freshly instantiated builtin plugin for this benchmark only.
absl::flat_hash_map<std::string, MetricsPlugin*> name_to_plugin_map;
for (const auto& name_plugin_pair : name_to_custom_metrics_plugin_map) {
name_to_plugin_map[name_plugin_pair.first] = name_plugin_pair.second.get();
}
auto builtin_plugin =
std::make_unique<NighthawkStatsEmulatedMetricsPlugin>(nighthawk_response.output());
name_to_plugin_map["nighthawk.builtin"] = builtin_plugin.get();

// MetricSpecs in original order of definition.
std::vector<const nighthawk::adaptive_load::MetricSpec*> metric_specs;
// Pointer to the corresponding ThresholdSpec, or nullptr for informational metrics.
absl::flat_hash_map<const nighthawk::adaptive_load::MetricSpec*,
const nighthawk::adaptive_load::ThresholdSpec*>
threshold_spec_from_metric_spec;
ExtractMetricSpecs(spec, metric_specs, threshold_spec_from_metric_spec);

std::vector<std::string> errors;
for (const nighthawk::adaptive_load::MetricSpec* metric_spec : metric_specs) {
absl::StatusOr<nighthawk::adaptive_load::MetricEvaluation> evaluation_or =
EvaluateMetric(*metric_spec, *name_to_plugin_map[metric_spec->metrics_plugin_name()],
threshold_spec_from_metric_spec[metric_spec]);
if (!evaluation_or.ok()) {
errors.emplace_back(absl::StrCat("Error evaluating metric: ", evaluation_or.status().code(),
": ", evaluation_or.status().message()));
continue;
}
*benchmark_result.mutable_metric_evaluations()->Add() = evaluation_or.value();
}
if (!errors.empty()) {
return absl::InternalError(absl::StrJoin(errors, "\n"));
}
return benchmark_result;
}

} // namespace Nighthawk
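
`AnalyzeNighthawkBenchmark` above merges the shared custom plugins and a per-benchmark builtin plugin into one non-owning map, evaluates every metric, and reports all failures together instead of stopping at the first. The block below is a minimal sketch of that pattern under stated assumptions: `FakePlugin`, `Outcome`, and `EvaluateAll` are hypothetical names, and plain `std::map` and `std::string` stand in for the absl and proto types. Unlike the diff, which indexes the map with `operator[]` and relies on earlier input validation, the sketch looks plugins up with `find()` and records an error for an unknown name; that is a design alternative, not what the PR does.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a MetricsPlugin: metric name -> value.
using FakePlugin = std::map<std::string, double>;

struct Outcome {
  std::vector<double> values;  // Values read successfully, in request order.
  std::string error;           // Empty on success; otherwise all messages joined by '\n'.
};

// Each request is a (plugin name, metric name) pair. The map does not own the plugins.
Outcome EvaluateAll(const std::vector<std::pair<std::string, std::string>>& requests,
                    const std::map<std::string, const FakePlugin*>& name_to_plugin_map) {
  Outcome outcome;
  std::vector<std::string> errors;
  for (const auto& [plugin_name, metric_name] : requests) {
    const auto plugin_it = name_to_plugin_map.find(plugin_name);
    if (plugin_it == name_to_plugin_map.end()) {
      errors.push_back("Unknown plugin '" + plugin_name + "'");
      continue;  // Keep going so that every failure is reported.
    }
    const FakePlugin& plugin = *plugin_it->second;
    const auto value_it = plugin.find(metric_name);
    if (value_it == plugin.end()) {
      errors.push_back("Error calling MetricsPlugin '" + plugin_name + "': no metric '" +
                       metric_name + "'");
      continue;
    }
    outcome.values.push_back(value_it->second);
  }
  // Join all collected messages into one error string.
  for (const std::string& message : errors) {
    if (!outcome.error.empty()) {
      outcome.error += "\n";
    }
    outcome.error += message;
  }
  return outcome;
}
```

Collecting every error before returning lets an operator fix all misconfigured metrics in one pass rather than discovering them one run at a time.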
25 changes: 25 additions & 0 deletions source/adaptive_load/metrics_evaluator_impl.h
@@ -0,0 +1,25 @@
#include "nighthawk/adaptive_load/metrics_evaluator.h"

namespace Nighthawk {

class MetricsEvaluatorImpl : public MetricsEvaluator {
public:
absl::StatusOr<nighthawk::adaptive_load::MetricEvaluation>
EvaluateMetric(const nighthawk::adaptive_load::MetricSpec& metric_spec,
MetricsPlugin& metrics_plugin,
const nighthawk::adaptive_load::ThresholdSpec* threshold_spec) const override;

void ExtractMetricSpecs(const nighthawk::adaptive_load::AdaptiveLoadSessionSpec& spec,
std::vector<const nighthawk::adaptive_load::MetricSpec*>& metric_specs,
absl::flat_hash_map<const nighthawk::adaptive_load::MetricSpec*,
const nighthawk::adaptive_load::ThresholdSpec*>&
threshold_spec_from_metric_spec) const override;

absl::StatusOr<nighthawk::adaptive_load::BenchmarkResult>
AnalyzeNighthawkBenchmark(const nighthawk::client::ExecutionResponse& nighthawk_response,
const nighthawk::adaptive_load::AdaptiveLoadSessionSpec& spec,
const absl::flat_hash_map<std::string, MetricsPluginPtr>&
name_to_custom_metrics_plugin_map) const override;
};

} // namespace Nighthawk
12 changes: 12 additions & 0 deletions test/adaptive_load/BUILD
@@ -30,6 +30,18 @@ envoy_cc_test(
],
)

envoy_cc_test(
name = "metrics_evaluator_test",
srcs = ["metrics_evaluator_test.cc"],
repository = "@envoy",
deps = [
":minimal_output",
"//source/adaptive_load:metrics_evaluator_impl",
"//source/adaptive_load:scoring_function_impl",
"//test/adaptive_load/fake_plugins/fake_metrics_plugin",
],
)

envoy_cc_test(
name = "metrics_plugin_test",
srcs = ["metrics_plugin_test.cc"],