stats: add histograms for request/response headers and body sizes (envoyproxy#11559)

Created a new struct for optional cluster stats. Moved the timeout budget stats into it and added request/response header and body size stats.

Risk Level: Low
Testing: Added test cases
Docs Changes: added
Release Notes: added

Fixes envoyproxy#10308, Fixes envoyproxy#3621

Signed-off-by: Ranjith Kumar <[email protected]>
Signed-off-by: scheler <[email protected]>
ranjithkumar007 authored and scheler committed Aug 4, 2020
1 parent 956905b commit 9ff7f61
Showing 18 changed files with 567 additions and 43 deletions.
26 changes: 24 additions & 2 deletions api/envoy/config/cluster/v3/cluster.proto
@@ -43,7 +43,7 @@ message ClusterCollection {
}

// Configuration for a single upstream cluster.
// [#next-free-field: 49]
// [#next-free-field: 50]
message Cluster {
option (udpa.annotations.versioning).previous_message_type = "envoy.api.v2.Cluster";

@@ -856,7 +856,12 @@ message Cluster {
// request. These show what percentage of a request's per try and global timeout was used. A value
// of 0 would indicate that none of the timeout was used or that the timeout was infinite. A value
// of 100 would indicate that the request took the entirety of the timeout given to it.
bool track_timeout_budgets = 47;
//
// .. attention::
//
// This field has been deprecated in favor of `timeout_budgets`, part of
// :ref:`track_cluster_stats <envoy_api_field_config.cluster.v3.Cluster.track_cluster_stats>`.
bool track_timeout_budgets = 47 [deprecated = true];

// Optional customization and configuration of upstream connection pool, and upstream type.
//
@@ -876,6 +881,9 @@ message Cluster {
// CONNECT only if a custom filter indicates it is appropriate, the custom factories
// can be registered and configured here.
core.v3.TypedExtensionConfig upstream_config = 48;

// Configuration to track optional cluster stats.
TrackClusterStats track_cluster_stats = 49;
}

// [#not-implemented-hide:] Extensible load balancing policy configuration.
@@ -936,3 +944,17 @@ message UpstreamConnectionOptions {
// If set then set SO_KEEPALIVE on the socket to enable TCP Keepalives.
core.v3.TcpKeepalive tcp_keepalive = 1;
}

message TrackClusterStats {
// If timeout_budgets is true, the :ref:`timeout budget histograms
// <config_cluster_manager_cluster_stats_timeout_budgets>` will be published for each
// request. These show what percentage of a request's per try and global timeout was used. A value
// of 0 would indicate that none of the timeout was used or that the timeout was infinite. A value
// of 100 would indicate that the request took the entirety of the timeout given to it.
bool timeout_budgets = 1;

// If request_response_sizes is true, then the :ref:`histograms
// <config_cluster_manager_cluster_stats_request_response_sizes>` tracking header and body sizes
// of requests and responses will be published.
bool request_response_sizes = 2;
}
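The hunk above deprecates `track_timeout_budgets` in favor of `track_cluster_stats.timeout_budgets`, but the code that consumes these fields is not shown here. A minimal sketch of how the two flags might be resolved, assuming the deprecated field keeps working during the deprecation window. `ToyTrackClusterStats`, `ToyClusterConfig` and both helper functions are hypothetical stand-ins, not the generated proto classes or Envoy code; only the field names come from the diff.

```cpp
#include <cassert>

// Hypothetical stand-in for the TrackClusterStats message (field names from the diff).
struct ToyTrackClusterStats {
  bool timeout_budgets = false;
  bool request_response_sizes = false;
};

// Hypothetical stand-in for the relevant part of the Cluster message.
struct ToyClusterConfig {
  bool track_timeout_budgets = false;       // deprecated field 47
  ToyTrackClusterStats track_cluster_stats; // new field 49
};

// Assumption: either the new flag or the deprecated flag enables the histograms.
inline bool timeoutBudgetsEnabled(const ToyClusterConfig& config) {
  return config.track_cluster_stats.timeout_budgets || config.track_timeout_budgets;
}

// The size histograms have no deprecated predecessor, so only the new flag applies.
inline bool requestResponseSizesEnabled(const ToyClusterConfig& config) {
  return config.track_cluster_stats.request_response_sizes;
}
```

Under this assumption, existing configurations that still set `track_timeout_budgets` keep publishing timeout budget histograms, while the size histograms stay off unless `request_response_sizes` is set.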
33 changes: 23 additions & 10 deletions api/envoy/config/cluster/v4alpha/cluster.proto

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

17 changes: 17 additions & 0 deletions docs/root/configuration/upstream/cluster_manager/cluster_stats.rst
@@ -314,3 +314,20 @@ Statistics for monitoring effective host weights when using the

min_entries_per_host, Gauge, Minimum number of entries for a single host
max_entries_per_host, Gauge, Maximum number of entries for a single host

.. _config_cluster_manager_cluster_stats_request_response_sizes:

Request Response Size statistics
--------------------------------

If :ref:`request response size statistics <envoy_v3_api_field_config.cluster.v3.Cluster.track_cluster_stats>` are tracked,
the following statistics are added under *cluster.<name>*:

.. csv-table::
:header: Name, Type, Description
:widths: 1, 1, 2

upstream_rq_headers_size, Histogram, Request headers size in bytes per upstream
upstream_rq_body_size, Histogram, Request body size in bytes per upstream
upstream_rs_headers_size, Histogram, Response headers size in bytes per upstream
upstream_rs_body_size, Histogram, Response body size in bytes per upstream
5 changes: 4 additions & 1 deletion docs/root/version_history/current.rst
@@ -36,13 +36,16 @@ Removed Config or Runtime

New Features
------------

* ext_authz filter: added support for emitting dynamic metadata for both :ref:`HTTP <config_http_filters_ext_authz_dynamic_metadata>` and :ref:`network <config_network_filters_ext_authz_dynamic_metadata>` filters.
* grpc-json: support specifying the `response_body` field for `google.api.HttpBody` messages.
* http: introduced new HTTP/1 and HTTP/2 codec implementations that will remove the use of exceptions for control flow due to high risk factors and instead use error statuses. The old behavior is deprecated, but can be used during the removal period by setting the runtime feature `envoy.reloadable_features.new_codec_behavior` to false. The removal period will be one month.
* load balancer: added a :ref:`configuration<envoy_v3_api_msg_config.cluster.v3.Cluster.LeastRequestLbConfig>` option to specify the active request bias used by the least request load balancer.
* redis: added :ref:`fault injection support for redis proxy <envoy_v3_api_field_extensions.filters.network.redis_proxy.v3.RedisProxy.faults>`, described further in :ref:`configuration documentation <config_network_filters_redis_proxy>`.
* stats: added optional histograms to :ref:`cluster stats <config_cluster_manager_cluster_stats_request_response_sizes>`
that track headers and body sizes of requests and responses.
* tap: added :ref:`generic body matcher<envoy_v3_api_msg_config.tap.v3.HttpGenericBodyMatch>` to scan http requests and responses for text or hex patterns.

Deprecated
----------
* The :ref:`track_timeout_budgets <envoy_v3_api_field_config.cluster.v3.Cluster.track_timeout_budgets>`
field has been deprecated in favor of `timeout_budgets`, part of :ref:`track_cluster_stats <envoy_v3_api_field_config.cluster.v3.Cluster.track_cluster_stats>`.
26 changes: 24 additions & 2 deletions generated_api_shadow/envoy/config/cluster/v3/cluster.proto


29 changes: 27 additions & 2 deletions generated_api_shadow/envoy/config/cluster/v4alpha/cluster.proto


35 changes: 33 additions & 2 deletions include/envoy/upstream/upstream.h
@@ -622,6 +622,15 @@ class PrioritySet {
REMAINING_GAUGE(remaining_retries, Accumulate) \
REMAINING_GAUGE(remaining_rq, Accumulate)

/**
* All stats tracking request/response headers and body sizes. Not used by default.
*/
#define ALL_CLUSTER_REQUEST_RESPONSE_SIZE_STATS(HISTOGRAM) \
HISTOGRAM(upstream_rq_headers_size, Bytes) \
HISTOGRAM(upstream_rq_body_size, Bytes) \
HISTOGRAM(upstream_rs_headers_size, Bytes) \
HISTOGRAM(upstream_rs_body_size, Bytes)

/**
* All stats around timeout budgets. Not used by default.
*/
@@ -650,13 +659,28 @@ struct ClusterCircuitBreakersStats {
ALL_CLUSTER_CIRCUIT_BREAKERS_STATS(GENERATE_GAUGE_STRUCT, GENERATE_GAUGE_STRUCT)
};

/**
* Struct definition for cluster request/response size stats. @see stats_macros.h
*/
struct ClusterRequestResponseSizeStats {
ALL_CLUSTER_REQUEST_RESPONSE_SIZE_STATS(GENERATE_HISTOGRAM_STRUCT)
};

using ClusterRequestResponseSizeStatsPtr = std::unique_ptr<ClusterRequestResponseSizeStats>;
using ClusterRequestResponseSizeStatsOptRef =
absl::optional<std::reference_wrapper<ClusterRequestResponseSizeStats>>;

/**
* Struct definition for cluster timeout budget stats. @see stats_macros.h
*/
struct ClusterTimeoutBudgetStats {
ALL_CLUSTER_TIMEOUT_BUDGET_STATS(GENERATE_HISTOGRAM_STRUCT)
};

using ClusterTimeoutBudgetStatsPtr = std::unique_ptr<ClusterTimeoutBudgetStats>;
using ClusterTimeoutBudgetStatsOptRef =
absl::optional<std::reference_wrapper<ClusterTimeoutBudgetStats>>;

/**
* All extension protocol specific options returned by the method at
* NamedNetworkFilterConfigFactory::createProtocolOptions
@@ -851,9 +875,16 @@ class ClusterInfo {
virtual ClusterLoadReportStats& loadReportStats() const PURE;

/**
* @return absl::optional<ClusterTimeoutBudgetStats>& stats on timeout budgets for this cluster.
* @return absl::optional<std::reference_wrapper<ClusterRequestResponseSizeStats>> stats to track
* headers/body sizes of request/response for this cluster.
*/
virtual ClusterRequestResponseSizeStatsOptRef requestResponseSizeStats() const PURE;

/**
* @return absl::optional<std::reference_wrapper<ClusterTimeoutBudgetStats>> stats on timeout
* budgets for this cluster.
*/
virtual const absl::optional<ClusterTimeoutBudgetStats>& timeoutBudgetStats() const PURE;
virtual ClusterTimeoutBudgetStatsOptRef timeoutBudgetStats() const PURE;

/**
* Returns an optional source address for upstream connections to bind to.
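The header changes above combine two patterns: an X-macro that lists the four histograms once, and an accessor that returns an optional reference wrapper instead of a reference to an optional. A minimal sketch of both, under stated assumptions: `std::optional` stands in for `absl::optional`, and `ToyHistogram`, `TOY_HISTOGRAM_MEMBER`, `ALL_SIZE_STATS`, `SizeStats` and `ToyClusterInfo` are hypothetical simplifications, not Envoy's `Stats::Histogram`, `GENERATE_HISTOGRAM_STRUCT` or `ClusterInfoImpl`. Storing the stats behind a `unique_ptr` is suggested by the `...Ptr` aliases but is not shown in this part of the diff.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <memory>
#include <optional>
#include <vector>

// Toy histogram that just stores every recorded value (hypothetical).
struct ToyHistogram {
  void recordValue(uint64_t value) { values.push_back(value); }
  std::vector<uint64_t> values;
};

// X-macro: the stat names are listed once; the unit argument is ignored in this sketch.
#define ALL_SIZE_STATS(HISTOGRAM)                                                                  \
  HISTOGRAM(upstream_rq_headers_size, Bytes)                                                       \
  HISTOGRAM(upstream_rq_body_size, Bytes)                                                          \
  HISTOGRAM(upstream_rs_headers_size, Bytes)                                                       \
  HISTOGRAM(upstream_rs_body_size, Bytes)

// Simplified generator: one member per entry, named with a trailing underscore.
#define TOY_HISTOGRAM_MEMBER(NAME, UNIT) ToyHistogram NAME##_;

struct SizeStats {
  ALL_SIZE_STATS(TOY_HISTOGRAM_MEMBER)
};

using SizeStatsPtr = std::unique_ptr<SizeStats>;
using SizeStatsOptRef = std::optional<std::reference_wrapper<SizeStats>>;

class ToyClusterInfo {
public:
  // The stats object is only allocated when tracking is enabled.
  explicit ToyClusterInfo(bool track_sizes)
      : size_stats_(track_sizes ? std::make_unique<SizeStats>() : nullptr) {}

  // Returns an empty optional when tracking is off, else a reference to the stats.
  SizeStatsOptRef requestResponseSizeStats() const {
    if (size_stats_ == nullptr) {
      return std::nullopt;
    }
    return std::ref(*size_stats_);
  }

private:
  SizeStatsPtr size_stats_;
};
```

Callers check `has_value()` and then use `->get()` to reach the struct, which matches the call shape in the connection manager changes.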
32 changes: 32 additions & 0 deletions source/common/http/conn_manager_impl.cc
@@ -607,6 +607,17 @@ ConnectionManagerImpl::ActiveStream::ActiveStream(ConnectionManagerImpl& connect

ConnectionManagerImpl::ActiveStream::~ActiveStream() {
stream_info_.onRequestComplete();
Upstream::HostDescriptionConstSharedPtr upstream_host =
connection_manager_.read_callbacks_->upstreamHost();

if (upstream_host != nullptr) {
Upstream::ClusterRequestResponseSizeStatsOptRef req_resp_stats =
upstream_host->cluster().requestResponseSizeStats();
if (req_resp_stats.has_value()) {
req_resp_stats->get().upstream_rq_body_size_.recordValue(stream_info_.bytesReceived());
req_resp_stats->get().upstream_rs_body_size_.recordValue(stream_info_.bytesSent());
}
}

// A downstream disconnect can be identified for HTTP requests when the upstream returns with a 0
// response code and when no other response flags are set.
@@ -722,6 +733,17 @@ void ConnectionManagerImpl::ActiveStream::chargeStats(const ResponseHeaderMap& h
return;
}

Upstream::HostDescriptionConstSharedPtr upstream_host =
connection_manager_.read_callbacks_->upstreamHost();

if (upstream_host != nullptr) {
Upstream::ClusterRequestResponseSizeStatsOptRef req_resp_stats =
upstream_host->cluster().requestResponseSizeStats();
if (req_resp_stats.has_value()) {
req_resp_stats->get().upstream_rs_headers_size_.recordValue(headers.byteSize());
}
}

connection_manager_.stats_.named_.downstream_rq_completed_.inc();
connection_manager_.listener_stats_.downstream_rq_completed_.inc();
if (CodeUtility::is1xx(response_code)) {
@@ -769,6 +791,16 @@ void ConnectionManagerImpl::ActiveStream::decodeHeaders(RequestHeaderMapPtr&& he
ScopeTrackerScopeState scope(this,
connection_manager_.read_callbacks_->connection().dispatcher());
request_headers_ = std::move(headers);
Upstream::HostDescriptionConstSharedPtr upstream_host =
connection_manager_.read_callbacks_->upstreamHost();

if (upstream_host != nullptr) {
Upstream::ClusterRequestResponseSizeStatsOptRef req_resp_stats =
upstream_host->cluster().requestResponseSizeStats();
if (req_resp_stats.has_value()) {
req_resp_stats->get().upstream_rq_headers_size_.recordValue(request_headers_->byteSize());
}
}

// Both saw_connection_close_ and is_head_request_ affect local replies: set
// them as early as possible.
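The three hunks above record at different points in the stream lifecycle: request header size when headers are decoded, response header size when stats are charged, and both body sizes in the destructor, once the totals are known. A minimal sketch of that timing, assuming toy types throughout. `ToyHistogram`, `SizeStats` and `ToyStream` are hypothetical; the real code reads body sizes from `stream_info_` and also checks that an upstream host exists before looking up the stats.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <optional>
#include <vector>

// Toy histogram that just stores every recorded value (hypothetical).
struct ToyHistogram {
  void recordValue(uint64_t value) { values.push_back(value); }
  std::vector<uint64_t> values;
};

struct SizeStats {
  ToyHistogram upstream_rq_headers_size_;
  ToyHistogram upstream_rq_body_size_;
  ToyHistogram upstream_rs_headers_size_;
  ToyHistogram upstream_rs_body_size_;
};

using SizeStatsOptRef = std::optional<std::reference_wrapper<SizeStats>>;

// Hypothetical stand-in for a stream; an empty optional means tracking is disabled.
class ToyStream {
public:
  explicit ToyStream(SizeStatsOptRef stats) : stats_(stats) {}

  // Body totals are only final when the stream ends, so they are recorded here.
  ~ToyStream() {
    if (stats_.has_value()) {
      stats_->get().upstream_rq_body_size_.recordValue(request_body_bytes_);
      stats_->get().upstream_rs_body_size_.recordValue(response_body_bytes_);
    }
  }

  // Header sizes are known as soon as the headers arrive.
  void onRequestHeaders(uint64_t header_bytes) {
    if (stats_.has_value()) {
      stats_->get().upstream_rq_headers_size_.recordValue(header_bytes);
    }
  }

  void onResponseHeaders(uint64_t header_bytes) {
    if (stats_.has_value()) {
      stats_->get().upstream_rs_headers_size_.recordValue(header_bytes);
    }
  }

  // Body chunks only accumulate; nothing is recorded per chunk.
  void onRequestBody(uint64_t bytes) { request_body_bytes_ += bytes; }
  void onResponseBody(uint64_t bytes) { response_body_bytes_ += bytes; }

private:
  SizeStatsOptRef stats_;
  uint64_t request_body_bytes_ = 0;
  uint64_t response_body_bytes_ = 0;
};
```

With this shape each stream contributes exactly one sample per histogram, and a stream built with an empty optional records nothing.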