Remove the use of `metric.reporters` in OAuth metrics, use `strimzi.metric.reporters` instead #193
Changes from 1 commit
Commits: 39030c5, 9827715, dede670, db294ff, 611c52c, 78969ed
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -1225,21 +1225,38 @@ Configuring the metrics | |
|
||
By default, the gathering and exporting of metrics is disabled. Metrics are available to provide insight into the performance and failures during token validation, authorization operations and client authentication to the authorization server. You can also monitor the authorization server requests made by background services, such as the refreshing of JWKS keys and the refreshing of grants when `KeycloakAuthorizer` is used. | ||
|
||
You can enable metrics for token validation on the Kafka broker or for client authentication on the client by setting the following JAAS option to `true`: | ||
You can enable metrics for token validation and `KeycloakAuthorizer` authorization on the Kafka broker, or for client authentication on the client, by setting the following JAAS option to `true`: | ||
- `oauth.enable.metrics` (e.g.: "true") | ||
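For illustration, here is a minimal sketch of enabling these metrics. The invocation shown in the comments is illustrative, not a prescribed command line; the option names are the ones documented above:

```shell
# Enable OAuth metrics for token validation / client authentication.
# Any one of these mechanisms is picked up by the OAuth layer:
export OAUTH_ENABLE_METRICS=true     # env variable

# or as a JVM system property when starting the broker or client, e.g.:
#   KAFKA_OPTS="-Doauth.enable.metrics=true" bin/kafka-console-producer.sh ...

# or as the JAAS option oauth.enable.metrics="true" in the listener / client config
echo "OAUTH_ENABLE_METRICS=$OAUTH_ENABLE_METRICS"
```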
|
||
You can enable metrics for `KeycloakAuthorizer` by setting an analogous option in Kafka broker's `server.properties` file: | ||
You can also enable metrics only for `KeycloakAuthorizer` by setting an analogous option in the Kafka broker's `server.properties` file: | ||
Review discussion:
- Same as above ...
- I changed it to:
||
- `strimzi.authorization.enable.metrics` (e.g.: "true") | ||
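A minimal sketch of what that looks like in practice; the file path is a scratch file standing in for a real Kafka configuration directory:

```shell
# Append the authorizer-only metrics switch to the broker configuration
# (illustrative path, not a real Kafka install).
CONFIG=/tmp/server.properties
echo 'strimzi.authorization.enable.metrics=true' >> "$CONFIG"
grep 'strimzi.authorization.enable.metrics' "$CONFIG"
```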
|
||
If `OAUTH_ENABLE_METRICS` env variable is set or if `oauth.enable.metrics` system property is set, that will both also enable the metrics for `KeycloakAuthorizer`. | ||
If the `OAUTH_ENABLE_METRICS` env variable or the `oauth.enable.metrics` system property is set, that will also enable the metrics for `KeycloakAuthorizer` (as well as for token validation and client authentication). | ||
|
||
If `oauth.config.id` is specified in JAAS configuration of the listener or the client, it will be available in MBean / metric name as `contextId` attribute. If not specified, it will be calculated from JAAS configuration for the validator or default to `client` in client JAAS config, or `keycloak-authorizer` for KeycloakAuthorizer metrics. | ||
The OAuth metrics ignore the Kafka `metric.reporters` option in order to prevent automatically instantiating duplicate instances of reporters. Most reporters expect to be singleton objects and may not function properly in multiple copies. | ||
Instead, there is a `strimzi.metric.reporters` option where the reporters that support multiple copies can be specified for the purpose of metrics integration: | ||
- `strimzi.metric.reporters` (e.g.: "org.apache.kafka.common.metrics.JmxReporter", use ',' as a separator to enable multiple reporters) | ||
|
||
The configuration option `strimzi.metric.reporters` has to be configured as an env variable or a system property. Using it on the Kafka broker inside `server.properties` does not work reliably due to the multiple pluggability mechanisms that can be used (authorizer, authentication callback handler, client). | ||
Some of these mechanisms receive a filtered `server.properties`, so that only configuration recognised by Kafka makes it through. However, this is a global OAuth metrics configuration, and it is initialized on first use by any of the components, using the configuration provided to that component. | ||
Review discussion:
- Not a comment about this PR, really an observation about Kafka being inconsistent/the
- I thought the "contract" was to always pass
- Then @mstruk which plugin did you find where the
- It is the inter-broker communication client that gets initialized with the already parsed Configuration object, which then no longer contains Map<String, String> but Map<String, ?> and the unrecognised config options are missing.
||
Specifically, the inter-broker client using OAUTHBEARER might be the first to trigger OAuth Metrics initialisation on the broker, and does not see this config option. | ||
|
||
In order to reliably configure `strimzi.metric.reporters` one of the following options should be used when starting a Kafka broker: | ||
- `STRIMZI_METRIC_REPORTERS` env variable | ||
- `strimzi.metric.reporters` env variable or system property | ||
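For example, either of the following reliably passes the option to the broker. The reporter class is Kafka's standard `JmxReporter`; the start command in the comment is illustrative:

```shell
# Reliable ways to pass strimzi.metric.reporters to a broker:
export STRIMZI_METRIC_REPORTERS=org.apache.kafka.common.metrics.JmxReporter

# or as a system property on broker start, e.g.:
#   KAFKA_OPTS="-Dstrimzi.metric.reporters=org.apache.kafka.common.metrics.JmxReporter" \
#     bin/kafka-server-start.sh config/server.properties
echo "$STRIMZI_METRIC_REPORTERS"
```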
|
||
Review discussion:
- This is a lot of verbose description which isn't completely easy to understand, but which people would need to follow in order to make use of this feature. I don't see anywhere actual examples they could copy and paste and expect to work. Could we add some examples somewhere?
- There are different ways to configure it. The simplest one would be:
  I wouldn't go more complex with it. Let people read the docs. If the docs are not clear enough we'll improve. You can have more complex config for different contexts. For example for the Kafka client you can have:
  Start client:
  But it could also simply be:
- Added examples to README.
||
When OAuth metrics are enabled the OAuth layer cannot reuse existing metric reporter instances, and has to create its own copies of metric reporters. Notably, it does not create its own copy of `org.apache.kafka.common.metrics.JmxReporter` automatically. Instead, the reporter has to be explicitly specified in the `strimzi.metric.reporters` configuration as an item in the list of reporters to instantiate. | ||
At the moment there is no way to integrate with the existing Kafka metrics / reporters objects already instantiated in different Kafka runtimes (producer, consumer, broker, ...). | ||
|
||
Metrics are exposed through JMX managed beans. They can also be exposed as Prometheus metrics by using the Prometheus JMX Exporter agent, and mapping the JMX metrics names to prometheus metrics names. | ||
NOTE: This breaks compatibility with OAuth versions preceding 0.13.0 where `metric.reporters` was used to configure reporters which were consequently automatically instantiated twice. | ||
Review discussion:
- Is this compatibility break unavoidable?
- Not breaking compatibility means to keep checking if 'metric.reporters' is set and if it is set instantiating the configured reporters. That is what happens when we configure CruiseControl metric reporter which results in exceptions at start time. We could say 'if
||
|
||
The OAuth metrics also honor the `metric.reporters`, `metrics.num.samples`, `metrics.recording.level` and `metrics.sample.window.ms` configurations of Kafka runtimes. When OAuth metrics are enabled the OAuth layer creates duplicate instances of `JmxReporter` and the configured `MetricReporter`s since at the moment there is no other way to integrate with the existing metrics system of Kafka runtimes. | ||
|
||
When OAuth metrics are enabled, managed beans are registered on demand, containing the attributes that are easily translated into Prometheus metrics. | ||
The Kafka options that control sampling are honored: `metrics.num.samples`, `metrics.recording.level` and `metrics.sample.window.ms`. | ||
|
||
A common use-case is for metrics to be exposed through JMX managed beans. They can then also be exposed as Prometheus metrics by using the Prometheus JMX Exporter agent, and mapping the JMX metrics names to prometheus metrics names. | ||
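As a sketch of that use-case, a Prometheus JMX Exporter rules file maps MBean names to Prometheus metric names. The MBean pattern below is purely illustrative and is not the actual OAuth MBean naming scheme; consult the registered MBeans for the real names:

```shell
# Hypothetical Prometheus JMX Exporter rules file; the MBean pattern
# is an illustrative placeholder, not the real OAuth MBean layout.
cat > /tmp/oauth-jmx-rules.yaml <<'EOF'
rules:
  - pattern: "strimzi.oauth<type=(.+), contextId=(.+)><>(count|totalTimeMs)"
    name: "strimzi_oauth_$1_$3"
    labels:
      context: "$2"
EOF
head -n 1 /tmp/oauth-jmx-rules.yaml
```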
If `oauth.config.id` is specified in JAAS configuration of the listener or the client, it will be available in MBean / metric name as `contextId` attribute. If not specified, it will be calculated from JAAS configuration for the validator or default to `client` in client JAAS config, or `keycloak-authorizer` for KeycloakAuthorizer metrics. | ||
|
||
When `JmxReporter` is enabled, managed beans are registered on demand, containing the attributes that are easily translated into Prometheus metrics. | ||
|
||
Each registered MBean contains two counter variables - `count`, and `totalTimeMs`. | ||
It also contains three gauge variables - `minTimeMs`, `maxTimeMs` and `avgTimeMs`. These are measured within the configured sample time window. | ||
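As a rough illustration of how the variables relate, an average time is a total time divided by a count. The numbers below are made up, and the real `avgTimeMs` gauge is computed over the configured sample window, so this is only a sketch of the arithmetic:

```shell
# With count=42 invocations totalling totalTimeMs=126 ms,
# the average time per invocation is 3.0 ms.
count=42
totalTimeMs=126
awk -v c="$count" -v t="$totalTimeMs" 'BEGIN { printf "avgTimeMs=%.1f\n", t/c }'
```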
|
@@ -0,0 +1,17 @@ | ||
/* | ||
* Copyright 2017-2023, Strimzi authors. | ||
* License: Apache License 2.0 (see the file LICENSE or http://apache.org/licenses/LICENSE-2.0.html). | ||
*/ | ||
package io.strimzi.kafka.oauth.metrics; | ||
|
||
import io.strimzi.kafka.oauth.common.Config; | ||
|
||
/** | ||
* Configuration that can be specified as ENV vars, System properties or in <code>server.properties</code> configuration file, | ||
* but not as part of the JAAS configuration. | ||
*/ | ||
public class GlobalConfig extends Config { | ||
|
||
/** The name of the 'strimzi.metric.reporters' config option */ | ||
public static final String STRIMZI_METRIC_REPORTERS = "strimzi.metric.reporters"; | ||
} |
|
@@ -10,7 +10,6 @@ | |
import org.apache.kafka.common.MetricName; | ||
import org.apache.kafka.common.config.AbstractConfig; | ||
import org.apache.kafka.common.config.ConfigDef; | ||
import org.apache.kafka.common.metrics.JmxReporter; | ||
import org.apache.kafka.common.metrics.KafkaMetricsContext; | ||
import org.apache.kafka.common.metrics.MetricConfig; | ||
import org.apache.kafka.common.metrics.Metrics; | ||
|
@@ -24,23 +23,40 @@ | |
import org.apache.kafka.common.utils.Time; | ||
import org.slf4j.Logger; | ||
import org.slf4j.LoggerFactory; | ||
|
||
import java.util.ArrayList; | ||
import java.util.Collections; | ||
import java.util.HashMap; | ||
import java.util.List; | ||
import java.util.Map; | ||
import java.util.concurrent.ConcurrentHashMap; | ||
import java.util.concurrent.TimeUnit; | ||
import java.util.stream.Collectors; | ||
|
||
import static io.strimzi.kafka.oauth.metrics.GlobalConfig.STRIMZI_METRIC_REPORTERS; | ||
import static org.apache.kafka.clients.CommonClientConfigs.CLIENT_ID_CONFIG; | ||
import static org.apache.kafka.clients.CommonClientConfigs.METRICS_NUM_SAMPLES_CONFIG; | ||
import static org.apache.kafka.clients.CommonClientConfigs.METRICS_RECORDING_LEVEL_CONFIG; | ||
import static org.apache.kafka.clients.CommonClientConfigs.METRICS_SAMPLE_WINDOW_MS_CONFIG; | ||
|
||
/** | ||
* The singleton for handling a cache of all the Sensors to prevent unnecessary redundant re-registrations. | ||
* There is a one-to-one mapping between a SensorKey and a Sensor, and one-to-one mapping between a Sensor and an MBean name. | ||
* | ||
* MBeans are registered as requested by JmxReporter attached to the Metrics object. | ||
* There is a one-to-one mapping between a <code>SensorKey</code> and a <code>Sensor</code>, and one-to-one mapping between a <code>Sensor</code> and an <code>MBean</code> name. | ||
* <p> | ||
* MBeans are registered as requested by <code>JmxReporter</code> attached to the <code>Metrics</code> object. | ||
* The <code>JmxReporter</code> has to be explicitly configured using a config option <code>strimzi.metric.reporters</code>, which is | ||
* analogous to the Kafka <code>metric.reporters</code> configuration option. It is not configured by default. | ||
* <p> | ||
* Since OAuth instantiates its own <code>Metrics</code> object it also has to instantiate reporters to attach them to this <code>Metrics</code> object. | ||
* To prevent double instantiation of <code>MetricReporter</code> objects that are required to be singletons, all <code>MetricReporter</code> objects | ||
* to be integrated with <code>OAuthMetrics</code> have to be separately instantiated. | ||
* <p> | ||
* Example: | ||
* <pre> | ||
* strimzi.metric.reporters=org.apache.kafka.common.metrics.JmxReporter | ||
* </pre> | ||
* Note: On the Kafka broker it is best to use <code>STRIMZI_METRIC_REPORTERS</code> env variable or <code>strimzi.metric.reporters</code> system property, | ||
* rather than a `server.properties` global configuration option. | ||
*/ | ||
public class OAuthMetrics { | ||
|
||
|
@@ -57,9 +73,14 @@ public class OAuthMetrics { | |
* | ||
* @param configMap Configuration properties | ||
*/ | ||
@SuppressWarnings("unchecked") | ||
OAuthMetrics(Map<String, ?> configMap) { | ||
this.configMap = configMap; | ||
this.config = new Config(configMap); | ||
|
||
// Make sure to add the resolved 'strimzi.metric.reporters' configuration to the config map | ||
((Map<String, Object>) configMap).put(STRIMZI_METRIC_REPORTERS, config.getValue(STRIMZI_METRIC_REPORTERS)); | ||
this.configMap = configMap; | ||
|
||
this.metrics = initKafkaMetrics(); | ||
} | ||
|
||
|
@@ -90,26 +111,48 @@ private Metrics initKafkaMetrics() { | |
|
||
private List<MetricsReporter> initReporters() { | ||
AbstractConfig kafkaConfig = initKafkaConfig(); | ||
List<MetricsReporter> reporters = kafkaConfig.getConfiguredInstances(CommonClientConfigs.METRIC_REPORTER_CLASSES_CONFIG, | ||
MetricsReporter.class); | ||
|
||
JmxReporter reporter = new JmxReporter(); | ||
reporter.configure(configMap); | ||
|
||
reporters.add(reporter); | ||
List<MetricsReporter> reporters = new ArrayList<>(3); | ||
|
||
if (configMap.get(STRIMZI_METRIC_REPORTERS) != null) { | ||
reporters = kafkaConfig.getConfiguredInstances(STRIMZI_METRIC_REPORTERS, MetricsReporter.class); | ||
} else { | ||
log.warn("Configuration option '{}' is not set, OAuth metrics will not be exported to JMX", STRIMZI_METRIC_REPORTERS); | ||
Review discussion:
- We define the config as low importance below and without a default value but if it's not set we emit a warning. Should we increase the importance or downgrade that log level?
- I don't know how you decide on the importance of the particular configuration option. In this case the warning is most welcome because it is almost certainly an omission that STRIMZI_METRIC_REPORTERS was not configured. With metrics enabled and not exported to JMX there is no way to collect metrics by any mechanism (as far as I know).
- If not setting it generates warnings, it looks pretty important to me. That said I'm not sure what conventions are used in other parts of the Strimzi code so if none of the maintainers pick on this, maybe we can ignore it.
- I would agree with Mickael that this seems a bit weird. I guess when you put it that enabling metrics but not configuring the metrics reporter would be wrong, that makes sense. But why does the user need to:
  Why not do just one and have the other done automatically by the OAuth library? I guess I'm missing some obvious reason why it is not possible, but it would be good to understand it.
- For OAuth metrics I didn't want them to be enabled by default in order to avoid potential issues (performance or memory consumption for example). And that did allow us to very easily work around the CruiseControl issue - everything was fine until someone enabled OAuth metrics. The solution was - don't enable OAuth metrics. OAuth metrics also has finer control on metrics collection. You can enable metrics globally, or for just particular listeners, or just for KeycloakAuthorizer. Kafka metrics are enabled by default and which of them are enabled is controlled by metrics.recording.level. But as far as I understand there is no way to completely turn them off. For OAuth we could change the default for the configuration (enabled by default and you can disable it), but if we otherwise keep backwards compatibility around I would prefer for now to NOT have OAuth metrics enabled by default. Let's have it opt-in until we're confident that there are no issues when using them (like the issue this PR is addressing of things crashing when a reporter not compatible with OAuth metrics is configured).
- I can understand why OAuth metrics should not be enabled by default. But why does the user need to configure the
- Ah, yes that's a great point. I was motivated by aligning with where Kafka's handling of
  We can opt to not go this way to avoid needing to set two configurations (Kafka does not have that problem since its metrics are always enabled). We could for example handle omission of
  That probably makes more sense in our case as then it is not necessary to always set it, and user should consult the documentation anyway to determine the behaviour and that it is not the same as for Kafka metrics configured via
- Also keep in mind that hopefully that should all disappear in the near future once KIP-877 gets in Kafka. So maybe we want to keep things simple.
- Ok, I'll modify the logic as described above. It means no backwards compatibility in the sense that
||
} | ||
return reporters; | ||
} | ||
|
||
private AbstractConfig initKafkaConfig() { | ||
ConfigDef configDef = new ConfigDef() | ||
.define(CommonClientConfigs.METRIC_REPORTER_CLASSES_CONFIG, | ||
ConfigDef.Type.LIST, | ||
Collections.emptyList(), | ||
new ConfigDef.NonNullValidator(), | ||
ConfigDef.Importance.LOW, | ||
CommonClientConfigs.METRIC_REPORTER_CLASSES_DOC); | ||
|
||
return new AbstractConfig(configDef, configMap); | ||
ConfigDef configDef = addMetricReporterToConfigDef(new ConfigDef(), STRIMZI_METRIC_REPORTERS); | ||
return new AbstractConfig(configDef, toMapOfStringValues(configMap)); | ||
} | ||
|
||
private ConfigDef addMetricReporterToConfigDef(ConfigDef configDef, String name) { | ||
return configDef.define(name, | ||
ConfigDef.Type.LIST, | ||
Collections.emptyList(), | ||
new ConfigDef.NonNullValidator(), | ||
ConfigDef.Importance.LOW, | ||
CommonClientConfigs.METRIC_REPORTER_CLASSES_DOC); | ||
} | ||
|
||
private Map<String, String> toMapOfStringValues(Map<String, ?> configMap) { | ||
HashMap<String, String> result = new HashMap<>(); | ||
for (Map.Entry<String, ?> ent: configMap.entrySet()) { | ||
Object val = ent.getValue(); | ||
if (val == null) { | ||
continue; | ||
} | ||
if (val instanceof Class) { | ||
result.put(ent.getKey(), ((Class<?>) val).getCanonicalName()); | ||
|
||
} else if (val instanceof List) { | ||
String stringVal = ((List<?>) val).stream().map(String::valueOf).collect(Collectors.joining(",")); | ||
if (!"".equals(stringVal)) { | ||
|
||
result.put(ent.getKey(), stringVal); | ||
} | ||
} else { | ||
result.put(ent.getKey(), String.valueOf(ent.getValue())); | ||
} | ||
} | ||
return result; | ||
} | ||
|
||
private KafkaMetricsContext createKafkaMetricsContext() { | ||
|
|
@@ -39,6 +39,13 @@ public static synchronized void configure(Map<String, ?> configs) { | |
} | ||
} | ||
|
||
/** | ||
* Release the configured Services instance so it can be reinitialised again | ||
*/ | ||
public static synchronized void close() { | ||
services = null; | ||
Review discussion:
- This is not really closing services
- Maybe one day it will be. There are at the moment no threads running in there so I simply release the memory. You think this should be called differently or the docs should be different or you think the implementation should be different?
||
} | ||
|
||
/** | ||
* Get a configured singleton instance | ||
* | ||
|
|
@@ -93,4 +93,32 @@ public static void logStart(String msg) { | |
System.out.println("======== " + msg); | ||
System.out.println(); | ||
} | ||
|
||
public static int findFirstMatchingInLog(List<String> log, String regex) { | ||
int lineNum = 0; | ||
Pattern pattern = Pattern.compile(prepareRegex(regex)); | ||
for (String line: log) { | ||
if (pattern.matcher(line).matches()) { | ||
Review discussion:
- It seems a bit weird to have
- You're absolutely right. That's a dumb code. I'll remove the redundant prepareRegex() and use find()
||
return lineNum; | ||
} | ||
lineNum++; | ||
} | ||
return -1; | ||
} | ||
|
||
public static String prepareRegex(String regex) { | ||
String prefix = regex.startsWith("^") ? "" : ".*"; | ||
String suffix = regex.endsWith("$") ? "" : ".*"; | ||
return prefix + regex + suffix; | ||
} | ||
|
||
public static boolean checkLogForRegex(List<String> log, String regex) { | ||
Pattern pattern = Pattern.compile(prepareRegex(regex)); | ||
for (String line: log) { | ||
if (pattern.matcher(line).matches()) { | ||
return true; | ||
} | ||
} | ||
return false; | ||
} | ||
} |
Review discussion:
- Should it be Keycloak? Or should it be marked as a code to indicate some configuration option?
- I changed it to: `KeycloakAuthorizer`