Skip to content

Commit

Permalink
Merge pull request #625 from klinch0/feature/add-kafka-monitoring
Browse files Browse the repository at this point in the history
feature/add-kafka-monitoring
  • Loading branch information
lllamnyp authored Feb 13, 2025
2 parents 65036e8 + cc0222a commit ecfb02a
Show file tree
Hide file tree
Showing 27 changed files with 6,000 additions and 1,092 deletions.
2,940 changes: 2,940 additions & 0 deletions dashboards/kafka/strimzi-kafka.json

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions hack/download-dashboards.sh
Original file line number Diff line number Diff line change
Expand Up @@ -80,6 +80,7 @@ modules/340-monitoring-kubernetes/monitoring/grafana-dashboards//main/namespace/
modules/340-monitoring-kubernetes/monitoring/grafana-dashboards//main/capacity-planning/capacity-planning.json
modules/340-monitoring-kubernetes/monitoring/grafana-dashboards//flux/flux-control-plane.json
modules/340-monitoring-kubernetes/monitoring/grafana-dashboards//flux/flux-stats.json
modules/340-monitoring-kubernetes/monitoring/grafana-dashboards//kafka/strimzi-kafka.json
EOT


Expand Down
2 changes: 1 addition & 1 deletion packages/apps/kafka/Chart.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -16,7 +16,7 @@ type: application
# This is the chart version. This version number should be incremented each time you make changes
# to the chart and its templates, including the app version.
# Versions are expected to follow Semantic Versioning (https://semver.org/)
version: 0.3.1
version: 0.3.2

# This is the version number of the application being deployed. This version number should be
# incremented each time you make changes to the application. Versions are not expected to
Expand Down
12 changes: 12 additions & 0 deletions packages/apps/kafka/templates/kafka.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -57,6 +57,12 @@ spec:
class: {{ . }}
{{- end }}
deleteClaim: true
metricsConfig:
type: jmxPrometheusExporter
valueFrom:
configMapKeyRef:
name: {{ .Release.Name }}-metrics
key: kafka-metrics-config.yml
zookeeper:
replicas: {{ .Values.zookeeper.replicas }}
storage:
Expand All @@ -68,6 +74,12 @@ spec:
class: {{ . }}
{{- end }}
deleteClaim: false
metricsConfig:
type: jmxPrometheusExporter
valueFrom:
configMapKeyRef:
name: {{ .Release.Name }}-metrics
key: kafka-metrics-config.yml
entityOperator:
topicOperator: {}
userOperator: {}
Expand Down
198 changes: 198 additions & 0 deletions packages/apps/kafka/templates/metrics-configmap.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,198 @@
kind: ConfigMap
apiVersion: v1
metadata:
name: {{ .Release.Name }}-metrics
data:
kafka-metrics-config.yml: |
# See https://github.com/prometheus/jmx_exporter for more info about JMX Prometheus Exporter metrics
lowercaseOutputName: true
rules:
# Special cases and very specific rules
- pattern: kafka.server<type=(.+), name=(.+), clientId=(.+), topic=(.+), partition=(.*)><>Value
name: kafka_server_$1_$2
type: GAUGE
labels:
clientId: "$3"
topic: "$4"
partition: "$5"
- pattern: kafka.server<type=(.+), name=(.+), clientId=(.+), brokerHost=(.+), brokerPort=(.+)><>Value
name: kafka_server_$1_$2
type: GAUGE
labels:
clientId: "$3"
broker: "$4:$5"
- pattern: kafka.server<type=(.+), cipher=(.+), protocol=(.+), listener=(.+), networkProcessor=(.+)><>connections
name: kafka_server_$1_connections_tls_info
type: GAUGE
labels:
cipher: "$2"
protocol: "$3"
listener: "$4"
networkProcessor: "$5"
- pattern: kafka.server<type=(.+), clientSoftwareName=(.+), clientSoftwareVersion=(.+), listener=(.+), networkProcessor=(.+)><>connections
name: kafka_server_$1_connections_software
type: GAUGE
labels:
clientSoftwareName: "$2"
clientSoftwareVersion: "$3"
listener: "$4"
networkProcessor: "$5"
- pattern: "kafka.server<type=(.+), listener=(.+), networkProcessor=(.+)><>(.+-total):"
name: kafka_server_$1_$4
type: COUNTER
labels:
listener: "$2"
networkProcessor: "$3"
- pattern: "kafka.server<type=(.+), listener=(.+), networkProcessor=(.+)><>(.+):"
name: kafka_server_$1_$4
type: GAUGE
labels:
listener: "$2"
networkProcessor: "$3"
- pattern: kafka.server<type=(.+), listener=(.+), networkProcessor=(.+)><>(.+-total)
name: kafka_server_$1_$4
type: COUNTER
labels:
listener: "$2"
networkProcessor: "$3"
- pattern: kafka.server<type=(.+), listener=(.+), networkProcessor=(.+)><>(.+)
name: kafka_server_$1_$4
type: GAUGE
labels:
listener: "$2"
networkProcessor: "$3"
# Some percent metrics use MeanRate attribute
# Ex) kafka.server<type=(KafkaRequestHandlerPool), name=(RequestHandlerAvgIdlePercent)><>MeanRate
- pattern: kafka.(\w+)<type=(.+), name=(.+)Percent\w*><>MeanRate
name: kafka_$1_$2_$3_percent
type: GAUGE
# Generic gauges for percents
- pattern: kafka.(\w+)<type=(.+), name=(.+)Percent\w*><>Value
name: kafka_$1_$2_$3_percent
type: GAUGE
- pattern: kafka.(\w+)<type=(.+), name=(.+)Percent\w*, (.+)=(.+)><>Value
name: kafka_$1_$2_$3_percent
type: GAUGE
labels:
"$4": "$5"
# Generic per-second counters with 0-2 key/value pairs
- pattern: kafka.(\w+)<type=(.+), name=(.+)PerSec\w*, (.+)=(.+), (.+)=(.+)><>Count
name: kafka_$1_$2_$3_total
type: COUNTER
labels:
"$4": "$5"
"$6": "$7"
- pattern: kafka.(\w+)<type=(.+), name=(.+)PerSec\w*, (.+)=(.+)><>Count
name: kafka_$1_$2_$3_total
type: COUNTER
labels:
"$4": "$5"
- pattern: kafka.(\w+)<type=(.+), name=(.+)PerSec\w*><>Count
name: kafka_$1_$2_$3_total
type: COUNTER
# Generic gauges with 0-2 key/value pairs
- pattern: kafka.(\w+)<type=(.+), name=(.+), (.+)=(.+), (.+)=(.+)><>Value
name: kafka_$1_$2_$3
type: GAUGE
labels:
"$4": "$5"
"$6": "$7"
- pattern: kafka.(\w+)<type=(.+), name=(.+), (.+)=(.+)><>Value
name: kafka_$1_$2_$3
type: GAUGE
labels:
"$4": "$5"
- pattern: kafka.(\w+)<type=(.+), name=(.+)><>Value
name: kafka_$1_$2_$3
type: GAUGE
# Emulate Prometheus 'Summary' metrics for the exported 'Histogram's.
# Note that these are missing the '_sum' metric!
- pattern: kafka.(\w+)<type=(.+), name=(.+), (.+)=(.+), (.+)=(.+)><>Count
name: kafka_$1_$2_$3_count
type: COUNTER
labels:
"$4": "$5"
"$6": "$7"
- pattern: kafka.(\w+)<type=(.+), name=(.+), (.+)=(.*), (.+)=(.+)><>(\d+)thPercentile
name: kafka_$1_$2_$3
type: GAUGE
labels:
"$4": "$5"
"$6": "$7"
quantile: "0.$8"
- pattern: kafka.(\w+)<type=(.+), name=(.+), (.+)=(.+)><>Count
name: kafka_$1_$2_$3_count
type: COUNTER
labels:
"$4": "$5"
- pattern: kafka.(\w+)<type=(.+), name=(.+), (.+)=(.*)><>(\d+)thPercentile
name: kafka_$1_$2_$3
type: GAUGE
labels:
"$4": "$5"
quantile: "0.$6"
- pattern: kafka.(\w+)<type=(.+), name=(.+)><>Count
name: kafka_$1_$2_$3_count
type: COUNTER
- pattern: kafka.(\w+)<type=(.+), name=(.+)><>(\d+)thPercentile
name: kafka_$1_$2_$3
type: GAUGE
labels:
quantile: "0.$4"
# KRaft overall related metrics
# distinguish between always increasing COUNTER (total and max) and variable GAUGE (all others) metrics
- pattern: "kafka.server<type=raft-metrics><>(.+-total|.+-max):"
name: kafka_server_raftmetrics_$1
type: COUNTER
- pattern: "kafka.server<type=raft-metrics><>(current-state): (.+)"
name: kafka_server_raftmetrics_$1
value: 1
type: UNTYPED
labels:
$1: "$2"
- pattern: "kafka.server<type=raft-metrics><>(.+):"
name: kafka_server_raftmetrics_$1
type: GAUGE
# KRaft "low level" channels related metrics
# distinguish between always increasing COUNTER (total and max) and variable GAUGE (all others) metrics
- pattern: "kafka.server<type=raft-channel-metrics><>(.+-total|.+-max):"
name: kafka_server_raftchannelmetrics_$1
type: COUNTER
- pattern: "kafka.server<type=raft-channel-metrics><>(.+):"
name: kafka_server_raftchannelmetrics_$1
type: GAUGE
# Broker metrics related to fetching metadata topic records in KRaft mode
- pattern: "kafka.server<type=broker-metadata-metrics><>(.+):"
name: kafka_server_brokermetadatametrics_$1
type: GAUGE
zookeeper-metrics-config.yml: |
# See https://github.com/prometheus/jmx_exporter for more info about JMX Prometheus Exporter metrics
lowercaseOutputName: true
rules:
# replicated Zookeeper
- pattern: "org.apache.ZooKeeperService<name0=ReplicatedServer_id(\\d+)><>(\\w+)"
name: "zookeeper_$2"
type: GAUGE
- pattern: "org.apache.ZooKeeperService<name0=ReplicatedServer_id(\\d+), name1=replica.(\\d+)><>(\\w+)"
name: "zookeeper_$3"
type: GAUGE
labels:
replicaId: "$2"
- pattern: "org.apache.ZooKeeperService<name0=ReplicatedServer_id(\\d+), name1=replica.(\\d+), name2=(\\w+)><>(Packets\\w+)"
name: "zookeeper_$4"
type: COUNTER
labels:
replicaId: "$2"
memberType: "$3"
- pattern: "org.apache.ZooKeeperService<name0=ReplicatedServer_id(\\d+), name1=replica.(\\d+), name2=(\\w+)><>(\\w+)"
name: "zookeeper_$4"
type: GAUGE
labels:
replicaId: "$2"
memberType: "$3"
- pattern: "org.apache.ZooKeeperService<name0=ReplicatedServer_id(\\d+), name1=replica.(\\d+), name2=(\\w+), name3=(\\w+)><>(\\w+)"
name: "zookeeper_$4_$5"
type: GAUGE
labels:
replicaId: "$2"
memberType: "$3"
40 changes: 40 additions & 0 deletions packages/apps/kafka/templates/podscrape.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,40 @@
apiVersion: operator.victoriametrics.com/v1beta1
kind: VMPodScrape
metadata:
name: {{ .Release.Name }}
spec:
podMetricsEndpoints:
- port: tcp-prometheus
scheme: http
relabelConfigs:
- separator: ;
regex: __meta_kubernetes_pod_label_(strimzi_io_.+)
replacement: $1
action: labelmap
- sourceLabels: [__meta_kubernetes_namespace]
separator: ;
regex: (.*)
targetLabel: namespace
replacement: $1
action: replace
- sourceLabels: [__meta_kubernetes_pod_name]
separator: ;
regex: (.*)
targetLabel: pod
replacement: $1
action: replace
- sourceLabels: [__meta_kubernetes_pod_node_name]
separator: ;
regex: (.*)
targetLabel: node
replacement: $1
action: replace
- sourceLabels: [__meta_kubernetes_pod_host_ip]
separator: ;
regex: (.*)
targetLabel: node_ip
replacement: $1
action: replace
selector:
matchLabels:
app.kubernetes.io/instance: {{ .Release.Name }}
3 changes: 2 additions & 1 deletion packages/apps/versions_map
Original file line number Diff line number Diff line change
Expand Up @@ -23,7 +23,8 @@ kafka 0.2.1 3ac17018
kafka 0.2.2 d0758692
kafka 0.2.3 5ca8823
kafka 0.3.0 c07c4bbd
kafka 0.3.1 HEAD
kafka 0.3.1 b7375f73
kafka 0.3.2 HEAD
kubernetes 0.1.0 f642698
kubernetes 0.2.0 7cd7de73
kubernetes 0.3.0 7caccec1
Expand Down
1 change: 1 addition & 0 deletions packages/extra/monitoring/dashboards.list
Original file line number Diff line number Diff line change
Expand Up @@ -34,3 +34,4 @@ control-plane/kube-etcd
kubevirt/kubevirt-control-plane
flux/flux-control-plane
flux/flux-stats
kafka/strimzi-kafka
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
apiVersion: v2
appVersion: 0.43.0
appVersion: 0.45.0
description: 'Strimzi: Apache Kafka running on Kubernetes'
home: https://strimzi.io/
icon: https://raw.githubusercontent.com/strimzi/strimzi-kafka-operator/main/documentation/logo/strimzi_logo.png
Expand All @@ -24,4 +24,4 @@ maintainers:
name: strimzi-kafka-operator
sources:
- https://github.com/strimzi/strimzi-kafka-operator
version: 0.43.0
version: 0.45.0
Original file line number Diff line number Diff line change
Expand Up @@ -5,12 +5,15 @@ Strimzi provides a way to run an [Apache Kafka®](https://kafka.apache.org) clus
See our [website](https://strimzi.io) for more details about the project.

**!!! IMPORTANT !!!**
Upgrading to Strimzi 0.32 and newer directly from Strimzi 0.22 and earlier is no longer possible.
Please follow the [documentation](https://strimzi.io/docs/operators/latest/full/deploying.html#assembly-upgrade-str) for more details.

**!!! IMPORTANT !!!**
Strimzi 0.43.0 (and any of its patch releases) is the last Strimzi version with support for Kubernetes 1.23 and 1.24.
From Strimzi 0.44.0 on, Strimzi will support only Kubernetes 1.25 and newer.
* **Strimzi 0.45 is the last Strimzi version with support for ZooKeeper-based Apache Kafka clusters and MirrorMaker 1 deployments.**
**Please make sure to [migrate to KRaft](https://strimzi.io/docs/operators/latest/full/deploying.html#assembly-kraft-mode-str) and MirrorMaker 2 before upgrading to Strimzi 0.46 or newer.**
* Strimzi 0.45 is the last Strimzi version to include the [Strimzi EnvVar Configuration Provider](https://github.com/strimzi/kafka-env-var-config-provider) (deprecated in Strimzi 0.38.0) and [Strimzi MirrorMaker 2 Extensions](https://github.com/strimzi/mirror-maker-2-extensions) (deprecated in Strimzi 0.28.0).
Please use the Apache Kafka [EnvVarConfigProvider](https://github.com/strimzi/kafka-env-var-config-provider?tab=readme-ov-file#deprecation-notice) and [Identity Replication Policy](https://github.com/strimzi/mirror-maker-2-extensions?tab=readme-ov-file#identity-replication-policy) instead.
* From Strimzi 0.44.0 on, we support only Kubernetes 1.25 and newer.
Kubernetes 1.23 and 1.24 are not supported anymore.
* Upgrading to Strimzi 0.32 and newer directly from Strimzi 0.22 and earlier is no longer possible.
Please follow the [documentation](https://strimzi.io/docs/operators/latest/full/deploying.html#assembly-upgrade-str) for more details.

## Introduction

Expand All @@ -21,14 +24,16 @@ cluster using the [Helm](https://helm.sh) package manager.
### Supported Features

* **Manages the Kafka Cluster** - Deploys and manages all of the components of this complex application, including dependencies like Apache ZooKeeper® that are traditionally hard to administer.
* **KRaft support** - Allows running Apache Kafka clusters in the KRaft mode (without ZooKeeper).
* **KRaft support** - Allows running Apache Kafka clusters in the KRaft mode (without ZooKeeper).
* **Includes Kafka Connect** - Allows for configuration of common data sources and sinks to move data into and out of the Kafka cluster.
* **Topic Management** - Creates and manages Kafka Topics within the cluster.
* **User Management** - Creates and manages Kafka Users within the cluster.
* **Connector Management** - Creates and manages Kafka Connect connectors.
* **Includes Kafka Mirror Maker 1 and 2** - Allows for mirroring data between different Apache Kafka® clusters.
* **Includes Kafka MirrorMaker** - Allows for mirroring data between different Apache Kafka® clusters.
* **Includes HTTP Kafka Bridge** - Allows clients to send and receive messages through an Apache Kafka® cluster via the HTTP protocol.
* **Includes Cruise Control** - Automates the process of balancing partitions across an Apache Kafka® cluster.
* **Auto-rebalancing when scaling** - Automatically rebalance the Kafka cluster after a scale-up or before a scale-down.
* **Tiered storage** - Offloads older, less critical data to a lower-cost, lower-performance storage tier, such as object storage.
* **Prometheus monitoring** - Built-in support for monitoring using Prometheus.
* **Grafana Dashboards** - Built-in support for loading Grafana® dashboards via the grafana_sidecar

Expand Down Expand Up @@ -60,7 +65,7 @@ Strimzi is licensed under the [Apache License, Version 2.0](https://github.com/s

## Prerequisites

- Kubernetes 1.23+
- Kubernetes 1.25+

## Installing the Chart

Expand Down Expand Up @@ -97,7 +102,7 @@ the documentation for more details.
| `watchAnyNamespace` | Watch the whole Kubernetes cluster (all namespaces) | `false` |
| `defaultImageRegistry` | Default image registry for all the images | `quay.io` |
| `defaultImageRepository` | Default image registry for all the images | `strimzi` |
| `defaultImageTag` | Default image tag for all the images except Kafka Bridge | `0.43.0` |
| `defaultImageTag` | Default image tag for all the images except Kafka Bridge | `0.45.0` |
| `image.registry` | Override default Cluster Operator image registry | `nil` |
| `image.repository` | Override default Cluster Operator image repository | `nil` |
| `image.name` | Cluster Operator image name | `cluster-operator` |
Expand Down Expand Up @@ -161,7 +166,7 @@ the documentation for more details.
| `kafkaBridge.image.registry` | Override default Kafka Bridge image registry | `quay.io` |
| `kafkaBridge.image.repository` | Override default Kafka Bridge image repository | `strimzi` |
| `kafkaBridge.image.name` | Kafka Bridge image name | `kafka-bridge` |
| `kafkaBridge.image.tag` | Override default Kafka Bridge image tag | `0.30.0` |
| `kafkaBridge.image.tag` | Override default Kafka Bridge image tag | `0.31.1` |
| `kafkaBridge.image.digest` | Override Kafka Bridge image tag with digest | `nil` |
| `kafkaExporter.image.registry` | Override default Kafka Exporter image registry | `nil` |
| `kafkaExporter.image.repository` | Override default Kafka Exporter image repository | `nil` |
Expand Down
Loading

0 comments on commit ecfb02a

Please sign in to comment.