Use scrapeconfig instead of servicemonitor #607

Merged (5 commits) on Jul 10, 2024
Changes from 2 commits
1 change: 1 addition & 0 deletions README.md
@@ -40,6 +40,7 @@ loaded for you.
 * smartgatewayCollectdEventsManifest
 * smartgatewayCeilometerEventsManifest
 * servicemonitorManifest
+* scrapeconfigManifest

 ## Development

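For illustration, the new `scrapeconfigManifest` parameter might be set on a ServiceTelemetry object roughly as follows. This is a sketch under assumptions: the `infra.watch` API group, the object name, the namespace, and the target address do not appear in this diff and are hypothetical. Only the parameter name, the `v1beta1` version, the `monitoring.rhobs` group, and the `app: smart-gateway` label come from the changes in this PR.

```yaml
apiVersion: infra.watch/v1beta1        # assumption: API group is not shown in this diff
kind: ServiceTelemetry
metadata:
  name: default                        # hypothetical
  namespace: service-telemetry         # hypothetical
spec:
  scrapeconfigManifest: |
    apiVersion: monitoring.rhobs/v1alpha1
    kind: ScrapeConfig
    metadata:
      name: extra-target               # hypothetical
      namespace: service-telemetry     # hypothetical
      labels:
        app: smart-gateway             # label used by the scrapeConfigSelector in this PR
    spec:
      staticConfigs:
      - targets:
        - 'extra-exporter.service-telemetry.svc:9100'   # hypothetical target
```

Judging by the task file added in this PR, a manifest supplied this way would be applied in addition to the per-Smart-Gateway ScrapeConfig rather than replacing it.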
@@ -203,6 +203,15 @@ spec:
       - kind: ServiceMonitors
         name: servicemonitors.monitoring.coreos.com
         version: v1
+      - kind: ScrapeConfigs
+        name: scrapeconfigs.monitoring.coreos.com
+        version: v1alpha1
+      - kind: ServiceMonitors

Review comment (collaborator, PR author) on the line above:
This was missing previously, and I'm not 100% sure of the effects. My guess is that it would have affected garbage collection in the operator-sdk, preventing these servicemonitors from being cleaned up automatically when the servicetelemetry object is deleted.

+        name: servicemonitors.monitoring.rhobs
+        version: v1
+      - kind: ScrapeConfigs
+        name: scrapeconfigs.monitoring.rhobs
+        version: v1alpha1
       version: v1beta1
   description: Service Telemetry Operator for monitoring clouds
   displayName: Service Telemetry Operator
@@ -378,17 +387,21 @@ spec:
         - apiGroups:
           - monitoring.coreos.com
           resources:
+          - scrapeconfigs
           - servicemonitors
           verbs:
           - get
           - create
+          - delete
         - apiGroups:
           - monitoring.rhobs
           resources:
+          - scrapeconfigs
           - servicemonitors
           verbs:
           - get
           - create
+          - delete
         - apiGroups:
           - apps
           resourceNames:
4 changes: 4 additions & 0 deletions deploy/role.yaml
@@ -129,17 +129,21 @@ rules:
 - apiGroups:
   - monitoring.coreos.com
   resources:
+  - scrapeconfigs
   - servicemonitors
   verbs:
   - get
   - create
+  - delete
 - apiGroups:
   - monitoring.rhobs
   resources:
+  - scrapeconfigs
   - servicemonitors
   verbs:
   - get
   - create
+  - delete
 - apiGroups:
   - apps
   resourceNames:
4 changes: 2 additions & 2 deletions roles/servicetelemetry/tasks/base_smartgateway.yml
@@ -6,8 +6,8 @@
   k8s:
     definition: "{{ lookup('template', manifest) | from_yaml }}"

-- name: Deploy SG-specific ServiceMonitor for metrics SGs
-  include_tasks: component_servicemonitor.yml
+- name: Deploy SG-specific ScrapeConfig for metrics SGs
+  include_tasks: component_scrapeconfig.yml
   when:
     - data_type == 'metrics'
     - has_monitoring_api | bool
87 changes: 87 additions & 0 deletions roles/servicetelemetry/tasks/component_scrapeconfig.yml
@@ -0,0 +1,87 @@
+- name: Look up prometheus-stf SA to get auth secret name
+  k8s_info:
+    api_version: v1
+    kind: ServiceAccount
+    namespace: '{{ ansible_operator_meta.namespace }}'
+    name: prometheus-stf
+  register: service_account
+
+- name: Look up auth secret to get token secret name
+  k8s_info:
+    api_version: v1
+    kind: Secret
+    namespace: '{{ ansible_operator_meta.namespace }}'
+    name: '{{ service_account.resources[0].secrets[0].name }}'
+  register: auth_secret
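The two lookups above follow a chain from the ServiceAccount to the secret that holds its token. A minimal sketch of the objects they might return is shown below. It assumes an OpenShift cluster where the first entry in the ServiceAccount's `secrets` list is the dockercfg secret and where that secret carries the `openshift.io/token-secret.name` annotation. The namespace and the random suffixes are hypothetical.

```yaml
# Hypothetical objects; suffixes and namespace are illustrative only.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus-stf
  namespace: service-telemetry
secrets:
- name: prometheus-stf-dockercfg-abcde   # read as service_account.resources[0].secrets[0].name
---
apiVersion: v1
kind: Secret
metadata:
  name: prometheus-stf-dockercfg-abcde
  namespace: service-telemetry
  annotations:
    # read as auth_secret.resources[0].metadata.annotations['openshift.io/token-secret.name']
    openshift.io/token-secret.name: prometheus-stf-token-fghij
type: kubernetes.io/dockercfg
```

If the first listed secret were a different one, or the annotation were absent, the later template lookup would not resolve to a token secret name.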

+- name: Create SG-specific Service Monitor manifest
+  set_fact:
+    sg_specific_scrapeconfig_manifest: |
+      apiVersion: {{ prometheus_operator_api_string | replace("/v1","/v1alpha1") }}
+      kind: ScrapeConfig
+      metadata:
+        labels:
+          app: smart-gateway
+        name: '{{ this_smartgateway }}'
+        namespace: '{{ ansible_operator_meta.namespace }}'
+      spec:
+        authorization:
+          type: bearer
+          credentials:
+            name: '{{ auth_secret.resources[0].metadata.annotations['openshift.io/token-secret.name'] }}'
+            key: token
+        metricRelabelings:
+        - action: labeldrop
+          regex: pod
+          sourcelabels: []
+        - action: labeldrop
+          regex: namespace
+          sourcelabels: []
+        - action: labeldrop
+          regex: instance
+          sourcelabels: []
+        - action: labeldrop
+          regex: job
+          sourcelabels: []
+        - action: labeldrop
+          regex: publisher
+          sourcelabels: []
+        scheme: HTTPS
+        scrapeInterval: {{ servicetelemetry_vars.backends.metrics.prometheus.scrape_interval }}
+        staticConfigs:
+        - targets:
+          - '{{ this_smartgateway }}.{{ ansible_operator_meta.namespace }}.svc:8083'
+        tlsConfig:
+          ca:
+            configMap:
+              name: serving-certs-ca-bundle
+              key: service-ca.crt
+          serverName: '{{ this_smartgateway }}.{{ ansible_operator_meta.namespace }}.svc'
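To make the template above easier to read, here is roughly what it might render to for one Smart Gateway. Every concrete value is an assumption chosen for illustration: the Smart Gateway name, the namespace, the token secret name, the 30s interval, and `prometheus_operator_api_string` being `monitoring.rhobs/v1`. The relabeling list is trimmed to two entries and shows only `action` and `regex`.

```yaml
# Hypothetical rendering; all concrete values below are assumptions.
apiVersion: monitoring.rhobs/v1alpha1    # "monitoring.rhobs/v1" with "/v1" replaced by "/v1alpha1"
kind: ScrapeConfig
metadata:
  labels:
    app: smart-gateway
  name: default-cloud1-coll-meter        # hypothetical this_smartgateway
  namespace: service-telemetry           # hypothetical operator namespace
spec:
  authorization:
    type: bearer
    credentials:
      name: prometheus-stf-token-fghij   # hypothetical token secret name
      key: token
  metricRelabelings:                     # trimmed; the template drops five labels
  - action: labeldrop
    regex: pod
  - action: labeldrop
    regex: namespace
  scheme: HTTPS
  scrapeInterval: 30s                    # hypothetical scrape_interval
  staticConfigs:
  - targets:
    - 'default-cloud1-coll-meter.service-telemetry.svc:8083'
  tlsConfig:
    ca:
      configMap:
        name: serving-certs-ca-bundle
        key: service-ca.crt
    serverName: 'default-cloud1-coll-meter.service-telemetry.svc'
```

Unlike a ServiceMonitor, which discovers endpoints through a Service selector, this form names the scrape target directly under `staticConfigs`.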

+- name: Create ScrapeConfig to scrape Smart Gateway
+  k8s:
+    state: '{{ "present" if servicetelemetry_vars.backends.metrics.prometheus.enabled else "absent" }}'
+    definition:
+      '{{ sg_specific_scrapeconfig_manifest }}'
+
+- name: Create additional ScrapeConfig if provided
+  k8s:
+    state: '{{ "present" if servicetelemetry_vars.backends.metrics.prometheus.enabled else "absent" }}'
+    definition:
+      '{{ scrapeconfig_manifest }}'
+  when: scrapeconfig_manifest is defined
+
+- name: Create additional ServiceMonitor if provided (legacy)
+  k8s:
+    state: '{{ "present" if servicetelemetry_vars.backends.metrics.prometheus.enabled else "absent" }}'
+    definition:
+      '{{ servicemonitor_manifest }}'
+  when: servicemonitor_manifest is defined
+
+- name: Remove (legacy) default ServiceMonitors
+  k8s:
+    state: absent
+    api_version: '{{ prometheus_operator_api_string }}'
+    kind: ServiceMonitor
+    namespace: '{{ ansible_operator_meta.namespace }}'
+    name: '{{ this_smartgateway }}'
52 changes: 0 additions & 52 deletions roles/servicetelemetry/tasks/component_servicemonitor.yml

This file was deleted.

2 changes: 1 addition & 1 deletion roles/servicetelemetry/templates/manifest_alertmanager.j2
@@ -12,7 +12,7 @@ spec:
 {% endif %}
   replicas: {{ servicetelemetry_vars.alerting.alertmanager.deployment_size }}
   serviceAccountName: alertmanager-stf
-  serviceMonitorSelector:
+  scrapeConfigSelector:
     matchLabels:
       app: smart-gateway
   listenLocal: true
2 changes: 1 addition & 1 deletion roles/servicetelemetry/templates/manifest_prometheus.j2
@@ -17,7 +17,7 @@ spec:
   ruleSelector: {}
   securityContext: {}
   serviceAccountName: prometheus-stf
-  serviceMonitorSelector:
+  scrapeConfigSelector:
     matchLabels:
       app: smart-gateway
   listenLocal: true
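The selector change in the Prometheus template pairs with the `app: smart-gateway` label that the new task file sets on each ScrapeConfig. A minimal sketch of that pairing follows. The object names, the namespace, and the use of the `monitoring.rhobs` group rather than `monitoring.coreos.com` are assumptions for illustration, and the Prometheus spec is reduced to the fields relevant here.

```yaml
apiVersion: monitoring.rhobs/v1          # assumption: the rhobs API group variant is in use
kind: Prometheus
metadata:
  name: default                          # hypothetical
  namespace: service-telemetry           # hypothetical
spec:
  serviceAccountName: prometheus-stf
  scrapeConfigSelector:
    matchLabels:
      app: smart-gateway                 # selects ScrapeConfigs carrying this label
---
apiVersion: monitoring.rhobs/v1alpha1
kind: ScrapeConfig
metadata:
  name: example-smart-gateway            # hypothetical
  namespace: service-telemetry           # hypothetical
  labels:
    app: smart-gateway                   # matches the selector above
spec:
  staticConfigs:
  - targets:
    - 'example-smart-gateway.service-telemetry.svc:8083'
```

A ScrapeConfig without the `app: smart-gateway` label would not match this selector.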
4 changes: 2 additions & 2 deletions tests/smoketest/smoketest.sh
@@ -129,8 +129,8 @@ echo "*** [INFO] Showing oc get all..."
 oc get all
 echo

-echo "*** [INFO] Showing servicemonitors..."
-oc get servicemonitors.monitoring.rhobs -o yaml
+echo "*** [INFO] Showing scrapeconfigs..."
+oc get scrapeconfigs.monitoring.rhobs -o yaml
 echo

 if [ "$SMOKETEST_VERBOSE" = "true" ]; then