[ATMOSPHERE-103] chore: Add loki rule to detect Nova cell down #495

Merged (8 commits) on Aug 31, 2024
37 changes: 37 additions & 0 deletions roles/loki/vars/main.yml
@@ -26,6 +26,18 @@ _loki_helm_values:
       replication_factor: 1
     limits_config:
       max_label_names_per_series: 25
+    rulerConfig:
+      alertmanager_url: http://alertmanager-operated.monitoring:9093
+      enable_alertmanager_v2: true
+      enable_api: true
+      rule_path: /var/loki/rules-temp
+      ring:
+        kvstore:
+          store: inmemory
+      storage:
+        type: local
+        local:
+          directory: /var/loki/rulestorage
     storage:
       type: filesystem
     schemaConfig:
@@ -45,6 +57,13 @@ _loki_helm_values:
       openstack-control-plane: enabled
     persistence:
       size: 256Gi
+    extraVolumeMounts:
+      - name: rules
+        mountPath: /var/loki/rulestorage/fake
+    extraVolumes:
+      - name: rules
+        configMap:
+          name: loki-alerting-rules
   write:
     replicas: 0
   read:
@@ -60,3 +79,21 @@ _loki_helm_values:
       openstack-control-plane: enabled
   lokiCanary:
     enabled: false
+  extraObjects:
+    - apiVersion: v1
+      kind: ConfigMap
+      metadata:
+        name: loki-alerting-rules
+        labels:
+          loki_rule: "atmosphere"
+      data:
+        loki-alerting-rules.yaml: |-
+          groups:
+            - name: additional-loki-rules
+              rules:
+                - alert: NovaCellNotResponding
+                  expr: 'count_over_time({pod_label_component="compute"} |= "not responding and hence is being omitted from the results" [1m]) > 0'
+                  labels:
+                    severity: critical
+                  annotations:
+                    summary: Nova Cell is not responding. It can cause port deletion in CAPI.
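
For reference, a minimal sketch of the rule file as the ruler would likely see it once the ConfigMap is mounted. It assumes Loki runs in single-tenant mode, where the tenant ID is `fake` and the ruler's local storage reads rules from `<directory>/<tenant>/`, which would explain the `/var/loki/rulestorage/fake` mount path. The commented-out `for` clause is a hypothetical addition, not part of this change; it only illustrates how the alert could be made to wait for the condition to persist.

```yaml
# Assumed path inside the pod, derived from the mountPath and the ConfigMap key:
#   /var/loki/rulestorage/fake/loki-alerting-rules.yaml
groups:
  - name: additional-loki-rules
    rules:
      - alert: NovaCellNotResponding
        # Fires when at least one matching log line appears within the last minute.
        expr: 'count_over_time({pod_label_component="compute"} |= "not responding and hence is being omitted from the results" [1m]) > 0'
        # for: 1m   # hypothetical: require the condition to hold before firing
        labels:
          severity: critical
        annotations:
          summary: Nova Cell is not responding. It can cause port deletion in CAPI.
```

Because the ruler is pointed at `alertmanager_url`, a firing rule in this group would be forwarded to the Alertmanager instance configured above.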