Commit 3b0a224

WanzenBug authored and JoelColledge committed
monitoring: do not fire alerts for transitional states immediately
Some "exceptional" states are actually expected during resource creation/deletion. So give those states a short while to be ironed out by normal LINSTOR operations.

Signed-off-by: Moritz Wanzenböck <[email protected]>
1 parent bb61538 commit 3b0a224

1 file changed, 4 additions and 0 deletions

config/extras/monitoring/alerts.yaml
@@ -66,13 +66,15 @@ spec:
         description: |
           DRBD Resource "{{ $labels.name }}" on "{{ $labels.node }}" is not connected to "{{ $labels.conn_name }}": {{ $labels.drbd_connection_state }}.
       expr: drbd_connection_state{drbd_connection_state!="Connected"} > 0
+      for: 1m
       labels:
         severity: warn
     - alert: drbdDeviceNotUpToDate
       annotations:
         description: |
           DRBD device "{{ $labels.name }}" on "{{ $labels.node }}" has unexpected device state "{{ $labels.drbd_device_state }}".
       expr: drbd_device_state{drbd_device_state!~"UpToDate|Diskless"} > 0
+      for: 1m
       labels:
         severity: warn
     - alert: drbdDeviceUnintentionalDiskless
@@ -89,6 +91,7 @@ spec:
           DRBD device "{{ $labels.name }}" on "{{ $labels.node }}" has no quorum.
           This usually indicates connectivity issues.
       expr: drbd_device_quorum == 0
+      for: 1m
       labels:
         severity: warn
     - alert: drbdResourceSuspended
@@ -112,5 +115,6 @@ spec:
         description: |
           DRBD resource "{{ $labels.name }}" has no UpToDate replicas.
       expr: sum by (name) (drbd_device_state{drbd_device_state="UpToDate"}) == 0
+      for: 1m
       labels:
         severity: critical
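
One way to check the effect of the added `for: 1m` clauses is a Prometheus rule unit test. The sketch below is not part of this commit. It assumes the rule groups under `spec:` have been copied into a plain Prometheus rule file called `alerts.rules.yaml` (a hypothetical name), and the series label values (`pvc-example`, `node-a`) are made up. It exercises only `drbdDeviceNotUpToDate`: the alert should stay pending while the condition has held for less than a minute, and fire once it has persisted past that.

```yaml
# promtool unit test sketch. Assumption: the groups under `spec:` of
# config/extras/monitoring/alerts.yaml were copied into a plain rule file
# named alerts.rules.yaml (hypothetical name) next to this test file.
rule_files:
  - alerts.rules.yaml

evaluation_interval: 30s

tests:
  - interval: 30s
    input_series:
      # A device that stays Inconsistent for the whole test window
      # (samples at 0s, 30s, 1m, 1m30s, 2m). Label values are made up.
      - series: 'drbd_device_state{name="pvc-example", node="node-a", drbd_device_state="Inconsistent"}'
        values: '1 1 1 1 1'
    alert_rule_test:
      # 30s in: the expression is true, but `for: 1m` has not elapsed,
      # so the alert is only pending and nothing is expected to fire.
      - eval_time: 30s
        alertname: drbdDeviceNotUpToDate
        exp_alerts: []
      # 2m in: the condition has held for longer than 1m, so the alert fires.
      - eval_time: 2m
        alertname: drbdDeviceNotUpToDate
        exp_alerts:
          - exp_labels:
              severity: warn
              name: pvc-example
              node: node-a
              drbd_device_state: Inconsistent
            exp_annotations:
              description: |
                DRBD device "pvc-example" on "node-a" has unexpected device state "Inconsistent".
```

If saved as, say, `alerts.test.yaml`, this could be run with `promtool test rules alerts.test.yaml`. Before this commit the first case would be expected to fail, since without `for` the alert fires on the first evaluation where the expression is true.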