This repository was archived by the owner on Jan 16, 2025. It is now read-only.

Unit Test Rules with Prometheus in Gluster Mixins

Ankush Behl edited this page Jan 14, 2019 · 1 revision

Prometheus ships promtool, which includes a unit testing feature for Recording Rules and Alerts. It is based on the test framework that PromQL uses internally for its own unit tests.

Let's take the following alert (in prometheus_alerts.yaml) as an example:

"groups":
- "name": "gluster-utilization"
  "rules":
  - "alert": "GlusterVolumeUtilization"
    "annotations":
      "message": "Gluster Volume {{$labels.volume}} Utilization more than 80%"
    "expr": |
      100 * gluster:volume_capacity_used_bytes_total:sum
          / gluster:volume_capacity_total_bytes:sum > 80
    "for": "5m"
    "labels":
      "severity": "warning"

To test this, we create a test.yml as:

rule_files:
  - prometheus_alerts.yaml

evaluation_interval: 1m

tests:
 - interval: 1m
   input_series:
    - series: 'gluster:volume_capacity_used_bytes_total:sum{job="glusterd2-client",volume="vol1"}'
      values: '100000000+0x5 1717986919+0x15 2040109465+0x10'
    - series: 'gluster:volume_capacity_total_bytes:sum{job="glusterd2-client",volume="vol1"}'
      values: '2147483648+0x30'
   alert_rule_test:
    - alertname: GlusterVolumeUtilization
      eval_time: 6m
    - alertname: GlusterVolumeUtilization
      eval_time: 11m
      exp_alerts:
       - exp_labels:
           severity: warning
           job: glusterd2-client
           volume: vol1
         exp_annotations:
           message: 'Gluster Volume vol1 Utilization more than 80%'

Run the test with:

$ promtool test rules test.yml

It should print:

Unit Testing: test.yml
  SUCCESS

The test data is provided in the form of time series:

input_series:
 - series: 'gluster:volume_capacity_used_bytes_total:sum{job="glusterd2-client",volume="vol1"}'
   values: '100000000+0x5 1717986919+0x15 2040109465+0x10'
 - series: 'gluster:volume_capacity_total_bytes:sum{job="glusterd2-client",volume="vol1"}'
   values: '2147483648+0x30'
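  • series is the metric name with its labels, written in PromQL notation.

  • values gives the samples of the series in expanding notation, one sample per interval (1m here). A term 'a+bxn' starts at a and adds n further samples, each incremented by b.

    values: '100000000+0x5 1717986919+0x15 2040109465+0x10'

    This means,

    • from 0m to 5m (the initial sample plus 5 repeats), used bytes will be 100000000 bytes.
    • from 6m to 21m (the next 16 samples), used bytes will be 1717986919 bytes.
    • from 22m to 32m (the last 11 samples), used bytes will be 2040109465 bytes.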
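
To make the expanding notation concrete, here is a minimal Python sketch that unrolls a values string into individual samples. The helper name expand_values is hypothetical and is not part of promtool; it only handles plain numbers and the 'a+bxn' / 'a-bxn' forms, and leaves out the '_' and 'stale' markers.

```python
import re

# Matches terms such as '100000000+0x5' or '10-2x3'.
TERM = re.compile(r"(-?\d+(?:\.\d+)?)([+-])(\d+(?:\.\d+)?)x(\d+)")

def expand_values(spec):
    """Expand 'a+bxn' terms into a flat list of samples (hypothetical helper)."""
    samples = []
    for term in spec.split():
        match = TERM.fullmatch(term)
        if match is None:
            # A plain number is a single sample.
            samples.append(float(term))
            continue
        start, sign, step, count = match.groups()
        step_value = float(step) if sign == "+" else -float(step)
        # 'a+bxn' yields n+1 samples: a, a+b, ..., a+n*b
        samples.extend(float(start) + i * step_value for i in range(int(count) + 1))
    return samples

used = expand_values("100000000+0x5 1717986919+0x15 2040109465+0x10")
total = expand_values("2147483648+0x30")

# With a 1m interval, list index i is the sample at minute i.
print(len(used), len(total))
print(used[5], used[6])
```

With a 1m interval the list index equals the minute, so used[6] is the first sample of the second term.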

The test itself is written as:

alert_rule_test:
 - alertname: GlusterVolumeUtilization
   eval_time: 6m
 - alertname: GlusterVolumeUtilization
   eval_time: 11m
   exp_alerts:
    - exp_labels:
        severity: warning
        job: glusterd2-client
        volume: vol1
      exp_annotations:
        message: 'Gluster Volume vol1 Utilization more than 80%'
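  • alertname is the name of the alert to be tested.
  • eval_time is the time, measured from the start of the test data, at which the presence of the alert is checked.
  • exp_alerts lists the alerts expected to be firing at eval_time. When it is omitted, no alert is expected.
  • exp_labels are the labels the alert is expected to carry.
  • exp_annotations are the annotations the alert is expected to carry.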

This means,

  • for the first test, at 6m in, no alert is expected to be firing. The used bytes jump to 1717986919 at 6m, which puts the utilization just above 80%, but the rule has "for": "5m", so the alert is only pending at this point.
  • for the second test, at 11m in, the alert is expected to be firing, because the utilization has stayed above 80% for the full 5m. The alert raised is matched against the labels and annotations provided.
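
The timeline can be checked by hand with a small Python sketch. It is a simplified model of the "for" clause rather than the Prometheus implementation: it assumes one evaluation per minute and treats the alert as firing once the expression has been true for 5 consecutive minutes. The function name alert_state is hypothetical.

```python
# Samples at 1m intervals, written out from the test's values strings
# ('a+0xn' is the value a repeated n+1 times).
used = [100000000] * 6 + [1717986919] * 16 + [2040109465] * 11
total = [2147483648] * 31

THRESHOLD = 80    # percent, from the alert expression
FOR_MINUTES = 5   # from the rule's "for": "5m"

def breached(minute):
    """True when utilization at this minute is above the threshold."""
    return 100 * used[minute] / total[minute] > THRESHOLD

def alert_state(minute):
    """Return 'inactive', 'pending' or 'firing' (simplified model)."""
    if not breached(minute):
        return "inactive"
    # Walk back to the first minute of the current unbroken breach.
    start = minute
    while start > 0 and breached(start - 1):
        start -= 1
    return "firing" if minute - start >= FOR_MINUTES else "pending"

for minute in (5, 6, 10, 11):
    print(minute, alert_state(minute))
```

Under these assumptions the alert is inactive at 5m, pending from 6m to 10m, and firing at 11m, which matches the two eval_time checks in the test.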

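If a test fails, promtool prints the expected result and the obtained result. The example below is from a failing test for another alert, GlusterBrickUtilization, where the expected message names brick2 but the alert reports brick1: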

FAILED:
   alertname:GlusterBrickUtilization, time:11m0s,
       exp:"[Labels:{alertname=\"GlusterBrickUtilization\", brick_path=\"/host1/brick1\", host=\"host1\", job=\"glusterd2-client\", severity=\"warning\"} Annotations:{message=\"Gluster Brick host1:/host1/brick2 Utilization more than 80%\"}]",
       got:"[Labels:{alertname=\"GlusterBrickUtilization\", brick_path=\"/host1/brick1\", host=\"host1\", job=\"glusterd2-client\", severity=\"warning\"} Annotations:{message=\"Gluster Brick host1:/host1/brick1 Utilization more than 80%\"}]"
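
The exp and got lines are long and often differ in only one token. As a rough aid, the sketch below compares the two annotation messages from the output above word by word using Python's difflib. The function name differing_words is hypothetical, and the messages are copied by hand from the output rather than parsed from it.

```python
import difflib

# Annotation messages copied from the exp and got lines above.
exp_message = "Gluster Brick host1:/host1/brick2 Utilization more than 80%"
got_message = "Gluster Brick host1:/host1/brick1 Utilization more than 80%"

def differing_words(expected, obtained):
    """Return (expected, obtained) word runs that do not match."""
    exp_words = expected.split()
    got_words = obtained.split()
    matcher = difflib.SequenceMatcher(a=exp_words, b=got_words)
    pairs = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            pairs.append((" ".join(exp_words[i1:i2]), " ".join(got_words[j1:j2])))
    return pairs

print(differing_words(exp_message, got_message))
```

Here the only differing pair is the brick path, which points at the expected annotation in the test file as the thing to correct.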