Helm chart #3

patrickmslatteryvt · 2019-08-18T23:28:30Z

Hi, this is an initial Helm chart for calert.
It implements a similar deployment to the one you have in the k8s folder but without Redis.

mr-karan · 2019-08-19T06:17:14Z

Thanks @patrickmslatteryvt I'll review this ASAP :)

MPV · 2020-04-20T13:26:30Z

This looks promising! What's next to help this moving forward? 😊

mr-karan · 2020-04-21T09:00:21Z

Can this be tested out locally? (without Tiller, ofcourse). If someone can help validating the chart and reviewing it, would be great! @MPV

MPV · 2020-04-21T09:38:06Z

I'm trying this out here now.

So far I've found 2 issues:

Hardcoded name of configmap in deployment.
Example prometheus-operator / alertmanager config creates errors.

1. Hardcoded configmap:

As I'm deploying this chart as a subchart as part of my "cluster-monitoring" chart, when the configmap is named after the release, I get cluster-monitoring-calert-config instead of just calert-config:

✅ The solution to this is just changing the referenced name of the configmap in the deployment:
https://github.com/patrickmslatteryvt/calert/blob/master/calert/templates/deployment.yaml#L58

-            name: calert-config
+            name: {{ template "calert.fullname" . }}-config

...as this is how the configmap is being named:
https://github.com/patrickmslatteryvt/calert/blob/master/calert/templates/configmap.yaml#L4

2. Broken alertmanager config

I get this error:
level=error ts=2020-04-21T09:15:12.828Z caller=coordinator.go:124 component=configuration msg="Loading configuration file failed" file=/etc/alertmanager/config/alertmanager.yaml err="undefined receiver \"null\" used in route"

And here is the rendered alertmanager config, after using the supplied example:
$ kubectl -n cluster-monitoring get secret alertmanager-cluster-monitoring-prometh-alertmanager -o yaml | yq r - "data[alertmanager.yaml]" | base64 --decode

global:
  resolve_timeout: 5m
receivers:
- name: google-chat
  webhook_configs:
  - url: http://calert.cluster-monitoring.svc.cluster.local:6000/create?room_name=my-room-is-here
route:
  group_by:
  - job
  group_interval: 5m
  group_wait: 30s
  receiver: "null" # <--- could this be giving the issue?
  repeat_interval: 12h
  routes:
  - group_by:
    - alertname
    match:
      severity: critical
    receiver: google-chat

This is probably due to the fact that the prometheus-operator's default config for alertmanager includes a route with the "null" recevier (and also such a receiver):
https://github.com/helm/charts/blob/1e6a30c/stable/prometheus-operator/values.yaml#L131

❓ I'm not yet sure how to solve this. Could be that we should just add that "null" receiver to the receivers block, or specify google-chat as the default receiver for the route. I'll try it out.

MPV · 2020-04-21T10:14:00Z

2. Fixed example Alertmanager config

Here's how I got it working by adapting the example alertmanager config:
https://github.com/patrickmslatteryvt/calert/blob/master/calert/README.md#alertmanager-configuration

alertmanager:
  config:
    route:
+     receiver: "null"
      routes:
      - match:
          severity: critical
        receiver: google-chat
        group_by: [alertname]
    receivers:
+   - name: "null"
    - name: 'google-chat'
      webhook_configs:
      - url: "http://calert.clu-inf-all.svc.cluster.local:6000/create?room_name=<room>"

The second change adds back the receiver that was missing from the example, while the first change just makes it explicitly used (as it would still be set like this while merging with the prometheus-operator default values).

MPV · 2020-04-21T10:25:21Z

I can confirm that I got it working with my changes above. 🎉

MPV · 2020-04-21T10:25:28Z

I've opened PRs to @patrickmslatteryvt's fork with the fixes:
patrickmslatteryvt#1
patrickmslatteryvt#2

MPV · 2020-05-05T13:10:44Z

@mr-karan ...but my changes above aren't any blockers for using this, so I would consider this PR tested and okay to merge, unless you have any other thoughts? If @patrickmslatteryvt isn't available at the moment, I could re-contribute those changes into this repo after merging this PR.

mr-karan · 2020-05-05T13:13:05Z

Thanks for your contribution @MPV. Much appreciated.

pslattery added 3 commits August 18, 2019 18:29

Helm chart for calert

30e8ab0

Readme for the calert chart

b3fa172

Alertmanager configuration

21127cf

This was referenced Apr 21, 2020

Add helm release name of configmap in deployment patrickmslatteryvt/calert#1

Closed

Add back null receiver to alertmanager example config patrickmslatteryvt/calert#2

Closed

mr-karan marked this pull request as ready for review May 5, 2020 13:12

mr-karan merged commit 19e30bd into mr-karan:master May 5, 2020

This was referenced May 6, 2020

Helm chart: Add helm release name of configmap in deployment #13

Merged

Helm chart: Add back null receiver to alertmanager example config #14

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Helm chart #3

Helm chart #3

patrickmslatteryvt commented Aug 18, 2019

mr-karan commented Aug 19, 2019

MPV commented Apr 20, 2020

mr-karan commented Apr 21, 2020

MPV commented Apr 21, 2020 •

edited

Loading

MPV commented Apr 21, 2020

MPV commented Apr 21, 2020

MPV commented Apr 21, 2020

MPV commented May 5, 2020

mr-karan commented May 5, 2020

Helm chart #3

Helm chart #3

Conversation

patrickmslatteryvt commented Aug 18, 2019

mr-karan commented Aug 19, 2019

MPV commented Apr 20, 2020

mr-karan commented Apr 21, 2020

MPV commented Apr 21, 2020 • edited Loading

1. Hardcoded configmap:

2. Broken alertmanager config

MPV commented Apr 21, 2020

2. Fixed example Alertmanager config

MPV commented Apr 21, 2020

MPV commented Apr 21, 2020

MPV commented May 5, 2020

mr-karan commented May 5, 2020

MPV commented Apr 21, 2020 •

edited

Loading