		level.Error(a.logger).Log("msg", "error on set alert", "err", err)
		continue
	}

	a.mtx.Lock()
	for _, l := range a.listeners {
		select {
		case l.alerts <- alert:
		case <-l.done:
		}
	}
	a.mtx.Unlock()
}

return nil
}
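For context, the select over l.alerts and l.done is the usual Go pattern for handing an alert to a subscriber without blocking forever if that subscriber has gone away. A minimal standalone illustration (the listener type and send helper here are invented for the example, not Alertmanager API):

```go
package main

import "fmt"

// listener mirrors the shape used in the snippet above: a channel that
// receives alerts and a done channel closed when the subscriber leaves.
type listener struct {
	alerts chan string
	done   chan struct{}
}

// send delivers an alert to l, but gives up if the listener is done, so
// a departed subscriber cannot block the ingestion loop indefinitely.
// It reports whether the alert was actually delivered.
func send(l listener, alert string) bool {
	select {
	case l.alerts <- alert:
		return true
	case <-l.done:
		return false
	}
}

func main() {
	live := listener{alerts: make(chan string, 1), done: make(chan struct{})}
	gone := listener{alerts: make(chan string), done: make(chan struct{})}
	close(gone.done) // this subscriber has already unsubscribed

	fmt.Println(send(live, "firing"), send(gone, "firing")) // true false
}
```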
Currently, every alert received via the API is sent directly to the internal memory store for alerts:
API: alertmanager/api/v2/api.go, line 348 (at 8ca1f66)
Memory store: alertmanager/provider/mem/mem.go, lines 149 to 180 (at 8ca1f66)
Every Set locks the internal store, which can become a performance issue (cf. #1201). Prometheus sends alerts in batches of 50, but nothing enforces this on the Alertmanager side. Anecdotally, I've seen a single Alertmanager become unresponsive when receiving ~50 ingest requests per second, with each request containing a single alert.
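To make the contention claim concrete, here is a self-contained toy sketch (not Alertmanager code; toyStore and its lock counter are invented for illustration) showing the quantity batching reduces: lock acquisitions per ingested alert.

```go
package main

import (
	"fmt"
	"sync"
)

// toyStore mimics the shape of the in-memory alert store: a map guarded
// by a mutex. LockCount records how often the mutex is taken.
type toyStore struct {
	mtx       sync.Mutex
	alerts    map[int]struct{}
	LockCount int
}

// SetOne stores a single alert, taking the lock once per call — the
// per-alert ingestion pattern described above.
func (s *toyStore) SetOne(id int) {
	s.mtx.Lock()
	s.LockCount++
	s.alerts[id] = struct{}{}
	s.mtx.Unlock()
}

// SetBatch stores a whole batch under one lock acquisition — the
// batched alternative this issue proposes.
func (s *toyStore) SetBatch(ids []int) {
	s.mtx.Lock()
	s.LockCount++
	for _, id := range ids {
		s.alerts[id] = struct{}{}
	}
	s.mtx.Unlock()
}

func main() {
	perAlert := &toyStore{alerts: map[int]struct{}{}}
	for i := 0; i < 50; i++ {
		perAlert.SetOne(i)
	}

	batched := &toyStore{alerts: map[int]struct{}{}}
	ids := make([]int, 50)
	for i := range ids {
		ids[i] = i
	}
	batched.SetBatch(ids)

	fmt.Println(perAlert.LockCount, batched.LockCount) // 50 1
}
```

The same shape, wrapped in testing.B loops, would be a reasonable starting point for the basic benchmark suggested below.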
Batching at the store level is probably the right place to implement this: alertmanager/store/store.go, line 95 (at 8ca1f66)
Adding a basic benchmark and a receive queue would be a good first step.
This is more or less inspired by common sense and by being reminded of how Kafka ingests messages.