Ingester panics if partition is reassigned #1200
Comments
Looks like we need to change the counter that tracks actively used partitions to a gauge. We might also want to emit it more often than just once when starting to consume a partition, as otherwise a dashboard query may not find any samples from the time series. cc @vprithvi
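A minimal sketch of the second half of that suggestion, assuming a jaeger-lib-style gauge with only an absolute `Update(int64)` method (the interface stub, names, and interval here are illustrative, not taken from the actual ingester code): the current partition count is kept in a local variable and republished on a timer, so the time series keeps receiving samples even when no partitions are being claimed or released.

```go
package consumer

import (
	"sync/atomic"
	"time"
)

// Gauge is assumed to match the jaeger-lib gauge shape: an absolute Update only.
type Gauge interface {
	Update(int64)
}

// trackPartitions republishes the currently-held partition count every
// interval, so the metric has recent samples between rebalances as well.
// held is expected to be updated elsewhere with atomic.AddInt64 when
// partitions are claimed or released; the wiring here is illustrative only.
func trackPartitions(held *int64, gauge Gauge, interval time.Duration, done <-chan struct{}) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ticker.C:
			gauge.Update(atomic.LoadInt64(held))
		case <-done:
			return
		}
	}
}
```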
I ran into a similar issue in #1253; a temporary workaround is to use
bobrik added a commit to bobrik/jaeger that referenced this issue on Apr 23, 2019:
Counters cannot be decremented in Prometheus:

```
panic: counter cannot decrease in value
goroutine 895 [running]:
github.com/jaegertracing/jaeger/vendor/github.com/prometheus/client_golang/prometheus.(*counter).Add(0xc000790600, 0xbff0000000000000)
    /home/travis/gopath/src/github.com/jaegertracing/jaeger/vendor/github.com/prometheus/client_golang/prometheus/counter.go:71 +0xa3
github.com/jaegertracing/jaeger/vendor/github.com/uber/jaeger-lib/metrics/prometheus.(*counter).Inc(0xc0006b42a0, 0xffffffffffffffff)
    /home/travis/gopath/src/github.com/jaegertracing/jaeger/vendor/github.com/uber/jaeger-lib/metrics/prometheus/factory.go:183 +0x46
github.com/jaegertracing/jaeger/cmd/ingester/app/consumer.(*Consumer).handleMessages(0xc0004c4300, 0xf08c60, 0xc00054e630)
    /home/travis/gopath/src/github.com/jaegertracing/jaeger/cmd/ingester/app/consumer/consumer.go:124 +0x893
created by github.com/jaegertracing/jaeger/cmd/ingester/app/consumer.(*Consumer).Start.func1
    /home/travis/gopath/src/github.com/jaegertracing/jaeger/cmd/ingester/app/consumer/consumer.go:87 +0xbd
```

Gauges can, even though we have to keep an extra variable around to keep count. In the Prometheus Go library itself that is not necessary, as the Gauge type provides `Inc` and `Dec`, but Jaeger's wrapper does not have those exposed.

Fixes jaegertracing#1200.

Signed-off-by: Ivan Babrou <[email protected]>
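A rough sketch of the approach this commit message describes: keep a local count as the "extra variable" and publish it through the gauge's absolute `Update`. The `Gauge` interface is stubbed to the shape the message implies rather than imported from jaeger-lib, and the type and method names are placeholders, not the actual consumer code.

```go
package consumer

// Gauge is stubbed to the shape the commit message describes: an absolute
// Update only, with no Inc/Dec exposed by Jaeger's metrics wrapper.
type Gauge interface {
	Update(int64)
}

// partitionTracker keeps the "extra variable" the message mentions, because
// the gauge can only be set to an absolute value.
type partitionTracker struct {
	held  int64
	gauge Gauge
}

func (t *partitionTracker) partitionClaimed() {
	t.held++
	t.gauge.Update(t.held)
}

func (t *partitionTracker) partitionReleased() {
	t.held--
	t.gauge.Update(t.held)
}
```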
#1485 addresses the issue.
yurishkuro pushed a commit that referenced this issue on Apr 23, 2019:
* Switch from counter to a gauge for partitions held

Counters cannot be decremented in Prometheus:

```
panic: counter cannot decrease in value
goroutine 895 [running]:
github.com/jaegertracing/jaeger/vendor/github.com/prometheus/client_golang/prometheus.(*counter).Add(0xc000790600, 0xbff0000000000000)
    /home/travis/gopath/src/github.com/jaegertracing/jaeger/vendor/github.com/prometheus/client_golang/prometheus/counter.go:71 +0xa3
github.com/jaegertracing/jaeger/vendor/github.com/uber/jaeger-lib/metrics/prometheus.(*counter).Inc(0xc0006b42a0, 0xffffffffffffffff)
    /home/travis/gopath/src/github.com/jaegertracing/jaeger/vendor/github.com/uber/jaeger-lib/metrics/prometheus/factory.go:183 +0x46
github.com/jaegertracing/jaeger/cmd/ingester/app/consumer.(*Consumer).handleMessages(0xc0004c4300, 0xf08c60, 0xc00054e630)
    /home/travis/gopath/src/github.com/jaegertracing/jaeger/cmd/ingester/app/consumer/consumer.go:124 +0x893
created by github.com/jaegertracing/jaeger/cmd/ingester/app/consumer.(*Consumer).Start.func1
    /home/travis/gopath/src/github.com/jaegertracing/jaeger/cmd/ingester/app/consumer/consumer.go:87 +0xbd
```

Gauges can, even though we have to keep an extra variable around to keep count. In the Prometheus Go library itself that is not necessary, as the Gauge type provides `Inc` and `Dec`, but Jaeger's wrapper does not have those exposed.

Fixes #1200.

Signed-off-by: Ivan Babrou <[email protected]>

* Protect partitionsHeld in consumer by lock

Signed-off-by: Ivan Babrou <[email protected]>
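The second part of this squashed commit guards the shared count with a lock, since partitions can be claimed and released concurrently during a rebalance. A minimal sketch of that shape, with placeholder names rather than the merged code:

```go
package consumer

import "sync"

// Gauge is again assumed to expose only an absolute Update.
type Gauge interface{ Update(int64) }

// Consumer here is illustrative, not the real ingester consumer type.
type Consumer struct {
	partitionsHeldLock  sync.Mutex
	partitionsHeld      int64
	partitionsHeldGauge Gauge
}

func (c *Consumer) partitionClaimed() {
	c.partitionsHeldLock.Lock()
	defer c.partitionsHeldLock.Unlock()
	c.partitionsHeld++
	c.partitionsHeldGauge.Update(c.partitionsHeld)
}

func (c *Consumer) partitionReleased() {
	c.partitionsHeldLock.Lock()
	defer c.partitionsHeldLock.Unlock()
	c.partitionsHeld--
	c.partitionsHeldGauge.Update(c.partitionsHeld)
}
```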
Requirement - what kind of business use case are you trying to solve?
Ingester should not fall over when partition is reassigned to another consumer.
Problem - what in Jaeger blocks you from solving the requirement?
There is an existing `jaeger-spans` queue (using the protobuf encoder), with 6 partitions across 6 Kafka nodes. When running with multiple ingesters (v1.8.0), each ingester in turn will crash with logs that look like the following:
This results in ingesters never being able to handle any spans, as each ingester only runs for a short time before crashing and being restarted.
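For context on why a rebalance takes the whole process down: the stack traces quoted in the comments above show jaeger-lib's Prometheus counter wrapper (`metrics/prometheus.(*counter).Inc`) forwarding a delta of -1 (the `0xffffffffffffffff` argument) into client_golang's `counter.Add`, which panics on any negative value. A schematic reconstruction of that chain, built only from what the trace shows; the wrapper type and metric name here are illustrative, not the real code.

```go
package main

import "github.com/prometheus/client_golang/prometheus"

// wrapperCounter stands in for the jaeger-lib Prometheus-backed counter seen
// in the stack trace: a relative Inc(int64) forwarded to prometheus.Counter.Add.
type wrapperCounter struct {
	c prometheus.Counter
}

func (w wrapperCounter) Inc(delta int64) {
	// prometheus.Counter.Add panics with "counter cannot decrease in value"
	// whenever delta is negative.
	w.c.Add(float64(delta))
}

func main() {
	held := wrapperCounter{c: prometheus.NewCounter(prometheus.CounterOpts{
		Name: "demo_partitions_held", // hypothetical metric name
		Help: "Demo of a counter misused for a value that can go down.",
	})}
	held.Inc(1)  // partition claimed on assignment
	held.Inc(-1) // partition released on rebalance -> panic, crash loop
}
```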
Proposal - what do you suggest to solve the problem or improve the existing situation?
n/a
Any open questions to address
n/a