Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inconsistent metric deduplication in Otel2PrometheusConverter #6277

Closed
nluk opened this issue Mar 7, 2024 · 1 comment · Fixed by #6308
Closed

Inconsistent metric deduplication in Otel2PrometheusConverter #6277

nluk opened this issue Mar 7, 2024 · 1 comment · Fixed by #6308
Labels
Bug Something isn't working

Comments

@nluk
Copy link

nluk commented Mar 7, 2024

Describe the bug
Prometheus MetricSnapshots constructor throws because of duplicated metric names. This is a part of the Otel2PrometheusConverter::convert call.

An Exception occurred while scraping metrics: java.lang.IllegalArgumentException: logback_events: duplicate metric name
	at io.prometheus.metrics.model.snapshots.MetricSnapshots.<init>(MetricSnapshots.java:43)
	at io.opentelemetry.exporter.prometheus.Otel2PrometheusConverter.convert(Otel2PrometheusConverter.java:112)
	at io.opentelemetry.exporter.prometheus.PrometheusMetricReader.collect(PrometheusMetricReader.java:56)
	at io.prometheus.metrics.model.registry.MultiCollector.collect(MultiCollector.java:26)
	at io.prometheus.metrics.model.registry.PrometheusRegistry.scrape(PrometheusRegistry.java:72)
	at io.prometheus.metrics.exporter.common.PrometheusScrapeHandler.scrape(PrometheusScrapeHandler.java:112)
	at io.prometheus.metrics.exporter.common.PrometheusScrapeHandler.handleRequest(PrometheusScrapeHandler.java:53)
	at io.prometheus.metrics.exporter.httpserver.MetricsHandler.handle(MetricsHandler.java:43)
	at jdk.httpserver/com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:95)
	at jdk.httpserver/sun.net.httpserver.AuthFilter.doFilter(AuthFilter.java:82)
	at jdk.httpserver/com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:98)
	at jdk.httpserver/sun.net.httpserver.ServerImpl$Exchange$LinkHandler.handle(ServerImpl.java:851)
	at jdk.httpserver/com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:95)
	at jdk.httpserver/sun.net.httpserver.ServerImpl$Exchange.run(ServerImpl.java:818)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
	at java.base/java.lang.Thread.run(Thread.java:833)

Steps to reproduce
Create two MetricData objects with different names, that resolve to the same prometheusName and run convert.

Testcase to reproduce the error: main...nluk:opentelemetry-java:bug/prometheus-converter-duplicate-metric-name

What did you expect to see?

Hard to say really. Metrics of the same type may be merged, but the fact that there are two competing metrics a.b and a_b might indicate a misconfiguration.

What did you see instead?

The Otel2PrometheusConverter::putOrMerge method uses snapshot.getMetadata().getName() as a unique map key. So metric.name and metric_name are two separate entries.

Prometheus compares sorted snapshot.getMetadata().getPrometheusName() link to find collisions. This means that metric.name and metric_name are both represented as metric_name, and a collision is detected.

What version and what artifacts are you using?
Artifacts: aws-opentelemetry-agent 1.32.1, so opentelemetry 1.32.1 link that targets otel-sdk 1.34.1

Environment
Compiler: Temurin-17.0.5+8
OS: Ubuntu 22.04

@nluk nluk added the Bug Something isn't working label Mar 7, 2024
@jack-berg
Copy link
Member

Prometheus compares sorted snapshot.getMetadata().getPrometheusName() link to find collisions. This means that metric.name and metric_name are both represented as metric_name, and a collision is detected.

I think this getName vs. getPrometheusName is the key bit here. Its an oversight on our part to use getName for the keys of our map. Opening a PR to adjust it to use getPrometheusName everywhere.

But this won't completely solve the problem. Its still possible that my.metric and my_metric have overlapping series. I.e. both might have a series with attributes {key=a}. If this occurs, then the prometheus client will generate an error. I don't think Otel2PrometheusConverter should be responsible for this merging of series. Instead, if a user encounters this, they should use views to rename one of my.metric or my_metric such that the conflict doesn't occur.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants