OkHttp instrumentation generates metrics with very high cardinality #9972
Comments
This is definitely not expected. Can you try using the latest OkHttp library instrumentation directly to confirm the issue?
For sure, I can provide a minimal reproduction case with the latest OkHttp library instrumentation. I cannot test my real project against the latest OkHttp instrumentation, because I noticed that the Trino JDBC driver, which is the one setting up the OkHttp instrumentation, shades the OTEL dependency.
Hey @gaeljw, what version of the OpenTelemetry SDK are you using? I think there might be a version mismatch between the two, and as a result the metrics advice might not be applied.
@mateuszrzeszutek I'm using OTEL SDK 1.32.0 and, from what I can see, the dependency I'm using shades OTEL (API) 1.31.0. I can try OTEL SDK 1.31.0 to see if the problem still occurs, if that's what you're suggesting? I'll do it tomorrow or the day after.
I was not able to reproduce the issue in a minimal case with only OTEL 1.32.0 and OkHttp, nor in another case where I shaded OTEL API 1.31.0 in one module and used it from another module with OTEL SDK 1.32.0 (to be closer to the real case I'm experiencing). I'll investigate further!
@mateuszrzeszutek you mentioned a "metrics advice". Where is it defined, and what causes it to be applied?
Using SDK 1.31.0, the same version as what Trino is shading. Still the same issue. Still trying to come up with a minimal reproduction repository... :)
I was able to come up with a minimal reproduction case. It does not use the OkHttp instrumentation directly, but the Trino library, which itself uses the OkHttp instrumentation. https://github.com/gaeljw/otel9972 I hope it will help. To reproduce:
You'll see a bunch of metrics having the issue. For the record, here are the top lines I get in my case:
You can see the

(I'll cross-post the reproduction scenario to trinodb/trino#19958)

If that helps, this is the Trino code that sets up the instrumentation: https://github.com/trinodb/trino/blob/aa431c77a6a187920f5d6433532a19647d901742/client/trino-jdbc/src/main/java/io/trino/jdbc/NonRegisteringTrinoDriver.java#L69
This is a Trino issue. Trino does not shade Line 24 in afe10b5
Thanks for the investigation @laurit. Sorry to bother you but two things are still unclear to me:
It is applied in Lines 23 to 37 in afe10b5
note that in the release version this code is slightly different
Instrumentation sets all the attributes. Advice determines which of these attributes are actually added to the metric by default. Advice can be overridden by configuring a metrics view. This allows adding attributes that are not included in the advice to the metric (or dropping attributes included in the advice).
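To make the relationship between instrumentation, advice, and views concrete, here is a minimal sketch in plain Java. It is a simplified model of the behavior described above, not the OpenTelemetry SDK's implementation: the class `AttributeAdviceSketch`, its methods, and the attribute names and values are all hypothetical and chosen only for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Simplified model only; NOT the OpenTelemetry SDK implementation.
class AttributeAdviceSketch {

    // Keep only the attributes whose keys appear in keysToRetain.
    static Map<String, String> retain(Map<String, String> attributes, Set<String> keysToRetain) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : attributes.entrySet()) {
            if (keysToRetain.contains(entry.getKey())) {
                result.put(entry.getKey(), entry.getValue());
            }
        }
        return result;
    }

    // A configured view (viewKeys != null) overrides the advice.
    // With neither a view nor advice, every attribute set by the
    // instrumentation ends up on the metric.
    static Map<String, String> recordedAttributes(
            Map<String, String> attributes, Set<String> adviceKeys, Set<String> viewKeys) {
        if (viewKeys != null) {
            return retain(attributes, viewKeys);
        }
        if (adviceKeys != null) {
            return retain(attributes, adviceKeys);
        }
        return new LinkedHashMap<>(attributes);
    }

    public static void main(String[] args) {
        // Hypothetical attributes set by the instrumentation for one request.
        Map<String, String> all = new LinkedHashMap<>();
        all.put("http.method", "GET");
        all.put("http.status_code", "200");
        all.put("http.url", "https://example.com/v1/statement/123");
        all.put("http.response_content_length", "4821");

        Set<String> advice = Set.of("http.method", "http.status_code");

        // Advice applied: only the low-cardinality attributes remain.
        System.out.println(recordedAttributes(all, advice, null));
        // View configured: it replaces the advice, here adding http.url back.
        System.out.println(recordedAttributes(all, advice, Set.of("http.method", "http.url")));
        // Advice not applied and no view: everything is recorded.
        System.out.println(recordedAttributes(all, null, null));
    }
}
```

The last call models the situation suspected in this thread, where the advice is not applied and all attributes reach the metric. In the Java SDK itself, a view is registered on the meter provider builder via `registerView(InstrumentSelector, View)`, and `View.builder().setAttributeFilter(...)` restricts the retained attribute keys; check the documentation for the SDK version in use before relying on this.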
Oh okay, I got it I think :) There is a "standard definition" of Http Client metrics and Trino instrumentation is implementing it. This is from Line 58 in afe10b5
And you are saying that it still makes sense that the

Now, back to the Trino issue: I may try to provide a fix in their implementation, but I guess it won't be that easy because they'd want to shade everything except the

It is not clear to me yet how a shaded implementation can live together with the official implementation 🤔 I wonder if it even makes sense to shade OTEL in the first place given their usage. Maybe they should just mark the dependency as

If you have any idea or opinion on this, I'd love to hear it. Anyway, thanks a lot for your time.
I believe
I'm sure they are reasonable people and understand that things can't be shaded when it breaks. I'd try removing https://github.com/trinodb/trino/blob/aa431c77a6a187920f5d6433532a19647d901742/client/trino-jdbc/pom.xml#L424-L427 and adding a dependency on https://central.sonatype.com/artifact/io.opentelemetry/opentelemetry-extension-incubator the same way as they have added https://github.com/trinodb/trino/blob/aa431c77a6a187920f5d6433532a19647d901742/client/trino-jdbc/pom.xml#L86-L90 (note the provided scope).
Describe the bug
OkHttp instrumentation generates http_client_duration metrics with http_url and http_response_content_length labels. By nature, these labels are almost unique, so each request generates a unique set of labels for the metric; in other words, these labels have very high cardinality.
This causes issues for storage.
Example of such metrics (Prometheus format):
As these are histograms, there is also a bunch of other metrics: http_client_duration_milliseconds_bucket.
I don't think these labels make much sense in the first place. Is this expected?
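To illustrate why a near-unique label such as http_url is a problem, the following sketch counts how many distinct label sets (and therefore time series) a batch of requests produces with and without the URL label. It is a plain Java model with hypothetical names (`CardinalitySketch`, `sampleRequests`, `seriesCount`) and made-up URLs; it does not use OkHttp or OpenTelemetry.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative model only; no OkHttp or OpenTelemetry involved.
class CardinalitySketch {

    // Build n hypothetical requests: {method, status, url}, each with a unique URL.
    static List<String[]> sampleRequests(int n) {
        List<String[]> requests = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            requests.add(new String[] {"GET", "200", "https://example.com/v1/statement/" + i});
        }
        return requests;
    }

    // Count distinct label sets, optionally including the URL as a label.
    static int seriesCount(List<String[]> requests, boolean includeUrl) {
        Set<String> series = new HashSet<>();
        for (String[] request : requests) {
            String labels = request[0] + "|" + request[1];
            if (includeUrl) {
                labels += "|" + request[2];
            }
            series.add(labels);
        }
        return series.size();
    }

    public static void main(String[] args) {
        List<String[]> requests = sampleRequests(1000);
        System.out.println("series with http_url:    " + seriesCount(requests, true));
        System.out.println("series without http_url: " + seriesCount(requests, false));
    }
}
```

With the URL label, the number of series grows with the number of distinct URLs; without it, the series are bounded by the combinations of method and status. For a histogram, each of those label sets is further multiplied by the number of buckets.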
Related issue in a library using OkHttp instrumentation explicitly: trinodb/trino#19958
Steps to reproduce
N/A
(I can look to provide a reproduction project if needed but don't think it's necessary).
Expected behavior
No such labels. Low cardinality of this metric.
Actual behavior
See above.
Javaagent or library instrumentation version
1.32.0 (also observed in 1.29.0)
Environment
JDK: 11
OS: Linux
Additional context
No response