feat(add-option-to-disable-instrumentation-http-metrics) #5029

Conversation

AkselAllas
Contributor

I am getting errors from instrumentation-http metrics. They are not exporting correctly and I have my own custom http metrics instead. I want to be able to turn them off.

@AkselAllas AkselAllas requested a review from a team as a code owner October 1, 2024 13:14
@AkselAllas AkselAllas marked this pull request as draft October 1, 2024 13:14
@AkselAllas AkselAllas force-pushed the add-option-to-disable-instrumentation-http-metrics branch 2 times, most recently from 929bfe5 to 92d8647 on October 1, 2024 15:28

codecov bot commented Oct 1, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 93.27%. Comparing base (f8ab559) to head (d918ac6).
Report is 28 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #5029      +/-   ##
==========================================
- Coverage   93.39%   93.27%   -0.13%     
==========================================
  Files          46      317     +271     
  Lines         712     8194    +7482     
  Branches      120     1640    +1520     
==========================================
+ Hits          665     7643    +6978     
- Misses         47      551     +504     

see 266 files with indirect coverage changes

@AkselAllas AkselAllas force-pushed the add-option-to-disable-instrumentation-http-metrics branch from 92d8647 to ba4f654 on October 1, 2024 15:36
@AkselAllas AkselAllas force-pushed the add-option-to-disable-instrumentation-http-metrics branch from ba4f654 to d918ac6 on October 1, 2024 15:46
@AkselAllas AkselAllas marked this pull request as ready for review October 2, 2024 05:01
@pichlermarc
Member

@AkselAllas what about adding a view that uses a drop aggregation for all metrics from the @opentelemetry/instrumentation-http meter?
Like so (and then passed to the MeterProvider constructor or NodeSDK constructor)

new View({
  meterName: '@opentelemetry/instrumentation-http',
  aggregation: Aggregation.Drop(),
})
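
For reference, a minimal sketch of how such a View could be wired into the NodeSDK (not taken from this PR; the exporter and reader choices are illustrative, and the option names follow the @opentelemetry/sdk-node and @opentelemetry/sdk-metrics 1.x APIs as I understand them):

import { NodeSDK } from '@opentelemetry/sdk-node';
import { HttpInstrumentation } from '@opentelemetry/instrumentation-http';
import { View, Aggregation, PeriodicExportingMetricReader } from '@opentelemetry/sdk-metrics';
import { OTLPMetricExporter } from '@opentelemetry/exporter-metrics-otlp-http';

// Drop every metric recorded by the instrumentation-http meter,
// leaving metrics from all other meters untouched.
const dropHttpInstrumentationMetrics = new View({
  meterName: '@opentelemetry/instrumentation-http',
  aggregation: Aggregation.Drop(),
});

const sdk = new NodeSDK({
  instrumentations: [new HttpInstrumentation()],
  metricReader: new PeriodicExportingMetricReader({
    exporter: new OTLPMetricExporter(),
  }),
  views: [dropHttpInstrumentationMetrics],
});

sdk.start();

With a view like this in place, HTTP spans are still produced; only the metrics from that meter (e.g. the http client duration histogram seen in the error below) are dropped before export.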

@pichlermarc
Member

@AkselAllas which errors are you getting? This sounds like there must be something wrong with either the SDK or the exporter. 🤔

@AkselAllas
Contributor Author

@pichlermarc I am getting:

rpc error: code = InvalidArgument desc = One or more TimeSeries could not be written: One or more points were written more frequently than the maximum sampling period configured for the metric.: prometheus_target{location:us,instance:b5bbba3b-a99d-4881-b965-b497b6ecab38,cluster:,namespace:,job:example} timeSeries[0-8]: prometheus.googleapis.com/http_client_duration_milliseconds/histogram{http_method:POST,http_flavor:1.1,http_status_code:200,net_peer_port:443,otel_scope_name:@opentelemetry/instrumentation-http,net_peer_name:example,otel_scope_version:0.52.1}
error details: name = Unknown  desc = total_point_count:200  success_point_count:191  errors:{status:{code:9}  point_count:9}

None of my own custom metrics are generating any errors, so I would prefer to not use these metrics. We don't use any of them either.

Using a View is indeed also an option for me. Thanks 🙇

@AkselAllas
Contributor Author

I think instrumentation-http, and the other instrumentations for that matter, are currently missing documentation that they even do this automatic reporting of metrics.

When these metrics are left unused, they are just a cost for the end user.

@pichlermarc
Member

> None of my own custom metrics are generating any errors, so I would prefer to not use these metrics. We don't use any of them either.

I'm sorry for the slow responses, but I'm not too familiar with Prometheus, so reading up on it and tinkering myself takes a lot of time. Ideally I'd need some kind of reproducer that I can run myself to truly dig into the problem. It does look, however, like there should be a way to use these metrics without generating errors like this - disabling them via a view gets rid of the symptom, but there seems to be an underlying problem that needs fixing. Would you mind opening an issue instead? 🤔

In the meantime, having a View to disable them should be roughly as performant as the solution proposed in this PR.

> I think instrumentation-http, and the other instrumentations for that matter, are currently missing documentation that they even do this automatic reporting of metrics.

> When these metrics are left unused, they are just a cost for the end user.

@dyladan is adding some docs about which metrics are exported over at #5026.

@AkselAllas
Contributor Author

AkselAllas commented Oct 4, 2024

@pichlermarc No problem. Thanks for the in-depth response!

I'm 90% sure this error is related to my calling forceFlush at the end of every GCP Cloud Function invocation, which I detailed in this otel-collector issue. At the moment I don't have the time to dig into why only the instrumentation-http metrics fail and to create an issue with a reproducible failure scenario. The reproduced case might only be valid for legacy serverless scenarios anyhow.
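
For context, a rough sketch of the flush-per-invocation pattern described above, assuming a manually constructed MeterProvider from @opentelemetry/sdk-metrics and a generic handler (the handler shape and exporter are illustrative, not from the actual code):

import { metrics } from '@opentelemetry/api';
import { MeterProvider, PeriodicExportingMetricReader } from '@opentelemetry/sdk-metrics';
import { OTLPMetricExporter } from '@opentelemetry/exporter-metrics-otlp-http';

const meterProvider = new MeterProvider({
  readers: [new PeriodicExportingMetricReader({ exporter: new OTLPMetricExporter() })],
});
metrics.setGlobalMeterProvider(meterProvider);

// Illustrative Cloud Function style handler: flush metrics before the
// instance may be frozen, so no data points are lost between invocations.
export async function handler(req: unknown, res: { send(body: string): void }): Promise<void> {
  // ... handle the request, record custom metrics ...

  // Flushing on every invocation can push the same time series more often than
  // the backend's minimum sampling period, which is consistent with the
  // "written more frequently than the maximum sampling period" error above.
  await meterProvider.forceFlush();
  res.send('ok');
}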

I will close this PR; instead, I just recommend that the maintainers document the suggested way of using a View to drop these metrics in the instrumentation-http README (and link an example).

@AkselAllas AkselAllas closed this Oct 4, 2024