
Track dropped spans and logs due to full buffer #2357

Merged

Conversation

@scottgerring (Contributor) commented Nov 27, 2024

Fixes #2273

Design discussion issue (if applicable) #

Changes

As suggested in #2273, this change adds a counter and a metric for dropped spans and logs; the counter is used to log the total drop count at exporter shutdown, and the metric is used to export this information.
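The pattern described above can be sketched roughly as follows. This is an illustrative sketch only: `Processor`, `on_queue_full`, and the `eprintln!` call stand in for the SDK's batch processors and its `otel_warn!` logging; none of these names are the actual SDK API.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Illustrative stand-in for the SDK's batch span/log processors.
struct Processor {
    dropped_count: AtomicUsize,
    max_queue_size: usize,
}

impl Processor {
    fn new(max_queue_size: usize) -> Self {
        Processor {
            dropped_count: AtomicUsize::new(0),
            max_queue_size,
        }
    }

    // Called on the hot path: when the bounded queue is full the item is
    // dropped and the counter is incremented. Relaxed ordering suffices
    // because the value is only read for reporting, not for coordination.
    fn on_queue_full(&self) {
        self.dropped_count.fetch_add(1, Ordering::Relaxed);
    }

    // At shutdown, emit one summary line instead of logging every drop.
    fn shutdown(&self) -> usize {
        let dropped = self.dropped_count.load(Ordering::Relaxed);
        if dropped > 0 {
            eprintln!(
                "{} items dropped due to full buffer (max queue size: {})",
                dropped, self.max_queue_size
            );
        }
        dropped
    }
}
```

The point of the design is that the hot path pays only one relaxed atomic increment per drop, while the noisy logging happens exactly once, at shutdown.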

Merge requirement checklist

  • CONTRIBUTING guidelines followed
  • Unit tests added/updated (if applicable)
  • Appropriate CHANGELOG.md files updated for non-trivial, user-facing changes
  • Changes in public API reviewed (if applicable)

@scottgerring scottgerring requested a review from a team as a code owner November 27, 2024 09:45

linux-foundation-easycla bot commented Nov 27, 2024

CLA Signed

The committers listed above are authorized under a signed CLA.


codecov bot commented Nov 27, 2024

Codecov Report

Attention: Patch coverage is 71.79487% with 11 lines in your changes missing coverage. Please review.

Project coverage is 79.5%. Comparing base (195dea8) to head (eb1f83f).
Report is 1 commit behind head on main.

Files with missing lines Patch % Lines
opentelemetry-sdk/src/trace/span_processor.rs 65.0% 7 Missing ⚠️
opentelemetry-sdk/src/logs/log_processor.rs 78.9% 4 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##            main   #2357     +/-   ##
=======================================
- Coverage   79.5%   79.5%   -0.1%     
=======================================
  Files        123     123             
  Lines      21448   21482     +34     
=======================================
+ Hits       17061   17087     +26     
- Misses      4387    4395      +8     


@cijothomas (Member) left a comment


Thanks for working on this! Left my comments in the PR.
OTel internal metrics is something we can add, but not right away:
https://github.com/open-telemetry/opentelemetry-rust/pull/2357/files#r1860843464

dropped_logs_count: AtomicUsize,

// Track the maximum queue size that was configured for this processor
max_queue_size: usize,
Member


odd that we have to store this here just for logging purposes, but not an issue!

@lalitb (Member) commented Nov 27, 2024


I believe we can avoid this by moving the logging of dropped logs (otel_warn!) from the shutdown() method to the worker's Shutdown message processing. Also, dropped_logs_count to be shared with the shutdown worker and the processor object. Haven't tried, but if it seems to be complex, we can park it for separate PR.
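The suggestion above can be sketched as follows, under the assumption that the processor and its background worker share the counter via `Arc` and communicate over a channel. The `Message` enum, `spawn_worker`, and the `eprintln!` reporting are hypothetical names for illustration, not the SDK's actual types.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{mpsc, Arc};
use std::thread;

// Hypothetical control message; the real worker handles more variants.
enum Message {
    Shutdown,
}

// The worker owns a clone of the shared counter, so it can report the
// total itself when it handles the Shutdown message, instead of the
// processor's shutdown() method doing the logging.
fn spawn_worker(
    rx: mpsc::Receiver<Message>,
    dropped: Arc<AtomicUsize>,
) -> thread::JoinHandle<usize> {
    thread::spawn(move || loop {
        match rx.recv() {
            Ok(Message::Shutdown) | Err(_) => {
                let total = dropped.load(Ordering::Relaxed);
                if total > 0 {
                    // Stand-in for otel_warn! in the real code.
                    eprintln!("worker: {total} records dropped");
                }
                return total;
            }
        }
    })
}
```

The channel send/receive provides the ordering guarantee: any increments made before `Shutdown` is sent are visible to the worker's final load.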

@cijothomas (Member) left a comment


Thanks!
Left some suggestions for improved log messages.

@scottgerring (Contributor, Author) commented Nov 27, 2024

Changes applied - should be good to go once the builds are done.
And - thanks for the quick review!

@cijothomas (Member)

@lalitb @utpilla Could you take a look? If I get one more approval, I plan to include this for today's release (0.27.1)

Co-authored-by: Utkarsh Umesan Pillai <[email protected]>
@utpilla (Contributor) left a comment


Left a suggestion to keep the message content consistent across both batch processors.

@lalitb (Member) commented Nov 27, 2024

Would be good to include a unit test to validate that the dropped logs count is correctly reflected during shutdown. Since the field is private, we could consider adding a test-only accessor method to retrieve it. That said, if this is time-sensitive and needs to be included in today's release, can be done separately.
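The test-only accessor idea could look roughly like this. The type and method names are illustrative, not the SDK's actual API; in the real code the accessor would be `#[cfg(test)]`-gated so it never enters the public API, but it is left ungated here so the sketch compiles standalone.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Illustrative stand-in for the SDK's batch log processor.
struct BatchLogProcessor {
    dropped_logs_count: AtomicUsize,
}

impl BatchLogProcessor {
    // In real code: #[cfg(test)] so the accessor exists only for tests.
    fn dropped_logs(&self) -> usize {
        self.dropped_logs_count.load(Ordering::Relaxed)
    }
}
```

A unit test would then drive the queue to overflow, call shutdown, and assert on `dropped_logs()`.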

@cijothomas (Member)

> Would be good to include a unit test to validate that the dropped logs count is correctly reflected during shutdown. Since the field is private, we could consider adding a test-only accessor method to retrieve it. That said, if this is time-sensitive and needs to be included in today's release, can be done separately.

I think we have orchestrated such a test in OTel .NET, so we can copy some ideas.

@cijothomas cijothomas changed the title chore: Track dropped spans and logs due to full buffer Track dropped spans and logs due to full buffer Nov 27, 2024
@cijothomas (Member)

@scottgerring I edited the PR Title ("chore: Track dropped spans and logs due to full buffer"), to remove "chore" from it.

@cijothomas cijothomas merged commit cbe9ebe into open-telemetry:main Nov 27, 2024
20 of 23 checks passed
pitoniak32 pushed a commit to pitoniak32/opentelemetry-rust that referenced this pull request Dec 4, 2024
Co-authored-by: Cijo Thomas <[email protected]>
Co-authored-by: Utkarsh Umesan Pillai <[email protected]>
Co-authored-by: Cijo Thomas <[email protected]>
Development

Successfully merging this pull request may close these issues.

Internal logging in BatchProcessor when buffer full
4 participants