Standard system metrics and semantic conventions #119
LGTM. It may also be worth defining the data type (Int64 or Double)
Julien from New Relic - I work on our infrastructure product and I have a couple of comments / questions. Sorry if they are obvious, I'm still getting up to speed with OTEL!
This looks great to me. I especially like "usage" and "utilization" as standard names.
Definitely good enough to be approved as an OTEP and move on to the spec itself.
👍
I believe this PR is ready to be merged, but when writing this up for the specs repo, it would be good to add a convention for process counts (with "state" = running / inactive).
|----------------------|-------|-----------------|----------|---------|-----------------------------------|
|system.cpu.time       |seconds|SumObserver      |Double    |state    |idle, user, system, interrupt, etc.|
|                      |       |                 |          |cpu      |1 - #cores                         |
|system.cpu.utilization|1      |UpDownSumObserver|Double    |state    |idle, user, system, interrupt, etc.|
s/UpDownSumObserver/ValueObserver
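One way to read this suggestion: `system.cpu.time` is a cumulative total that only grows, while `system.cpu.utilization` is a point-in-time fraction that is not meaningfully summed, so it behaves like an observed value rather than a sum. The sketch below is only an illustration of that relationship, not anything from the OTEP itself. The helper name `cpu_utilization` and the sample numbers are hypothetical, and it assumes two snapshots of per-state CPU time in seconds.

```python
from typing import Dict


def cpu_utilization(prev: Dict[str, float], curr: Dict[str, float]) -> Dict[str, float]:
    """Derive per-state utilization (a fraction in [0, 1]) from two
    cumulative CPU-time snapshots, both keyed by the "state" label.

    Hypothetical helper for illustration; not part of any OpenTelemetry API.
    """
    # Time spent in each state between the two snapshots.
    deltas = {state: curr[state] - prev[state] for state in curr}
    total = sum(deltas.values())
    if total <= 0:
        # No time elapsed between snapshots: report zero for every state.
        return {state: 0.0 for state in curr}
    # Each state's share of the elapsed CPU time.
    return {state: delta / total for state, delta in deltas.items()}


# Made-up cumulative samples (seconds), as system.cpu.time might report them.
prev = {"idle": 100.0, "user": 20.0, "system": 10.0}
curr = {"idle": 106.0, "user": 23.0, "system": 11.0}
util = cpu_utilization(prev, curr)
print(util)
```

The cumulative inputs can be added across CPUs or subtracted across time, whereas the derived fractions are only meaningful as they stand at each observation.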
Conventions from [OTEP 119](open-telemetry/oteps#119)
* System metrics semantic conventions

  Conventions from [OTEP 119](open-telemetry/oteps#119)

* change process count to UpDownSumObserver
* fix system.cpu.utilization, use better example
* first several comments
* add description columns, update units to UCUM
* markdown-toc
* clarify OS process level metrics
* clarify load average example
* move general conventions + OTEP 108 into README.md
* renamed swap -> paging
* add additional fs labels
* fix links
* fix link
* Update specification/metrics/semantic_conventions/README.md
  Co-authored-by: Tigran Najaryan <[email protected]>
* Update specification/metrics/semantic_conventions/README.md
  Co-authored-by: Tigran Najaryan <[email protected]>
* Apply suggestions from code review
  Co-authored-by: Tigran Najaryan <[email protected]>
* fix tigran comments
* add disk io_time and operation_time
* add descriptions/footnotes for dropped packets and net errors
* lint, more info for net dropped packets/errors
* "dropped_packets" -> "dropped"
* Apply suggestions from James' code review
  Co-authored-by: James Bebbington <[email protected]>
* comments from James' code review
* clarify windows perf counter
* Update specification/metrics/semantic_conventions/README.md
  Co-authored-by: Joshua MacDonald <[email protected]>
* reflow text

Co-authored-by: Tigran Najaryan <[email protected]>
Co-authored-by: James Bebbington <[email protected]>
Co-authored-by: Joshua MacDonald <[email protected]>
* standard system and runtime metric names
* added more conventions and tables
* formatting
* cleanup writing/grammar
* Made tables shorter, cleaned up, added runtime overview
* more small fixes
* Tweaks and moved "Open Questions" to the end
* added PR number to filename
* lint
* Update tables, add runtime examples, from review
* More edits addressing review comments
  - Clarify these are metric instrument names (not "metrics")
  - Remove discussion points I left inline
  - Add unresolved comments from review to open questions
* add open question on versioning
* removed open question about versioning
* unabbreviate "net" and "ops"

Co-authored-by: Bogdan Drutu <[email protected]>
See open-telemetry/opentelemetry-specification#651. This OTEP proposes some standard system metric names as well as semantic conventions for naming system/runtime metrics. This mostly follows the work done in #108 and the Collector. I left a few TODOs and open questions, the biggest things being standard runtime metrics and process metrics.