-
Notifications
You must be signed in to change notification settings - Fork 166
Metric naming conventions #108
Metric naming conventions #108
Conversation
81dc33c
to
a59a7fd
Compare
Meta question: is the usage of |
I think we will want to include the measures / data types for the metrics prescribed in this document. Can someone take a pass at that? I am not as familiar with the nomenclature and options in this regard 😳 |
Prior work:
|
Before we get to specific names, I'd like to see a meta-OTEP with the guiding principles for instrument naming and measurement labeling, so we can all agree to that before we start digging into individual names. This PR briefly mentions some guiding principles, and then dives into system metrics. Separating those two concerns into separate OTEPs will, I think, make it easier to discuss them in isolation, rather than muddling the discussion of general guidelines with specifics. Thoughts? |
@jkwatson that's a great point re generic conventions vs specific recommendations. I found it very difficult to talk about the generic conventions without examples, which is what brought me down this path. I will give it some thought. Ultimately I think we will want a single document -- or extensive examples in the primary document with an appendix for further details. There's a lot to discuss with both topics, so it would definitely be easier to discuss one thing at a time. |
That's fair. having examples that are outside of system metrics would be helpful, then. Things like http calls, database calls, and rpc calls will help add depth to the discussion, I think. |
@james-bebbington from Google gets the credit for the awesome spreadsheet but we (Splunk) have been working with him on defining the set of host metrics. |
@tedpennings I think we should do what @jkwatson proposed which is to define how the naming structure looks like for metrics and for labels. |
e206fab
to
63443ef
Compare
@bogdandrutu I'm not sure if you were planning to apply the per-signal approvers changes in this repository? Either way, this has enough approvals from the metrics approvers and could be merged. |
Was there any discussion around versioning in names? I suppose this OTEP covers it implicitly since you could prefix everything, like This could be important for any backends that have a strict schema for metrics and their labels (e.g. Google Cloud Monitoring with MetricDescriptors) to prevent breaking changes. |
The hierarchical structure of metrics defines the namespacing. Supporting OpenTelemetry artifacts define the metric structures and hierarchies for some categories of metrics, and these can assist decisions when creating future metrics. | ||
|
||
Common labels SHOULD be consistently named. This aids in discoverability and disambiguates similar labels to metric names. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Common labels SHOULD be consistently named
What does "common labels" mean? Common in what context/scope? A single service? An organization?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was hoping to avoid a debate over naming label keys. For example, you have a label key named "service" and have used it on some metrics, and I have a label key named "service" and have used it on some different metrics. How are we to know that those labels are not the same? The answer would be to add namespacing of labels. I recall the OpenCensus guidelines were to prefix your label names with a DNS prefix that you own. So I might have a lightstep.com/service
label and you might have an uber.com/service
label. For this to create a good user experience, I'd like the DNS prefix to not display by default. What would you like to see, @yurishkuro?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The way I read this guidance is: if we have a label that should be added to many different categories of metric instrument, and that label's semantic meaning is the same across all those categories, its name should be consistent.
The most obvious example I can think of would be status, whose value will be a CanonicalSpanStatus.
As a user, I would find it intuitive when searching my metrics in my UI to always find the success/failure information under the same status
label.
I'm not sure I understand the example service
label. Would it be the name of the service being instrumented? If so, perhaps we would want some semantic conventions around how to apply Resource attributes as metric labels.
If this is the case, I'm not sure if we need to change this line. Is this guidance not clear enough? What wording would make our meaning more clear?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's safe to say we can merge this and debate this topic again as we modify the specification.
@jmacd there are some comments left open, will merge once this is done. @tedpennings please fix all open comments. |
@bogdandrutu will take a look later tonight, thanks! sorry for delay, I had family medical things come up over the past few days. things are better now, and I'm catching up on work now. |
@tedpennings friendly ping :) |
In the metrics SIG meeting on July 9, I was volunteered to help wrap up this PR. I'm just now diving into it, but I'll try to get open questions/issues resolved before Thursday July 16. |
@open-telemetry/technical-committee, It seems that we have broad approval on this OTEP, with five approvals so far. The only outstanding issue is with the specific definition of "common" labels. I'd like to propose that we merge this OTEP and sort out the details in a spec at a later date. |
@open-telemetry/specs-metrics-approvers please merge. |
* Proposal for metric naming conventions * Add Node example metrics * Node.js instead of Node * Rename file, add Prometheus quote * Second round of revisions * Working group feedback * More feedback changes * Minor clarifications * Word choice * Whitespace to check CLA status * Update text/metrics/0108-naming-conventions.md Co-authored-by: Tyler Yahn <[email protected]> * Update text/metrics/0108-naming-conventions.md Co-authored-by: Tyler Yahn <[email protected]> * Update text/metrics/0108-naming-conventions.md Co-authored-by: Tyler Yahn <[email protected]> * Update text/metrics/0108-naming-conventions.md Co-authored-by: Tyler Yahn <[email protected]> * Update text/metrics/0108-naming-conventions.md Co-authored-by: Tyler Yahn <[email protected]> * Update text/metrics/0108-naming-conventions.md Co-authored-by: Tyler Yahn <[email protected]> * Update text/metrics/0108-naming-conventions.md Co-authored-by: Tyler Yahn <[email protected]> * Code review feedback, remove discussion section * Remove some discussion topics, and fix an example * Rename OTEP 108 to metric naming _guidelines_ Co-authored-by: Tyler Yahn <[email protected]> Co-authored-by: Bogdan Drutu <[email protected]> Co-authored-by: jfoote <[email protected]> Co-authored-by: Yuri Shkuro <[email protected]>
* Proposal for metric naming conventions * Add Node example metrics * Node.js instead of Node * Rename file, add Prometheus quote * Second round of revisions * Working group feedback * More feedback changes * Minor clarifications * Word choice * Whitespace to check CLA status * Update text/metrics/0108-naming-conventions.md Co-authored-by: Tyler Yahn <[email protected]> * Update text/metrics/0108-naming-conventions.md Co-authored-by: Tyler Yahn <[email protected]> * Update text/metrics/0108-naming-conventions.md Co-authored-by: Tyler Yahn <[email protected]> * Update text/metrics/0108-naming-conventions.md Co-authored-by: Tyler Yahn <[email protected]> * Update text/metrics/0108-naming-conventions.md Co-authored-by: Tyler Yahn <[email protected]> * Update text/metrics/0108-naming-conventions.md Co-authored-by: Tyler Yahn <[email protected]> * Update text/metrics/0108-naming-conventions.md Co-authored-by: Tyler Yahn <[email protected]> * Code review feedback, remove discussion section * Remove some discussion topics, and fix an example * Rename OTEP 108 to metric naming _guidelines_ Co-authored-by: Tyler Yahn <[email protected]> Co-authored-by: Bogdan Drutu <[email protected]> Co-authored-by: jfoote <[email protected]> Co-authored-by: Yuri Shkuro <[email protected]>
* Proposal for metric naming conventions * Add Node example metrics * Node.js instead of Node * Rename file, add Prometheus quote * Second round of revisions * Working group feedback * More feedback changes * Minor clarifications * Word choice * Whitespace to check CLA status * Update text/metrics/0108-naming-conventions.md Co-authored-by: Tyler Yahn <[email protected]> * Update text/metrics/0108-naming-conventions.md Co-authored-by: Tyler Yahn <[email protected]> * Update text/metrics/0108-naming-conventions.md Co-authored-by: Tyler Yahn <[email protected]> * Update text/metrics/0108-naming-conventions.md Co-authored-by: Tyler Yahn <[email protected]> * Update text/metrics/0108-naming-conventions.md Co-authored-by: Tyler Yahn <[email protected]> * Update text/metrics/0108-naming-conventions.md Co-authored-by: Tyler Yahn <[email protected]> * Update text/metrics/0108-naming-conventions.md Co-authored-by: Tyler Yahn <[email protected]> * Code review feedback, remove discussion section * Remove some discussion topics, and fix an example * Rename OTEP 108 to metric naming _guidelines_ Co-authored-by: Tyler Yahn <[email protected]> Co-authored-by: Bogdan Drutu <[email protected]> Co-authored-by: jfoote <[email protected]> Co-authored-by: Yuri Shkuro <[email protected]>
* Proposal for metric naming conventions * Add Node example metrics * Node.js instead of Node * Rename file, add Prometheus quote * Second round of revisions * Working group feedback * More feedback changes * Minor clarifications * Word choice * Whitespace to check CLA status * Update text/metrics/0108-naming-conventions.md Co-authored-by: Tyler Yahn <[email protected]> * Update text/metrics/0108-naming-conventions.md Co-authored-by: Tyler Yahn <[email protected]> * Update text/metrics/0108-naming-conventions.md Co-authored-by: Tyler Yahn <[email protected]> * Update text/metrics/0108-naming-conventions.md Co-authored-by: Tyler Yahn <[email protected]> * Update text/metrics/0108-naming-conventions.md Co-authored-by: Tyler Yahn <[email protected]> * Update text/metrics/0108-naming-conventions.md Co-authored-by: Tyler Yahn <[email protected]> * Update text/metrics/0108-naming-conventions.md Co-authored-by: Tyler Yahn <[email protected]> * Code review feedback, remove discussion section * Remove some discussion topics, and fix an example * Rename OTEP 108 to metric naming _guidelines_ Co-authored-by: Tyler Yahn <[email protected]> Co-authored-by: Bogdan Drutu <[email protected]> Co-authored-by: jfoote <[email protected]> Co-authored-by: Yuri Shkuro <[email protected]>
This proposal follows the discussion we've been having in the metrics SIG.
There are TODOs in the doc. Many sections require expertise in a specific domain; I'm very eager for discussion and suggestions.
This is written from my perspective as a UI/product engineer who consumes and presents metric data. I think is a bit different from the perspective of an instrumentation engineer or data platform engineer. Feedback extra extra welcome from y'all in those roles, and submitted with the caveat that there will be impedances in this regard.