LLM common metrics for Generative AI #955

Merged: 34 commits, May 28, 2024
da4fb55
Initial LLM metrics
drewby Apr 24, 2024
1169756
Add link references
drewby Apr 24, 2024
383fa1f
Add LLM Metrics to README
drewby Apr 24, 2024
c4308be
Add changelog
drewby Apr 24, 2024
1239fbd
Fix yamllint error on chloggen
drewby Apr 24, 2024
9565f2c
Update reference to LLM
drewby Apr 24, 2024
b579374
Change metric name to match semconv
drewby Apr 25, 2024
c938263
Add gen_ai.system
drewby Apr 25, 2024
d5b59dc
Updates for review comments
drewby Apr 25, 2024
2b942db
Rename/scope LLM to Gen AI metrics
drewby Apr 26, 2024
5335175
Remove trailing spaces
drewby Apr 26, 2024
4f415b3
Update operation examples.
drewby Apr 30, 2024
f3e6586
Replace pluralized tokens with token
drewby Apr 30, 2024
fd01e65
Update table of contents
drewby May 5, 2024
b1ccbd6
Update token type
drewby May 5, 2024
de89866
Update requirement levels
drewby May 7, 2024
cfd8e86
Override error.type note
drewby May 7, 2024
9ac4406
Allow custom values to true
drewby May 7, 2024
b2828f8
Add ExplicitBucketBoundaries
drewby May 7, 2024
979f732
Make token metric recommended
drewby May 7, 2024
84d78eb
Remove trailing space
drewby May 7, 2024
72cc2c9
Fix recommended label.
drewby May 8, 2024
04c6fb3
Update metrics to be for 'client'
drewby May 8, 2024
7db4c52
Update title
drewby May 8, 2024
d9ab4d8
Update registry table
drewby May 8, 2024
58b10b3
Move error.type from common to duration metric.
drewby May 15, 2024
b361e4a
Add clarifation on used vs billed tokens.
drewby May 15, 2024
aa02859
Regenerate tables
drewby May 22, 2024
b351811
Regenerate tables
drewby May 23, 2024
b90fdc5
Merge branch 'main' into drewby/llm_metrics
drewby May 23, 2024
e48e635
Merge branch 'open-telemetry:main' into drewby/llm_metrics
drewby May 24, 2024
2cd0b90
Remove unnecessary elements
drewby May 25, 2024
fa99804
Update description for error.type
drewby May 25, 2024
a405081
Merge branch 'main' into drewby/llm_metrics
lmolkova May 28, 2024
Move error.type from common to duration metric.
drewby committed May 23, 2024
commit 58b10b309f148480e586d53eb687687fe5204ec4
21 changes: 4 additions & 17 deletions docs/gen-ai/gen-ai-metrics.md
@@ -52,20 +52,13 @@ This metric SHOULD be specified with [ExplicitBucketBoundaries] of [1, 4, 16, 64
 | [`gen_ai.request.model`](/docs/attributes-registry/gen-ai.md) | string | The name of the LLM a request is being made to. | `gpt-4` | `Required` | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
 | [`gen_ai.system`](/docs/attributes-registry/gen-ai.md) | string | The name of the LLM foundation model vendor. | `openai` | `Required` | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
 | [`gen_ai.token.type`](/docs/attributes-registry/gen-ai.md) | string | The type of token being counted. | `input`; `output` | `Required` | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
-| [`error.type`](/docs/attributes-registry/error.md) | string | Describes a class of error the operation ended with. [1] | `timeout`; `java.net.UnknownHostException`; `server_certificate_invalid`; `500` | `Conditionally Required` if the operation ended in an error | ![Stable](https://img.shields.io/badge/-stable-lightgreen) |
-| [`server.port`](/docs/attributes-registry/server.md) | int | Server port number. [2] | `80`; `8080`; `443` | `Conditionally Required` If `sever.address` is set. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) |
+| [`server.port`](/docs/attributes-registry/server.md) | int | Server port number. [1] | `80`; `8080`; `443` | `Conditionally Required` If `sever.address` is set. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) |
 | [`gen_ai.response.model`](/docs/attributes-registry/gen-ai.md) | string | The name of the LLM a response was generated from. | `gpt-4-0613` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
-| [`server.address`](/docs/attributes-registry/server.md) | string | Server domain name if available without reverse DNS lookup; otherwise, IP address or Unix domain socket name. [3] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) |
+| [`server.address`](/docs/attributes-registry/server.md) | string | Server domain name if available without reverse DNS lookup; otherwise, IP address or Unix domain socket name. [2] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) |

-**[1]:** The cardinality of `error.type` SHOULD be low.
+**[1]:** When observed from the client side, and when communicating through an intermediary, `server.port` SHOULD represent the server port behind any intermediaries, for example proxies, if it's available.

-When working across multiple models, it is RECOMMENDED to use a common set of error types.
-
-Additional details may be captured in domain-specific attributes.
-
-**[2]:** When observed from the client side, and when communicating through an intermediary, `server.port` SHOULD represent the server port behind any intermediaries, for example proxies, if it's available.
-
-**[3]:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available.
+**[2]:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available.

 `gen_ai.system` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.

@@ -79,12 +72,6 @@ Additional details may be captured in domain-specific attributes.
 |---|---|---|
 | `input` | Input tokens (prompt, input, etc.) | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
 | `output` | Output tokens (completion, response, etc.) | ![Experimental](https://img.shields.io/badge/-experimental-blue) |
-
-`error.type` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.
-
-| Value | Description | Stability |
-|---|---|---|
-| `_OTHER` | A fallback error value to be used when the instrumentation doesn't define a custom value. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) |
 <!-- endsemconv -->

 ### Metric: `gen_ai.client.operation.duration`
19 changes: 10 additions & 9 deletions model/metrics/gen-ai.yaml
@@ -3,15 +3,6 @@ groups:
     type: attribute_group
     brief: 'This group describes GenAI metrics attributes'
     attributes:
-      - ref: error.type
-        requirement_level:
-          conditionally_required: "if the operation ended in an error"
-        note: |
-          The cardinality of `error.type` SHOULD be low.
-
-          When working across multiple models, it is RECOMMENDED to use a common set of error types.
-
-          Additional details may be captured in domain-specific attributes.
       - ref: server.address
         requirement_level: recommended
       - ref: server.port
@@ -44,3 +35,13 @@ groups:
     unit: "s"
     stability: experimental
     extends: metric_attributes.gen_ai
+    attributes:
+      - ref: error.type
+        requirement_level:
+          conditionally_required: "if the operation ended in an error"
+        note: |
+          The cardinality of `error.type` SHOULD be low.
+
+          When working across multiple models, it is RECOMMENDED to use a common set of error types.
+
+          Additional details may be captured in domain-specific attributes.
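
For reviewers who want to see what this change might mean for an instrumentation, here is a minimal Python sketch of building attributes for `gen_ai.client.operation.duration`. It is not part of this PR and does not use any OpenTelemetry SDK. The function names, the exception-to-string mapping, and the `connection_error` value are assumptions made for illustration only. The attribute keys, the `timeout` example value, and the `_OTHER` fallback are the ones that appear in the diff above.

```python
from typing import Dict, Optional, Union

AttributeValue = Union[str, int]

# Hypothetical mapping to a small, shared set of error types so that
# `error.type` stays low-cardinality across models. Not defined by the spec.
_ERROR_TYPES = {
    TimeoutError: "timeout",
    ConnectionError: "connection_error",
}


def classify_error(exc: BaseException) -> str:
    """Map an exception to a low-cardinality `error.type` value."""
    for exc_class, error_type in _ERROR_TYPES.items():
        if isinstance(exc, exc_class):
            return error_type
    # Fallback value listed in the diff for instrumentations without a custom value.
    return "_OTHER"


def duration_attributes(
    system: str,
    request_model: str,
    error: Optional[BaseException] = None,
    server_address: Optional[str] = None,
    server_port: Optional[int] = None,
) -> Dict[str, AttributeValue]:
    """Build attributes for one duration measurement (illustrative helper)."""
    attrs: Dict[str, AttributeValue] = {
        "gen_ai.system": system,
        "gen_ai.request.model": request_model,
    }
    if server_address is not None:
        attrs["server.address"] = server_address
        # In this sketch, `server.port` is only emitted alongside `server.address`.
        if server_port is not None:
            attrs["server.port"] = server_port
    # `error.type` is conditionally required: set it only if the operation failed.
    if error is not None:
        attrs["error.type"] = classify_error(error)
    return attrs
```

A successful call would carry no `error.type` key at all, while a failed one would carry a value from the shared set or `_OTHER`. Details such as the provider's raw error message would go into separate, domain-specific attributes rather than into `error.type`.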