From 268beb4ca3605e7f74dce655f1666dde724b72d1 Mon Sep 17 00:00:00 2001 From: Liudmila Molkova Date: Tue, 9 Jul 2024 13:36:02 -0700 Subject: [PATCH 1/6] HTTP span status: use SHOULD instead of MUST for errors (#1167) Co-authored-by: Joao Grassi <5938087+joaopgrassi@users.noreply.github.com> --- .chloggen/1167.yaml | 4 ++++ docs/http/http-spans.md | 20 ++++++++++++++++---- 2 files changed, 20 insertions(+), 4 deletions(-) create mode 100644 .chloggen/1167.yaml diff --git a/.chloggen/1167.yaml b/.chloggen/1167.yaml new file mode 100644 index 0000000000..cc9655329e --- /dev/null +++ b/.chloggen/1167.yaml @@ -0,0 +1,4 @@ +change_type: bug_fix +component: http +note: "Relax requirement on when to set HTTP span status to Error from `MUST` to `SHOULD`." +issues: [1167, 1003] diff --git a/docs/http/http-spans.md b/docs/http/http-spans.md index 180d595d41..a22e33506b 100644 --- a/docs/http/http-spans.md +++ b/docs/http/http-spans.md @@ -93,19 +93,31 @@ Instrumentation MUST NOT default to using URI path as a `{target}`. the response body; or 3xx codes with max redirects exceeded), in which case status MUST be set to `Error`. +> **Note:** +> +> The classification of an HTTP status code as an error depends on the context. +> For example, a 404 "Not Found" status code indicates an error if the application +> expected the resource to be available. However, it is not an error when the +> application is simply checking whether the resource exists. +> +> Instrumentations that have additional context about a specific request MAY use +> this context to set the span status more precisely. +> Instrumentations that don't have any additional context MUST follow the +> guidelines in this section. + For HTTP status codes in the 4xx range span status MUST be left unset in case of `SpanKind.SERVER` -and MUST be set to `Error` in case of `SpanKind.CLIENT`. +and SHOULD be set to `Error` in case of `SpanKind.CLIENT`. For HTTP status codes in the 5xx range, as well as any other code the client -failed to interpret, span status MUST be set to `Error`. +failed to interpret, span status SHOULD be set to `Error`. Don't set the span status description if the reason can be inferred from `http.response.status_code`. HTTP request may fail if it was cancelled or an error occurred preventing the client or server from sending/receiving the request/response fully. -When instrumentation detects such errors it MUST set span status to `Error` -and MUST set the `error.type` attribute. +When instrumentation detects such errors it SHOULD set span status to `Error` +and SHOULD set the `error.type` attribute. ## HTTP client From 92bd20553f969466cdc34472ce605b273dcbe017 Mon Sep 17 00:00:00 2001 From: Liudmila Molkova Date: Tue, 9 Jul 2024 17:33:49 -0700 Subject: [PATCH 2/6] Convert `gen_ai.operation.name` to enum and use it on spans (#1202) --- .chloggen/1202.yaml | 4 + docs/attributes-registry/gen-ai.md | 28 ++++- docs/gen-ai/gen-ai-metrics.md | 173 +++++++++++++++++++++-------- docs/gen-ai/gen-ai-spans.md | 40 +++++-- model/registry/gen-ai.yaml | 21 +++- model/trace/gen-ai.yaml | 2 + 6 files changed, 206 insertions(+), 62 deletions(-) create mode 100644 .chloggen/1202.yaml diff --git a/.chloggen/1202.yaml b/.chloggen/1202.yaml new file mode 100644 index 0000000000..32e4265029 --- /dev/null +++ b/.chloggen/1202.yaml @@ -0,0 +1,4 @@ +change_type: enhancement +component: gen_ai +note: Convert `gen_ai.operation.name` to enum and use it on spans +issues: [ 1202 ] diff --git a/docs/attributes-registry/gen-ai.md b/docs/attributes-registry/gen-ai.md index 753b0c35e1..a9c090a32b 100644 --- a/docs/attributes-registry/gen-ai.md +++ b/docs/attributes-registry/gen-ai.md @@ -16,8 +16,8 @@ This document defines the attributes used to describe telemetry in the context o | Attribute | Type | Description | Examples | Stability | | ---------------------------------- | -------- | ------------------------------------------------------------------------------------------------ | ----------------------------------------------------------------------- | ---------------------------------------------------------------- | | `gen_ai.completion` | string | The full response received from the GenAI model. [1] | `[{'role': 'assistant', 'content': 'The capital of France is Paris.'}]` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| `gen_ai.operation.name` | string | The name of the operation being performed. | `chat`; `completion` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| `gen_ai.prompt` | string | The full prompt sent to the GenAI model. [2] | `[{'role': 'user', 'content': 'What is the capital of France?'}]` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `gen_ai.operation.name` | string | The name of the operation being performed. [2] | `chat`; `text_completion` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `gen_ai.prompt` | string | The full prompt sent to the GenAI model. [3] | `[{'role': 'user', 'content': 'What is the capital of France?'}]` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | `gen_ai.request.frequency_penalty` | double | The frequency penalty setting for the GenAI request. | `0.1` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | `gen_ai.request.max_tokens` | int | The maximum number of tokens the model generates for a request. | `100` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | `gen_ai.request.model` | string | The name of the GenAI model a request is being made to. | `gpt-4` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | @@ -29,17 +29,33 @@ This document defines the attributes used to describe telemetry in the context o | `gen_ai.response.finish_reasons` | string[] | Array of reasons the model stopped generating tokens, corresponding to each generation received. | `["stop"]` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | `gen_ai.response.id` | string | The unique identifier for the completion. | `chatcmpl-123` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | `gen_ai.response.model` | string | The name of the model that generated the response. | `gpt-4-0613` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| `gen_ai.system` | string | The Generative AI product as identified by the client or server instrumentation. [3] | `openai` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `gen_ai.system` | string | The Generative AI product as identified by the client or server instrumentation. [4] | `openai` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | `gen_ai.token.type` | string | The type of token being counted. | `input`; `output` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | `gen_ai.usage.input_tokens` | int | The number of tokens used in the GenAI input (prompt). | `100` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | `gen_ai.usage.output_tokens` | int | The number of tokens used in the GenAI response (completion). | `180` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | **[1]:** It's RECOMMENDED to format completions as JSON string matching [OpenAI messages format](https://platform.openai.com/docs/guides/text-generation) -**[2]:** It's RECOMMENDED to format prompts as JSON string matching [OpenAI messages format](https://platform.openai.com/docs/guides/text-generation) +**[2]:** If one of the predefined values applies, but specific system uses a different name it's RECOMMENDED to document it in the semantic conventions for specific GenAI system and use system-specific name in the instrumentation. If a different name is not documented, instrumentation libraries SHOULD use applicable predefined value. -**[3]:** The actual GenAI product may differ from the one identified by the client. For example, when using OpenAI client libraries to communicate with Mistral, the `gen_ai.system` is set to `openai` based on the instrumentation's best knowledge. -For custom model, a custom friendly name SHOULD be used. If none of these options apply, the `gen_ai.system` SHOULD be set to `_OTHER`. +**[3]:** It's RECOMMENDED to format prompts as JSON string matching [OpenAI messages format](https://platform.openai.com/docs/guides/text-generation) + +**[4]:** The `gen_ai.system` describes a family of GenAI models with specific model identified +by `gen_ai.request.model` and `gen_ai.response.model` attributes. + +The actual GenAI product may differ from the one identified by the client. +For example, when using OpenAI client libraries to communicate with Mistral, the `gen_ai.system` +is set to `openai` based on the instrumentation's best knowledge. + +For custom model, a custom friendly name SHOULD be used. +If none of these options apply, the `gen_ai.system` SHOULD be set to `_OTHER`. + +`gen_ai.operation.name` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used. + +| Value | Description | Stability | +| ----------------- | -------------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------- | +| `chat` | Chat completion operation such as [OpenAI Chat API](https://platform.openai.com/docs/api-reference/chat) | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `text_completion` | Text completions operation such as [OpenAI Completions API (Legacy)](https://platform.openai.com/docs/api-reference/completions) | ![Experimental](https://img.shields.io/badge/-experimental-blue) | `gen_ai.system` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used. diff --git a/docs/gen-ai/gen-ai-metrics.md b/docs/gen-ai/gen-ai-metrics.md index 9f8e165cff..800741360b 100644 --- a/docs/gen-ai/gen-ai-metrics.md +++ b/docs/gen-ai/gen-ai-metrics.md @@ -71,22 +71,39 @@ This metric SHOULD be specified with [ExplicitBucketBoundaries] of [1, 4, 16, 64 | Attribute | Type | Description | Examples | [Requirement Level](https://opentelemetry.io/docs/specs/semconv/general/attribute-requirement-level/) | Stability | |---|---|---|---|---|---| -| [`gen_ai.operation.name`](/docs/attributes-registry/gen-ai.md) | string | The name of the operation being performed. | `chat`; `completion` | `Required` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`gen_ai.operation.name`](/docs/attributes-registry/gen-ai.md) | string | The name of the operation being performed. [1] | `chat`; `text_completion` | `Required` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | [`gen_ai.request.model`](/docs/attributes-registry/gen-ai.md) | string | The name of the GenAI model a request is being made to. | `gpt-4` | `Required` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| [`gen_ai.system`](/docs/attributes-registry/gen-ai.md) | string | The Generative AI product as identified by the client or server instrumentation. [1] | `openai` | `Required` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`gen_ai.system`](/docs/attributes-registry/gen-ai.md) | string | The Generative AI product as identified by the client or server instrumentation. [2] | `openai` | `Required` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | [`gen_ai.token.type`](/docs/attributes-registry/gen-ai.md) | string | The type of token being counted. | `input`; `output` | `Required` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| [`server.port`](/docs/attributes-registry/server.md) | int | Server port number. [2] | `80`; `8080`; `443` | `Conditionally Required` If `sever.address` is set. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +| [`server.port`](/docs/attributes-registry/server.md) | int | Server port number. [3] | `80`; `8080`; `443` | `Conditionally Required` If `sever.address` is set. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | | [`gen_ai.response.model`](/docs/attributes-registry/gen-ai.md) | string | The name of the model that generated the response. | `gpt-4-0613` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| [`server.address`](/docs/attributes-registry/server.md) | string | Server domain name if available without reverse DNS lookup; otherwise, IP address or Unix domain socket name. [3] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +| [`server.address`](/docs/attributes-registry/server.md) | string | Server domain name if available without reverse DNS lookup; otherwise, IP address or Unix domain socket name. [4] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | + +**[1]:** If one of the predefined values applies, but specific system uses a different name it's RECOMMENDED to document it in the semantic conventions for specific GenAI system and use system-specific name in the instrumentation. If a different name is not documented, instrumentation libraries SHOULD use applicable predefined value. + +**[2]:** The `gen_ai.system` describes a family of GenAI models with specific model identified +by `gen_ai.request.model` and `gen_ai.response.model` attributes. -**[1]:** The actual GenAI product may differ from the one identified by the client. For example, when using OpenAI client libraries to communicate with Mistral, the `gen_ai.system` is set to `openai` based on the instrumentation's best knowledge. -For custom model, a custom friendly name SHOULD be used. If none of these options apply, the `gen_ai.system` SHOULD be set to `_OTHER`. +The actual GenAI product may differ from the one identified by the client. +For example, when using OpenAI client libraries to communicate with Mistral, the `gen_ai.system` +is set to `openai` based on the instrumentation's best knowledge. -**[2]:** When observed from the client side, and when communicating through an intermediary, `server.port` SHOULD represent the server port behind any intermediaries, for example proxies, if it's available. +For custom model, a custom friendly name SHOULD be used. +If none of these options apply, the `gen_ai.system` SHOULD be set to `_OTHER`. + +**[3]:** When observed from the client side, and when communicating through an intermediary, `server.port` SHOULD represent the server port behind any intermediaries, for example proxies, if it's available. + +**[4]:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available. -**[3]:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available. +`gen_ai.operation.name` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used. + +| Value | Description | Stability | +|---|---|---| +| `chat` | Chat completion operation such as [OpenAI Chat API](https://platform.openai.com/docs/api-reference/chat) | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `text_completion` | Text completions operation such as [OpenAI Completions API (Legacy)](https://platform.openai.com/docs/api-reference/completions) | ![Experimental](https://img.shields.io/badge/-experimental-blue) | + `gen_ai.system` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used. @@ -144,24 +161,33 @@ This metric SHOULD be specified with [ExplicitBucketBoundaries] of [ 0.01, 0.02, | Attribute | Type | Description | Examples | [Requirement Level](https://opentelemetry.io/docs/specs/semconv/general/attribute-requirement-level/) | Stability | |---|---|---|---|---|---| -| [`gen_ai.operation.name`](/docs/attributes-registry/gen-ai.md) | string | The name of the operation being performed. | `chat`; `completion` | `Required` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`gen_ai.operation.name`](/docs/attributes-registry/gen-ai.md) | string | The name of the operation being performed. [1] | `chat`; `text_completion` | `Required` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | [`gen_ai.request.model`](/docs/attributes-registry/gen-ai.md) | string | The name of the GenAI model a request is being made to. | `gpt-4` | `Required` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| [`gen_ai.system`](/docs/attributes-registry/gen-ai.md) | string | The Generative AI product as identified by the client or server instrumentation. [1] | `openai` | `Required` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| [`error.type`](/docs/attributes-registry/error.md) | string | Describes a class of error the operation ended with. [2] | `timeout`; `java.net.UnknownHostException`; `server_certificate_invalid`; `500` | `Conditionally Required` if the operation ended in an error | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | -| [`server.port`](/docs/attributes-registry/server.md) | int | Server port number. [3] | `80`; `8080`; `443` | `Conditionally Required` If `sever.address` is set. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +| [`gen_ai.system`](/docs/attributes-registry/gen-ai.md) | string | The Generative AI product as identified by the client or server instrumentation. [2] | `openai` | `Required` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`error.type`](/docs/attributes-registry/error.md) | string | Describes a class of error the operation ended with. [3] | `timeout`; `java.net.UnknownHostException`; `server_certificate_invalid`; `500` | `Conditionally Required` if the operation ended in an error | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +| [`server.port`](/docs/attributes-registry/server.md) | int | Server port number. [4] | `80`; `8080`; `443` | `Conditionally Required` If `sever.address` is set. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | | [`gen_ai.response.model`](/docs/attributes-registry/gen-ai.md) | string | The name of the model that generated the response. | `gpt-4-0613` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| [`server.address`](/docs/attributes-registry/server.md) | string | Server domain name if available without reverse DNS lookup; otherwise, IP address or Unix domain socket name. [4] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +| [`server.address`](/docs/attributes-registry/server.md) | string | Server domain name if available without reverse DNS lookup; otherwise, IP address or Unix domain socket name. [5] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | + +**[1]:** If one of the predefined values applies, but specific system uses a different name it's RECOMMENDED to document it in the semantic conventions for specific GenAI system and use system-specific name in the instrumentation. If a different name is not documented, instrumentation libraries SHOULD use applicable predefined value. -**[1]:** The actual GenAI product may differ from the one identified by the client. For example, when using OpenAI client libraries to communicate with Mistral, the `gen_ai.system` is set to `openai` based on the instrumentation's best knowledge. -For custom model, a custom friendly name SHOULD be used. If none of these options apply, the `gen_ai.system` SHOULD be set to `_OTHER`. +**[2]:** The `gen_ai.system` describes a family of GenAI models with specific model identified +by `gen_ai.request.model` and `gen_ai.response.model` attributes. -**[2]:** The `error.type` SHOULD match the error code returned by the Generative AI provider or the client library, +The actual GenAI product may differ from the one identified by the client. +For example, when using OpenAI client libraries to communicate with Mistral, the `gen_ai.system` +is set to `openai` based on the instrumentation's best knowledge. + +For custom model, a custom friendly name SHOULD be used. +If none of these options apply, the `gen_ai.system` SHOULD be set to `_OTHER`. + +**[3]:** The `error.type` SHOULD match the error code returned by the Generative AI provider or the client library, the canonical name of exception that occurred, or another low-cardinality error identifier. Instrumentations SHOULD document the list of errors they report. -**[3]:** When observed from the client side, and when communicating through an intermediary, `server.port` SHOULD represent the server port behind any intermediaries, for example proxies, if it's available. +**[4]:** When observed from the client side, and when communicating through an intermediary, `server.port` SHOULD represent the server port behind any intermediaries, for example proxies, if it's available. -**[4]:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available. +**[5]:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available. @@ -172,6 +198,14 @@ Instrumentations SHOULD document the list of errors they report. | `_OTHER` | A fallback error value to be used when the instrumentation doesn't define a custom value. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +`gen_ai.operation.name` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used. + +| Value | Description | Stability | +|---|---|---| +| `chat` | Chat completion operation such as [OpenAI Chat API](https://platform.openai.com/docs/api-reference/chat) | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `text_completion` | Text completions operation such as [OpenAI Completions API (Legacy)](https://platform.openai.com/docs/api-reference/completions) | ![Experimental](https://img.shields.io/badge/-experimental-blue) | + + `gen_ai.system` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used. | Value | Description | Stability | @@ -227,24 +261,33 @@ This metric SHOULD be specified with [ExplicitBucketBoundaries] of | Attribute | Type | Description | Examples | [Requirement Level](https://opentelemetry.io/docs/specs/semconv/general/attribute-requirement-level/) | Stability | |---|---|---|---|---|---| -| [`gen_ai.operation.name`](/docs/attributes-registry/gen-ai.md) | string | The name of the operation being performed. | `chat`; `completion` | `Required` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`gen_ai.operation.name`](/docs/attributes-registry/gen-ai.md) | string | The name of the operation being performed. [1] | `chat`; `text_completion` | `Required` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | [`gen_ai.request.model`](/docs/attributes-registry/gen-ai.md) | string | The name of the GenAI model a request is being made to. | `gpt-4` | `Required` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| [`gen_ai.system`](/docs/attributes-registry/gen-ai.md) | string | The Generative AI product as identified by the client or server instrumentation. [1] | `openai` | `Required` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| [`error.type`](/docs/attributes-registry/error.md) | string | Describes a class of error the operation ended with. [2] | `timeout`; `java.net.UnknownHostException`; `server_certificate_invalid`; `500` | `Conditionally Required` if the operation ended in an error | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | -| [`server.port`](/docs/attributes-registry/server.md) | int | Server port number. [3] | `80`; `8080`; `443` | `Conditionally Required` If `sever.address` is set. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +| [`gen_ai.system`](/docs/attributes-registry/gen-ai.md) | string | The Generative AI product as identified by the client or server instrumentation. [2] | `openai` | `Required` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`error.type`](/docs/attributes-registry/error.md) | string | Describes a class of error the operation ended with. [3] | `timeout`; `java.net.UnknownHostException`; `server_certificate_invalid`; `500` | `Conditionally Required` if the operation ended in an error | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +| [`server.port`](/docs/attributes-registry/server.md) | int | Server port number. [4] | `80`; `8080`; `443` | `Conditionally Required` If `sever.address` is set. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | | [`gen_ai.response.model`](/docs/attributes-registry/gen-ai.md) | string | The name of the model that generated the response. | `gpt-4-0613` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| [`server.address`](/docs/attributes-registry/server.md) | string | Server domain name if available without reverse DNS lookup; otherwise, IP address or Unix domain socket name. [4] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +| [`server.address`](/docs/attributes-registry/server.md) | string | Server domain name if available without reverse DNS lookup; otherwise, IP address or Unix domain socket name. [5] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | + +**[1]:** If one of the predefined values applies, but specific system uses a different name it's RECOMMENDED to document it in the semantic conventions for specific GenAI system and use system-specific name in the instrumentation. If a different name is not documented, instrumentation libraries SHOULD use applicable predefined value. -**[1]:** The actual GenAI product may differ from the one identified by the client. For example, when using OpenAI client libraries to communicate with Mistral, the `gen_ai.system` is set to `openai` based on the instrumentation's best knowledge. -For custom model, a custom friendly name SHOULD be used. If none of these options apply, the `gen_ai.system` SHOULD be set to `_OTHER`. +**[2]:** The `gen_ai.system` describes a family of GenAI models with specific model identified +by `gen_ai.request.model` and `gen_ai.response.model` attributes. -**[2]:** The `error.type` SHOULD match the error code returned by the Generative AI service, +The actual GenAI product may differ from the one identified by the client. +For example, when using OpenAI client libraries to communicate with Mistral, the `gen_ai.system` +is set to `openai` based on the instrumentation's best knowledge. + +For custom model, a custom friendly name SHOULD be used. +If none of these options apply, the `gen_ai.system` SHOULD be set to `_OTHER`. + +**[3]:** The `error.type` SHOULD match the error code returned by the Generative AI service, the canonical name of exception that occurred, or another low-cardinality error identifier. Instrumentations SHOULD document the list of errors they report. -**[3]:** When observed from the client side, and when communicating through an intermediary, `server.port` SHOULD represent the server port behind any intermediaries, for example proxies, if it's available. +**[4]:** When observed from the client side, and when communicating through an intermediary, `server.port` SHOULD represent the server port behind any intermediaries, for example proxies, if it's available. -**[4]:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available. +**[5]:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available. @@ -255,6 +298,14 @@ Instrumentations SHOULD document the list of errors they report. | `_OTHER` | A fallback error value to be used when the instrumentation doesn't define a custom value. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +`gen_ai.operation.name` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used. + +| Value | Description | Stability | +|---|---|---| +| `chat` | Chat completion operation such as [OpenAI Chat API](https://platform.openai.com/docs/api-reference/chat) | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `text_completion` | Text completions operation such as [OpenAI Completions API (Legacy)](https://platform.openai.com/docs/api-reference/completions) | ![Experimental](https://img.shields.io/badge/-experimental-blue) | + + `gen_ai.system` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used. | Value | Description | Stability | @@ -310,20 +361,37 @@ This metric SHOULD be specified with [ExplicitBucketBoundaries] of | Attribute | Type | Description | Examples | [Requirement Level](https://opentelemetry.io/docs/specs/semconv/general/attribute-requirement-level/) | Stability | |---|---|---|---|---|---| -| [`gen_ai.operation.name`](/docs/attributes-registry/gen-ai.md) | string | The name of the operation being performed. | `chat`; `completion` | `Required` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`gen_ai.operation.name`](/docs/attributes-registry/gen-ai.md) | string | The name of the operation being performed. [1] | `chat`; `text_completion` | `Required` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | [`gen_ai.request.model`](/docs/attributes-registry/gen-ai.md) | string | The name of the GenAI model a request is being made to. | `gpt-4` | `Required` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| [`gen_ai.system`](/docs/attributes-registry/gen-ai.md) | string | The Generative AI product as identified by the client or server instrumentation. [1] | `openai` | `Required` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| [`server.port`](/docs/attributes-registry/server.md) | int | Server port number. [2] | `80`; `8080`; `443` | `Conditionally Required` If `sever.address` is set. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +| [`gen_ai.system`](/docs/attributes-registry/gen-ai.md) | string | The Generative AI product as identified by the client or server instrumentation. [2] | `openai` | `Required` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`server.port`](/docs/attributes-registry/server.md) | int | Server port number. [3] | `80`; `8080`; `443` | `Conditionally Required` If `sever.address` is set. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | | [`gen_ai.response.model`](/docs/attributes-registry/gen-ai.md) | string | The name of the model that generated the response. | `gpt-4-0613` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| [`server.address`](/docs/attributes-registry/server.md) | string | Server domain name if available without reverse DNS lookup; otherwise, IP address or Unix domain socket name. [3] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +| [`server.address`](/docs/attributes-registry/server.md) | string | Server domain name if available without reverse DNS lookup; otherwise, IP address or Unix domain socket name. [4] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | + +**[1]:** If one of the predefined values applies, but specific system uses a different name it's RECOMMENDED to document it in the semantic conventions for specific GenAI system and use system-specific name in the instrumentation. If a different name is not documented, instrumentation libraries SHOULD use applicable predefined value. + +**[2]:** The `gen_ai.system` describes a family of GenAI models with specific model identified +by `gen_ai.request.model` and `gen_ai.response.model` attributes. + +The actual GenAI product may differ from the one identified by the client. +For example, when using OpenAI client libraries to communicate with Mistral, the `gen_ai.system` +is set to `openai` based on the instrumentation's best knowledge. + +For custom model, a custom friendly name SHOULD be used. +If none of these options apply, the `gen_ai.system` SHOULD be set to `_OTHER`. + +**[3]:** When observed from the client side, and when communicating through an intermediary, `server.port` SHOULD represent the server port behind any intermediaries, for example proxies, if it's available. + +**[4]:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available. -**[1]:** The actual GenAI product may differ from the one identified by the client. For example, when using OpenAI client libraries to communicate with Mistral, the `gen_ai.system` is set to `openai` based on the instrumentation's best knowledge. -For custom model, a custom friendly name SHOULD be used. If none of these options apply, the `gen_ai.system` SHOULD be set to `_OTHER`. -**[2]:** When observed from the client side, and when communicating through an intermediary, `server.port` SHOULD represent the server port behind any intermediaries, for example proxies, if it's available. -**[3]:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available. +`gen_ai.operation.name` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used. +| Value | Description | Stability | +|---|---|---| +| `chat` | Chat completion operation such as [OpenAI Chat API](https://platform.openai.com/docs/api-reference/chat) | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `text_completion` | Text completions operation such as [OpenAI Completions API (Legacy)](https://platform.openai.com/docs/api-reference/completions) | ![Experimental](https://img.shields.io/badge/-experimental-blue) | `gen_ai.system` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used. @@ -380,20 +448,37 @@ This metric SHOULD be specified with [ExplicitBucketBoundaries] of | Attribute | Type | Description | Examples | [Requirement Level](https://opentelemetry.io/docs/specs/semconv/general/attribute-requirement-level/) | Stability | |---|---|---|---|---|---| -| [`gen_ai.operation.name`](/docs/attributes-registry/gen-ai.md) | string | The name of the operation being performed. | `chat`; `completion` | `Required` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`gen_ai.operation.name`](/docs/attributes-registry/gen-ai.md) | string | The name of the operation being performed. [1] | `chat`; `text_completion` | `Required` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | [`gen_ai.request.model`](/docs/attributes-registry/gen-ai.md) | string | The name of the GenAI model a request is being made to. | `gpt-4` | `Required` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| [`gen_ai.system`](/docs/attributes-registry/gen-ai.md) | string | The Generative AI product as identified by the client or server instrumentation. [1] | `openai` | `Required` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| [`server.port`](/docs/attributes-registry/server.md) | int | Server port number. [2] | `80`; `8080`; `443` | `Conditionally Required` If `sever.address` is set. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +| [`gen_ai.system`](/docs/attributes-registry/gen-ai.md) | string | The Generative AI product as identified by the client or server instrumentation. [2] | `openai` | `Required` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`server.port`](/docs/attributes-registry/server.md) | int | Server port number. [3] | `80`; `8080`; `443` | `Conditionally Required` If `sever.address` is set. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | | [`gen_ai.response.model`](/docs/attributes-registry/gen-ai.md) | string | The name of the model that generated the response. | `gpt-4-0613` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| [`server.address`](/docs/attributes-registry/server.md) | string | Server domain name if available without reverse DNS lookup; otherwise, IP address or Unix domain socket name. [3] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +| [`server.address`](/docs/attributes-registry/server.md) | string | Server domain name if available without reverse DNS lookup; otherwise, IP address or Unix domain socket name. [4] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | + +**[1]:** If one of the predefined values applies, but specific system uses a different name it's RECOMMENDED to document it in the semantic conventions for specific GenAI system and use system-specific name in the instrumentation. If a different name is not documented, instrumentation libraries SHOULD use applicable predefined value. -**[1]:** The actual GenAI product may differ from the one identified by the client. For example, when using OpenAI client libraries to communicate with Mistral, the `gen_ai.system` is set to `openai` based on the instrumentation's best knowledge. -For custom model, a custom friendly name SHOULD be used. If none of these options apply, the `gen_ai.system` SHOULD be set to `_OTHER`. +**[2]:** The `gen_ai.system` describes a family of GenAI models with specific model identified +by `gen_ai.request.model` and `gen_ai.response.model` attributes. -**[2]:** When observed from the client side, and when communicating through an intermediary, `server.port` SHOULD represent the server port behind any intermediaries, for example proxies, if it's available. +The actual GenAI product may differ from the one identified by the client. +For example, when using OpenAI client libraries to communicate with Mistral, the `gen_ai.system` +is set to `openai` based on the instrumentation's best knowledge. -**[3]:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available. +For custom model, a custom friendly name SHOULD be used. +If none of these options apply, the `gen_ai.system` SHOULD be set to `_OTHER`. + +**[3]:** When observed from the client side, and when communicating through an intermediary, `server.port` SHOULD represent the server port behind any intermediaries, for example proxies, if it's available. +**[4]:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available. + + + +`gen_ai.operation.name` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used. + +| Value | Description | Stability | +|---|---|---| +| `chat` | Chat completion operation such as [OpenAI Chat API](https://platform.openai.com/docs/api-reference/chat) | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `text_completion` | Text completions operation such as [OpenAI Completions API (Legacy)](https://platform.openai.com/docs/api-reference/completions) | ![Experimental](https://img.shields.io/badge/-experimental-blue) | `gen_ai.system` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used. diff --git a/docs/gen-ai/gen-ai-spans.md b/docs/gen-ai/gen-ai-spans.md index 1615d18c48..e50cd6a05d 100644 --- a/docs/gen-ai/gen-ai-spans.md +++ b/docs/gen-ai/gen-ai-spans.md @@ -10,6 +10,7 @@ linkTitle: Generative AI traces +- [Name](#name) - [Configuration](#configuration) - [GenAI attributes](#genai-attributes) - [Events](#events) @@ -20,8 +21,11 @@ A request to an Generative AI is modeled as a span in a trace. **Span kind:** MUST always be `CLIENT`. -The **span name** SHOULD be set to a low cardinality value describing an operation made to an Generative AI system. -For example, the API name such as [Create chat completion](https://platform.openai.com/docs/api-reference/chat/create) could be represented as `completion gpt-4` to include the API and the model name. +## Name + +GenAI spans MUST follow the overall [guidelines for span names](https://github.com/open-telemetry/opentelemetry-specification/tree/v1.33.0/specification/trace/api.md#span). +The **span name** SHOULD be `{gen_ai.operation.name} {gen_ai.request.model}`. +Semantic conventions for individual GenAI systems and frameworks MAY specify different span name format. ## Configuration @@ -45,8 +49,9 @@ These attributes track input data and metadata for a request to an GenAI model. | Attribute | Type | Description | Examples | [Requirement Level](https://opentelemetry.io/docs/specs/semconv/general/attribute-requirement-level/) | Stability | |---|---|---|---|---|---| -| [`gen_ai.request.model`](/docs/attributes-registry/gen-ai.md) | string | The name of the GenAI model a request is being made to. [1] | `gpt-4` | `Required` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| [`gen_ai.system`](/docs/attributes-registry/gen-ai.md) | string | The Generative AI product as identified by the client or server instrumentation. [2] | `openai` | `Required` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`gen_ai.operation.name`](/docs/attributes-registry/gen-ai.md) | string | The name of the operation being performed. [1] | `chat`; `text_completion` | `Required` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`gen_ai.request.model`](/docs/attributes-registry/gen-ai.md) | string | The name of the GenAI model a request is being made to. [2] | `gpt-4` | `Required` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`gen_ai.system`](/docs/attributes-registry/gen-ai.md) | string | The Generative AI product as identified by the client or server instrumentation. [3] | `openai` | `Required` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | [`gen_ai.request.frequency_penalty`](/docs/attributes-registry/gen-ai.md) | double | The frequency penalty setting for the GenAI request. | `0.1` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | [`gen_ai.request.max_tokens`](/docs/attributes-registry/gen-ai.md) | int | The maximum number of tokens the model generates for a request. | `100` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | [`gen_ai.request.presence_penalty`](/docs/attributes-registry/gen-ai.md) | double | The presence penalty setting for the GenAI request. | `0.1` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | @@ -56,18 +61,35 @@ These attributes track input data and metadata for a request to an GenAI model. | [`gen_ai.request.top_p`](/docs/attributes-registry/gen-ai.md) | double | The top_p sampling setting for the GenAI request. | `1.0` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | [`gen_ai.response.finish_reasons`](/docs/attributes-registry/gen-ai.md) | string[] | Array of reasons the model stopped generating tokens, corresponding to each generation received. | `["stop"]` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | [`gen_ai.response.id`](/docs/attributes-registry/gen-ai.md) | string | The unique identifier for the completion. | `chatcmpl-123` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| [`gen_ai.response.model`](/docs/attributes-registry/gen-ai.md) | string | The name of the model that generated the response. [3] | `gpt-4-0613` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`gen_ai.response.model`](/docs/attributes-registry/gen-ai.md) | string | The name of the model that generated the response. [4] | `gpt-4-0613` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | [`gen_ai.usage.input_tokens`](/docs/attributes-registry/gen-ai.md) | int | The number of tokens used in the GenAI input (prompt). | `100` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | [`gen_ai.usage.output_tokens`](/docs/attributes-registry/gen-ai.md) | int | The number of tokens used in the GenAI response (completion). | `180` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -**[1]:** The name of the GenAI model a request is being made to. If the model is supplied by a vendor, then the value must be the exact name of the model requested. If the model is a fine-tuned custom model, the value should have a more specific name than the base model that's been fine-tuned. +**[1]:** If one of the predefined values applies, but specific system uses a different name it's RECOMMENDED to document it in the semantic conventions for specific GenAI system and use system-specific name in the instrumentation. If a different name is not documented, instrumentation libraries SHOULD use applicable predefined value. + +**[2]:** The name of the GenAI model a request is being made to. If the model is supplied by a vendor, then the value must be the exact name of the model requested. If the model is a fine-tuned custom model, the value should have a more specific name than the base model that's been fine-tuned. + +**[3]:** The `gen_ai.system` describes a family of GenAI models with specific model identified +by `gen_ai.request.model` and `gen_ai.response.model` attributes. + +The actual GenAI product may differ from the one identified by the client. +For example, when using OpenAI client libraries to communicate with Mistral, the `gen_ai.system` +is set to `openai` based on the instrumentation's best knowledge. + +For custom model, a custom friendly name SHOULD be used. +If none of these options apply, the `gen_ai.system` SHOULD be set to `_OTHER`. -**[2]:** The actual GenAI product may differ from the one identified by the client. For example, when using OpenAI client libraries to communicate with Mistral, the `gen_ai.system` is set to `openai` based on the instrumentation's best knowledge. -For custom model, a custom friendly name SHOULD be used. If none of these options apply, the `gen_ai.system` SHOULD be set to `_OTHER`. +**[4]:** If available. The name of the GenAI model that provided the response. If the model is supplied by a vendor, then the value must be the exact name of the model actually used. If the model is a fine-tuned custom model, the value should have a more specific name than the base model that's been fine-tuned. -**[3]:** If available. The name of the GenAI model that provided the response. If the model is supplied by a vendor, then the value must be the exact name of the model actually used. If the model is a fine-tuned custom model, the value should have a more specific name than the base model that's been fine-tuned. +`gen_ai.operation.name` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used. + +| Value | Description | Stability | +|---|---|---| +| `chat` | Chat completion operation such as [OpenAI Chat API](https://platform.openai.com/docs/api-reference/chat) | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `text_completion` | Text completions operation such as [OpenAI Completions API (Legacy)](https://platform.openai.com/docs/api-reference/completions) | ![Experimental](https://img.shields.io/badge/-experimental-blue) | + `gen_ai.system` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used. diff --git a/model/registry/gen-ai.yaml b/model/registry/gen-ai.yaml index 06b994bfc0..ce6755a1db 100644 --- a/model/registry/gen-ai.yaml +++ b/model/registry/gen-ai.yaml @@ -26,7 +26,10 @@ groups: value: "cohere" brief: 'Cohere' brief: The Generative AI product as identified by the client or server instrumentation. - note: > + note: | + The `gen_ai.system` describes a family of GenAI models with specific model identified + by `gen_ai.request.model` and `gen_ai.response.model` attributes. + The actual GenAI product may differ from the one identified by the client. For example, when using OpenAI client libraries to communicate with Mistral, the `gen_ai.system` is set to `openai` based on the instrumentation's best knowledge. @@ -128,6 +131,18 @@ groups: examples: ["[{'role': 'assistant', 'content': 'The capital of France is Paris.'}]"] - id: operation.name stability: experimental - type: string + type: + members: + - id: chat + value: "chat" + brief: 'Chat completion operation such as [OpenAI Chat API](https://platform.openai.com/docs/api-reference/chat)' + stability: experimental + - id: text_completion + value: "text_completion" + brief: 'Text completions operation such as [OpenAI Completions API (Legacy)](https://platform.openai.com/docs/api-reference/completions)' + stability: experimental brief: The name of the operation being performed. - examples: ['chat', 'completion'] + note: > + If one of the predefined values applies, but specific system uses a different name it's RECOMMENDED to document it in the semantic + conventions for specific GenAI system and use system-specific name in the instrumentation. + If a different name is not documented, instrumentation libraries SHOULD use applicable predefined value. diff --git a/model/trace/gen-ai.yaml b/model/trace/gen-ai.yaml index 1c2418efab..d036c19755 100644 --- a/model/trace/gen-ai.yaml +++ b/model/trace/gen-ai.yaml @@ -12,6 +12,8 @@ groups: The name of the GenAI model a request is being made to. If the model is supplied by a vendor, then the value must be the exact name of the model requested. If the model is a fine-tuned custom model, the value should have a more specific name than the base model that's been fine-tuned. + - ref: gen_ai.operation.name + requirement_level: required - ref: gen_ai.request.max_tokens requirement_level: recommended - ref: gen_ai.request.temperature From f17a4bf973d32a0888a8219a408397f8eeac12ed Mon Sep 17 00:00:00 2001 From: jason plumb <75337021+breedx-splk@users.noreply.github.com> Date: Tue, 9 Jul 2024 17:37:04 -0700 Subject: [PATCH 3/6] [chore] Remove SHOULD for compiler name in `process.runtime.name`. (#1221) Co-authored-by: Liudmila Molkova --- docs/attributes-registry/process.md | 2 +- docs/resource/process.md | 2 +- model/registry/process.yaml | 3 +-- 3 files changed, 3 insertions(+), 4 deletions(-) diff --git a/docs/attributes-registry/process.md b/docs/attributes-registry/process.md index 6522b0ace3..9b47e08949 100644 --- a/docs/attributes-registry/process.md +++ b/docs/attributes-registry/process.md @@ -33,7 +33,7 @@ An operating system process. | `process.real_user.id` | int | The real user ID (RUID) of the process. | `1000` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | `process.real_user.name` | string | The username of the real user of the process. | `operator` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | `process.runtime.description` | string | An additional description about the runtime of the process, for example a specific vendor customization of the runtime environment. | `Eclipse OpenJ9 Eclipse OpenJ9 VM openj9-0.21.0` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| `process.runtime.name` | string | The name of the runtime of this process. For compiled native binaries, this SHOULD be the name of the compiler. | `OpenJDK Runtime Environment` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `process.runtime.name` | string | The name of the runtime of this process. | `OpenJDK Runtime Environment` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | `process.runtime.version` | string | The version of the runtime of this process, as returned by the runtime without modification. | `14.0.2` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | `process.saved_user.id` | int | The saved user ID (SUID) of the process. | `1002` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | `process.saved_user.name` | string | The username of the saved user. | `operator` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | diff --git a/docs/resource/process.md b/docs/resource/process.md index a081022d6f..ff17dfcfbf 100644 --- a/docs/resource/process.md +++ b/docs/resource/process.md @@ -91,7 +91,7 @@ On Windows and other systems where the native format of process commands is a si | Attribute | Type | Description | Examples | [Requirement Level](https://opentelemetry.io/docs/specs/semconv/general/attribute-requirement-level/) | Stability | |---|---|---|---|---|---| | [`process.runtime.description`](/docs/attributes-registry/process.md) | string | An additional description about the runtime of the process, for example a specific vendor customization of the runtime environment. | `Eclipse OpenJ9 Eclipse OpenJ9 VM openj9-0.21.0` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| [`process.runtime.name`](/docs/attributes-registry/process.md) | string | The name of the runtime of this process. For compiled native binaries, this SHOULD be the name of the compiler. | `OpenJDK Runtime Environment` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`process.runtime.name`](/docs/attributes-registry/process.md) | string | The name of the runtime of this process. | `OpenJDK Runtime Environment` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | [`process.runtime.version`](/docs/attributes-registry/process.md) | string | The version of the runtime of this process, as returned by the runtime without modification. | `14.0.2` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | diff --git a/model/registry/process.yaml b/model/registry/process.yaml index 7cb7451e66..d9c0575ae7 100644 --- a/model/registry/process.yaml +++ b/model/registry/process.yaml @@ -130,8 +130,7 @@ groups: type: string stability: experimental brief: > - The name of the runtime of this process. For compiled native binaries, - this SHOULD be the name of the compiler. + The name of the runtime of this process. examples: ['OpenJDK Runtime Environment'] - id: runtime.version type: string From 0e56677c841b4472001eb2acd6d74598b047ec6f Mon Sep 17 00:00:00 2001 From: Trask Stalnaker Date: Thu, 11 Jul 2024 15:25:32 -0700 Subject: [PATCH 4/6] Editorial: mention that Redis `db.namespace` is a string (#1233) --- docs/database/redis.md | 2 +- model/trace/database.yaml | 3 ++- 2 files changed, 3 insertions(+), 2 deletions(-) diff --git a/docs/database/redis.md b/docs/database/redis.md index bbe41a21c2..564e119af6 100644 --- a/docs/database/redis.md +++ b/docs/database/redis.md @@ -23,7 +23,7 @@ described on this page. | Attribute | Type | Description | Examples | [Requirement Level](https://opentelemetry.io/docs/specs/semconv/general/attribute-requirement-level/) | Stability | |---|---|---|---|---|---| -| [`db.namespace`](/docs/attributes-registry/db.md) | string | The index of the database being accessed as used in the [`SELECT` command](https://redis.io/commands/select). [1] | `0`; `1`; `15` | `Conditionally Required` If and only if it can be captured reliably. | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`db.namespace`](/docs/attributes-registry/db.md) | string | The index of the database being accessed as used in the [`SELECT` command](https://redis.io/commands/select) (captured as a string). [1] | `0`; `1`; `15` | `Conditionally Required` If and only if it can be captured reliably. | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | [`db.operation.name`](/docs/attributes-registry/db.md) | string | The name of the operation or command being executed. [2] | `findAndModify`; `HMSET`; `SELECT` | `Conditionally Required` [3] | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | [`error.type`](/docs/attributes-registry/error.md) | string | Describes a class of error the operation ended with. [4] | `timeout`; `java.net.UnknownHostException`; `server_certificate_invalid`; `500` | `Conditionally Required` If and only if the operation failed. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | | [`server.port`](/docs/attributes-registry/server.md) | int | Server port number. [5] | `80`; `8080`; `443` | `Conditionally Required` [6] | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | diff --git a/model/trace/database.yaml b/model/trace/database.yaml index 24d92636dd..8140688b97 100644 --- a/model/trace/database.yaml +++ b/model/trace/database.yaml @@ -162,7 +162,8 @@ groups: attributes: - ref: db.namespace brief: > - The index of the database being accessed as used in the [`SELECT` command](https://redis.io/commands/select). + The index of the database being accessed as used in the [`SELECT` command](https://redis.io/commands/select) + (captured as a string). requirement_level: conditionally_required: If and only if it can be captured reliably. note: > From 3f936c6a8f2a9141735526977b4d2e562e9d6e64 Mon Sep 17 00:00:00 2001 From: Trask Stalnaker Date: Thu, 11 Jul 2024 15:45:52 -0700 Subject: [PATCH 5/6] Define sampling relevant attributes for database client spans (#1166) Co-authored-by: Liudmila Molkova --- .chloggen/1166.yaml | 22 ++++++++++++++ docs/database/cassandra.md | 12 +++++++- docs/database/cosmosdb.md | 12 +++++++- docs/database/couchdb.md | 10 ++++++- docs/database/database-spans.md | 11 +++++++ docs/database/elasticsearch.md | 14 ++++++++- docs/database/hbase.md | 11 ++++++- docs/database/mongodb.md | 11 ++++++- docs/database/mssql.md | 12 +++++++- docs/database/redis.md | 11 ++++++- docs/database/sql.md | 9 ++++++ model/trace/database.yaml | 51 +++++++++++++++++++++++++++------ 12 files changed, 170 insertions(+), 16 deletions(-) create mode 100644 .chloggen/1166.yaml diff --git a/.chloggen/1166.yaml b/.chloggen/1166.yaml new file mode 100644 index 0000000000..2202ecc8d1 --- /dev/null +++ b/.chloggen/1166.yaml @@ -0,0 +1,22 @@ +# Use this changelog template to create an entry for release notes. +# +# If your change doesn't affect end users you should instead start +# your pull request title with [chore] or use the "Skip Changelog" label. + +# One of 'breaking', 'deprecation', 'new_component', 'enhancement', 'bug_fix' +change_type: breaking + +# The name of the area of concern in the attributes-registry, (e.g. http, cloud, db) +component: db + +# A brief description of the change. Surround your text with quotes ("") if it needs to start with a backtick (`). +note: Sampling relevant attributes defined for database client spans + +# Mandatory: One or more tracking issues related to the change. You can use the PR number here if no issue exists. +# The values here must be integers. +issues: [ 1019 ] + +# (Optional) One or more lines of additional information to render under the primary note. +# These lines will be padded with 2 spaces and then inserted directly into the document. +# Use pipe (|) for multiline entries. +subtext: diff --git a/docs/database/cassandra.md b/docs/database/cassandra.md index fec15f7cf0..3b8837df60 100644 --- a/docs/database/cassandra.md +++ b/docs/database/cassandra.md @@ -10,7 +10,7 @@ The Semantic Conventions for [Cassandra](https://cassandra.apache.org/) extend a that describe common database operations attributes in addition to the Semantic Conventions described on this page. -`db.system` MUST be set to `"cassandra"`. +`db.system` MUST be set to `"cassandra"` and SHOULD be provided **at span creation time**. ## Attributes @@ -75,6 +75,16 @@ If a parameter has no name and instead is referenced only by index, then `` +The following attributes can be important for making sampling decisions +and SHOULD be provided **at span creation time** (if provided at all): + +* [`db.collection.name`](/docs/attributes-registry/db.md) +* [`db.namespace`](/docs/attributes-registry/db.md) +* [`db.operation.name`](/docs/attributes-registry/db.md) +* [`db.query.text`](/docs/attributes-registry/db.md) +* [`server.address`](/docs/attributes-registry/server.md) +* [`server.port`](/docs/attributes-registry/server.md) + `db.cassandra.consistency_level` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used. | Value | Description | Stability | diff --git a/docs/database/cosmosdb.md b/docs/database/cosmosdb.md index 6fa48db887..b6f4330082 100644 --- a/docs/database/cosmosdb.md +++ b/docs/database/cosmosdb.md @@ -13,7 +13,7 @@ described on this page. ## Attributes -`db.system` MUST be set to `"cosmosdb"`. +`db.system` MUST be set to `"cosmosdb"` and SHOULD be provided **at span creation time**. Cosmos DB instrumentation includes call-level (public API) surface spans and network spans. Depending on the connection mode (Gateway or Direct), network-level spans may also be created. @@ -76,6 +76,16 @@ If a parameter has no name and instead is referenced only by index, then `` +The following attributes can be important for making sampling decisions +and SHOULD be provided **at span creation time** (if provided at all): + +* [`db.collection.name`](/docs/attributes-registry/db.md) +* [`db.namespace`](/docs/attributes-registry/db.md) +* [`db.operation.name`](/docs/attributes-registry/db.md) +* [`db.query.text`](/docs/attributes-registry/db.md) +* [`server.address`](/docs/attributes-registry/server.md) +* [`server.port`](/docs/attributes-registry/server.md) + `db.cosmosdb.connection_mode` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used. | Value | Description | Stability | diff --git a/docs/database/couchdb.md b/docs/database/couchdb.md index 3aa0ad625c..7d33f249b4 100644 --- a/docs/database/couchdb.md +++ b/docs/database/couchdb.md @@ -10,7 +10,7 @@ The Semantic Conventions for [CouchDB](https://couchdb.apache.org/) extend and o that describe common database operations attributes in addition to the Semantic Conventions described on this page. -`db.system` MUST be set to `"couchdb"`. +`db.system` MUST be set to `"couchdb"` and SHOULD be provided **at span creation time**. ## Attributes @@ -43,6 +43,14 @@ described on this page. +The following attributes can be important for making sampling decisions +and SHOULD be provided **at span creation time** (if provided at all): + +* [`db.namespace`](/docs/attributes-registry/db.md) +* [`db.operation.name`](/docs/attributes-registry/db.md) +* [`server.address`](/docs/attributes-registry/server.md) +* [`server.port`](/docs/attributes-registry/server.md) + `error.type` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used. | Value | Description | Stability | diff --git a/docs/database/database-spans.md b/docs/database/database-spans.md index 88093233d8..3a38d4def6 100644 --- a/docs/database/database-spans.md +++ b/docs/database/database-spans.md @@ -142,6 +142,17 @@ If a parameter has no name and instead is referenced only by index, then `` +The following attributes can be important for making sampling decisions +and SHOULD be provided **at span creation time** (if provided at all): + +* [`db.collection.name`](/docs/attributes-registry/db.md) +* [`db.namespace`](/docs/attributes-registry/db.md) +* [`db.operation.name`](/docs/attributes-registry/db.md) +* [`db.query.text`](/docs/attributes-registry/db.md) +* [`db.system`](/docs/attributes-registry/db.md) +* [`server.address`](/docs/attributes-registry/server.md) +* [`server.port`](/docs/attributes-registry/server.md) + `db.system` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used. | Value | Description | Stability | diff --git a/docs/database/elasticsearch.md b/docs/database/elasticsearch.md index d881f79ace..0c6fcb3c98 100644 --- a/docs/database/elasticsearch.md +++ b/docs/database/elasticsearch.md @@ -10,7 +10,7 @@ The Semantic Conventions for [Elasticsearch](https://www.elastic.co/) extend and that describe common database operations attributes in addition to the Semantic Conventions described on this page. -`db.system` MUST be set to `"elasticsearch"`. +`db.system` MUST be set to `"elasticsearch"` and SHOULD be provided **at span creation time**. ## Span Name @@ -84,6 +84,18 @@ Even though parameterized query text can potentially have sensitive data, by usi +The following attributes can be important for making sampling decisions +and SHOULD be provided **at span creation time** (if provided at all): + +* [`db.collection.name`](/docs/attributes-registry/db.md) +* [`db.namespace`](/docs/attributes-registry/db.md) +* [`db.operation.name`](/docs/attributes-registry/db.md) +* [`db.query.text`](/docs/attributes-registry/db.md) +* [`http.request.method`](/docs/attributes-registry/http.md) +* [`server.address`](/docs/attributes-registry/server.md) +* [`server.port`](/docs/attributes-registry/server.md) +* [`url.full`](/docs/attributes-registry/url.md) + `error.type` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used. | Value | Description | Stability | diff --git a/docs/database/hbase.md b/docs/database/hbase.md index 24573caae9..6e6419862b 100644 --- a/docs/database/hbase.md +++ b/docs/database/hbase.md @@ -10,7 +10,7 @@ The Semantic Conventions for [HBase](https://hbase.apache.org/) extend and overr that describe common database operations attributes in addition to the Semantic Conventions described on this page. -`db.system` MUST be set to `"hbase"`. +`db.system` MUST be set to `"hbase"` and SHOULD be provided **at span creation time**. ## Attributes @@ -50,6 +50,15 @@ For batch operations, if the individual operations are known to have the same op +The following attributes can be important for making sampling decisions +and SHOULD be provided **at span creation time** (if provided at all): + +* [`db.collection.name`](/docs/attributes-registry/db.md) +* [`db.namespace`](/docs/attributes-registry/db.md) +* [`db.operation.name`](/docs/attributes-registry/db.md) +* [`server.address`](/docs/attributes-registry/server.md) +* [`server.port`](/docs/attributes-registry/server.md) + `error.type` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used. | Value | Description | Stability | diff --git a/docs/database/mongodb.md b/docs/database/mongodb.md index 50ff165787..ff3f18891c 100644 --- a/docs/database/mongodb.md +++ b/docs/database/mongodb.md @@ -10,7 +10,7 @@ The Semantic Conventions for [MongoDB](https://www.mongodb.com/) extend and over that describe common database operations attributes in addition to the Semantic Conventions described on this page. -`db.system` MUST be set to `"mongodb"`. +`db.system` MUST be set to `"mongodb"` and SHOULD be provided **at span creation time**. ## Attributes @@ -48,6 +48,15 @@ For batch operations, if the individual operations are known to have the same co +The following attributes can be important for making sampling decisions +and SHOULD be provided **at span creation time** (if provided at all): + +* [`db.collection.name`](/docs/attributes-registry/db.md) +* [`db.namespace`](/docs/attributes-registry/db.md) +* [`db.operation.name`](/docs/attributes-registry/db.md) +* [`server.address`](/docs/attributes-registry/server.md) +* [`server.port`](/docs/attributes-registry/server.md) + `error.type` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used. | Value | Description | Stability | diff --git a/docs/database/mssql.md b/docs/database/mssql.md index 6962223af7..8ea95cbe8a 100644 --- a/docs/database/mssql.md +++ b/docs/database/mssql.md @@ -10,7 +10,7 @@ The Semantic Conventions for the *Microsoft SQL Server* extend and override the that describe common database operations attributes in addition to the Semantic Conventions described on this page. -`db.system` MUST be set to `"mssql"`. +`db.system` MUST be set to `"mssql"` and SHOULD be provided **at span creation time**. ## Attributes @@ -65,6 +65,16 @@ If a parameter has no name and instead is referenced only by index, then `` +The following attributes can be important for making sampling decisions +and SHOULD be provided **at span creation time** (if provided at all): + +* [`db.collection.name`](/docs/attributes-registry/db.md) +* [`db.namespace`](/docs/attributes-registry/db.md) +* [`db.operation.name`](/docs/attributes-registry/db.md) +* [`db.query.text`](/docs/attributes-registry/db.md) +* [`server.address`](/docs/attributes-registry/server.md) +* [`server.port`](/docs/attributes-registry/server.md) + `error.type` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used. | Value | Description | Stability | diff --git a/docs/database/redis.md b/docs/database/redis.md index 564e119af6..71ebcf9ee9 100644 --- a/docs/database/redis.md +++ b/docs/database/redis.md @@ -10,7 +10,7 @@ The Semantic Conventions for [Redis](https://redis.com/) extend and override the that describe common database operations attributes in addition to the Semantic Conventions described on this page. -`db.system` MUST be set to `"redis"`. +`db.system` MUST be set to `"redis"` and SHOULD be provided **at span creation time**. ## Attributes @@ -63,6 +63,15 @@ If a parameter has no name and instead is referenced only by index, then `` +The following attributes can be important for making sampling decisions +and SHOULD be provided **at span creation time** (if provided at all): + +* [`db.namespace`](/docs/attributes-registry/db.md) +* [`db.operation.name`](/docs/attributes-registry/db.md) +* [`db.query.text`](/docs/attributes-registry/db.md) +* [`server.address`](/docs/attributes-registry/server.md) +* [`server.port`](/docs/attributes-registry/server.md) + `error.type` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used. | Value | Description | Stability | diff --git a/docs/database/sql.md b/docs/database/sql.md index 524c5878bd..7a2b57f3e0 100644 --- a/docs/database/sql.md +++ b/docs/database/sql.md @@ -104,6 +104,15 @@ If a parameter has no name and instead is referenced only by index, then `` +The following attributes can be important for making sampling decisions +and SHOULD be provided **at span creation time** (if provided at all): + +* [`db.collection.name`](/docs/attributes-registry/db.md) +* [`db.operation.name`](/docs/attributes-registry/db.md) +* [`db.query.text`](/docs/attributes-registry/db.md) +* [`server.address`](/docs/attributes-registry/server.md) +* [`server.port`](/docs/attributes-registry/server.md) + `error.type` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used. | Value | Description | Stability | diff --git a/model/trace/database.yaml b/model/trace/database.yaml index 8140688b97..36f321d03c 100644 --- a/model/trace/database.yaml +++ b/model/trace/database.yaml @@ -1,10 +1,26 @@ groups: - - id: trace.db.common.query + - id: trace.db.common.minimal extends: attributes.db.client.minimal type: attribute_group brief: This group defines the attributes used to perform database client calls. + attributes: + # TODO: add db.system once https://github.com/open-telemetry/build-tools/issues/192 is possible + # - ref: db.system + # sampling_relevant: true + - ref: db.operation.name + sampling_relevant: true + - ref: server.address + sampling_relevant: true + - ref: server.port + sampling_relevant: true + + - id: trace.db.common.query + extends: trace.db.common.minimal + type: attribute_group + brief: This group defines the attributes used to perform database client calls. attributes: - ref: db.query.text + sampling_relevant: true requirement_level: recommended: > Non-parameterized query text SHOULD NOT be collected by default unless there is sanitization that excludes @@ -17,11 +33,12 @@ groups: requirement_level: opt_in - id: trace.db.common.query_and_collection - extends: attributes.db.client.minimal + extends: trace.db.common.minimal type: attribute_group brief: This group defines the attributes used to perform database client calls. attributes: - ref: db.query.text + sampling_relevant: true requirement_level: recommended: > SHOULD be collected by default only if there is sanitization that excludes sensitive information. @@ -29,6 +46,7 @@ groups: - ref: db.query.parameter requirement_level: opt_in - ref: db.collection.name + sampling_relevant: true requirement_level: conditionally_required: > If readily available. The collection name MAY be parsed from the query text, @@ -52,9 +70,11 @@ groups: requirement_level: recommended: if and only if `network.peer.address` is set. - ref: db.system + sampling_relevant: true # TODO: Not adding to the minimal because of https://github.com/open-telemetry/build-tools/issues/192 requirement_level: required - ref: db.namespace + sampling_relevant: true requirement_level: conditionally_required: If available. @@ -72,6 +92,7 @@ groups: Attributes for Microsoft SQL Server attributes: - ref: db.namespace + sampling_relevant: true brief: The name of the database, fully qualified within the server address and port. note: When connecting to a default instance, `db.namespace` SHOULD be set to the name of @@ -88,6 +109,7 @@ groups: Attributes for Cassandra attributes: - ref: db.namespace + sampling_relevant: true brief: The Cassandra keyspace name. note: For commands that switch the keyspace, this SHOULD be set to the target keyspace (even if the command fails). examples: ["mykeyspace"] @@ -112,11 +134,12 @@ groups: recommended: if and only if `network.peer.address` is set. - id: db.hbase type: span - extends: attributes.db.client.minimal + extends: trace.db.common.minimal brief: > Attributes for HBase attributes: - ref: db.namespace + sampling_relevant: true brief: The HBase namespace. requirement_level: conditionally_required: If applicable. @@ -126,6 +149,7 @@ groups: instrumentation SHOULD set `db.namespace` value to `default`. examples: ['mynamespace'] - ref: db.collection.name + sampling_relevant: true brief: The HBase table name. requirement_level: conditionally_required: If applicable. @@ -135,11 +159,12 @@ groups: - id: db.couchdb type: span - extends: attributes.db.client.minimal + extends: trace.db.common.minimal brief: > Attributes for CouchDB attributes: - ref: db.operation.name + sampling_relevant: true brief: > The HTTP method + the target REST route. examples: ['GET /{db}/{docid}'] @@ -150,6 +175,7 @@ groups: (literally, i.e., without replacing the placeholders with concrete values): [`GET /{db}/{docid}`](https://docs.couchdb.org/en/stable/api/document/common.html#get--db-docid). - ref: db.namespace + sampling_relevant: true requirement_level: conditionally_required: If available. note: "" # overriding the base note @@ -161,6 +187,7 @@ groups: Attributes for Redis attributes: - ref: db.namespace + sampling_relevant: true brief: > The index of the database being accessed as used in the [`SELECT` command](https://redis.io/commands/select) (captured as a string). @@ -176,6 +203,7 @@ groups: For commands that switch the database, this SHOULD be set to the target database (even if the command fails). examples: ["0", "1", "15"] - ref: db.query.text + sampling_relevant: true brief: > The full syntax of the Redis CLI command. examples: ["HMSET myhash field1 'Hello' field2 'World'"] @@ -194,21 +222,24 @@ groups: - id: db.mongodb type: span - extends: attributes.db.client.minimal + extends: trace.db.common.minimal brief: > Attributes for MongoDB attributes: - ref: db.operation.name + sampling_relevant: true brief: > The name of the command being executed. note: > See [MongoDB database commands](https://www.mongodb.com/docs/manual/reference/command/). examples: ['findAndModify', 'getMore', 'update'] - ref: db.collection.name + sampling_relevant: true brief: The MongoDB collection being accessed within the database stated in `db.namespace`. requirement_level: required - ref: db.namespace + sampling_relevant: true brief: The MongoDB database name. requirement_level: conditionally_required: If available. @@ -216,11 +247,12 @@ groups: - id: db.elasticsearch type: span - extends: attributes.db.client.minimal + extends: trace.db.common.minimal brief: > Attributes for Elasticsearch attributes: - ref: http.request.method + sampling_relevant: true requirement_level: required - ref: db.operation.name requirement_level: required @@ -229,9 +261,11 @@ groups: (see the [Elasticsearch schema](https://raw.githubusercontent.com/elastic/elasticsearch-specification/main/output/schema/schema.json)). examples: [ 'search', 'ml.close_job', 'cat.aliases' ] - ref: url.full + sampling_relevant: true requirement_level: required examples: [ 'https://localhost:9200/index/_search?q=user.id:kimchy' ] - ref: db.query.text + sampling_relevant: true requirement_level: recommended: > Should be collected by default for search-type queries and only if there is sanitization that excludes @@ -239,15 +273,15 @@ groups: brief: The request body for a [search-type query](https://www.elastic.co/guide/en/elasticsearch/reference/current/search.html), as a json string. examples: [ '"{\"query\":{\"term\":{\"user.id\":\"kimchy\"}}}"' ] - ref: db.collection.name + sampling_relevant: true requirement_level: recommended brief: The index or data stream against which the query is executed. note: > The query may target multiple indices or data streams, in which case it SHOULD be a comma separated list of those. If the query doesn't target a specific index, this field MUST NOT be set. examples: [ 'my_index', 'index1, index2' ] - - ref: server.address - - ref: server.port - ref: db.namespace + sampling_relevant: true note: > When communicating with an Elastic Cloud deployment, this should be collected from the "X-Found-Handling-Cluster" HTTP response header. brief: The name of the Elasticsearch cluster which the client connects to. @@ -342,6 +376,7 @@ groups: requirement_level: conditionally_required: when available - ref: db.namespace + sampling_relevant: true requirement_level: conditionally_required: If available. note: "" # overriding the base note From f79fe6b61438bfa0804df9c254c402fd0d71f42a Mon Sep 17 00:00:00 2001 From: Daniel Jaglowski Date: Thu, 11 Jul 2024 18:10:55 -0500 Subject: [PATCH 6/6] Add 'log.record.original' attribute (#1217) Co-authored-by: Tigran Najaryan <4194920+tigrannajaryan@users.noreply.github.com> Co-authored-by: Joao Grassi <5938087+joaopgrassi@users.noreply.github.com> Co-authored-by: Liudmila Molkova --- .chloggen/log-original-body.yaml | 22 ++++++++++++++++++++++ docs/attributes-registry/log.md | 11 +++++++---- docs/general/logs.md | 7 +++++-- model/logs/general.yaml | 2 ++ model/registry/log.yaml | 11 +++++++++++ 5 files changed, 47 insertions(+), 6 deletions(-) create mode 100755 .chloggen/log-original-body.yaml diff --git a/.chloggen/log-original-body.yaml b/.chloggen/log-original-body.yaml new file mode 100755 index 0000000000..858bf4242e --- /dev/null +++ b/.chloggen/log-original-body.yaml @@ -0,0 +1,22 @@ +# Use this changelog template to create an entry for release notes. +# +# If your change doesn't affect end users you should instead start +# your pull request title with [chore] or use the "Skip Changelog" label. + +# One of 'breaking', 'deprecation', 'new_component', 'enhancement', 'bug_fix' +change_type: enhancement + +# The name of the area of concern in the attributes-registry, (e.g. http, cloud, db) +component: log + +# A brief description of the change. Surround your text with quotes ("") if it needs to start with a backtick (`). +note: Add 'log.record.original' attribute. + +# Mandatory: One or more tracking issues related to the change. You can use the PR number here if no issue exists. +# The values here must be integers. +issues: [1137] + +# (Optional) One or more lines of additional information to render under the primary note. +# These lines will be padded with 2 spaces and then inserted directly into the document. +# Use pipe (|) for multiline entries. +subtext: diff --git a/docs/attributes-registry/log.md b/docs/attributes-registry/log.md index 837a982cb5..7aeff8c921 100644 --- a/docs/attributes-registry/log.md +++ b/docs/attributes-registry/log.md @@ -40,9 +40,12 @@ Attributes for a file to which log was emitted. This document defines the generic attributes that may be used in any Log Record. -| Attribute | Type | Description | Examples | Stability | -| ---------------- | ------ | ------------------------------------------- | ---------------------------- | ---------------------------------------------------------------- | -| `log.record.uid` | string | A unique identifier for the Log Record. [1] | `01ARZ3NDEKTSV4RRFFQ69G5FAV` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| Attribute | Type | Description | Examples | Stability | +| --------------------- | ------ | ------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------- | +| `log.record.original` | string | The complete orignal Log Record. [1] | `77 <86>1 2015-08-06T21:58:59.694Z 192.168.2.133 inactive - - - Something happened`; `[INFO] 8/3/24 12:34:56 Something happened` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `log.record.uid` | string | A unique identifier for the Log Record. [2] | `01ARZ3NDEKTSV4RRFFQ69G5FAV` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -**[1]:** If an id is provided, other log records with the same id will be considered duplicates and can be removed safely. This means, that two distinguishable log records MUST have different values. +**[1]:** This value MAY be added when processing a Log Record which was originally transmitted as a string or equivalent data type AND the Body field of the Log Record does not contain the same value. (e.g. a syslog or a log record read from a file.) + +**[2]:** If an id is provided, other log records with the same id will be considered duplicates and can be removed safely. This means, that two distinguishable log records MUST have different values. The id MAY be an [Universally Unique Lexicographically Sortable Identifier (ULID)](https://github.com/ulid/spec), but other identifiers (e.g. UUID) may be used as needed. diff --git a/docs/general/logs.md b/docs/general/logs.md index 499b719841..b48b379edd 100644 --- a/docs/general/logs.md +++ b/docs/general/logs.md @@ -44,9 +44,12 @@ These attributes may be used for identifying a Log Record. | Attribute | Type | Description | Examples | [Requirement Level](https://opentelemetry.io/docs/specs/semconv/general/attribute-requirement-level/) | Stability | |---|---|---|---|---|---| -| [`log.record.uid`](/docs/attributes-registry/log.md) | string | A unique identifier for the Log Record. [1] | `01ARZ3NDEKTSV4RRFFQ69G5FAV` | `Opt-In` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`log.record.original`](/docs/attributes-registry/log.md) | string | The complete orignal Log Record. [1] | `77 <86>1 2015-08-06T21:58:59.694Z 192.168.2.133 inactive - - - Something happened`; `[INFO] 8/3/24 12:34:56 Something happened` | `Opt-In` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`log.record.uid`](/docs/attributes-registry/log.md) | string | A unique identifier for the Log Record. [2] | `01ARZ3NDEKTSV4RRFFQ69G5FAV` | `Opt-In` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -**[1]:** If an id is provided, other log records with the same id will be considered duplicates and can be removed safely. This means, that two distinguishable log records MUST have different values. +**[1]:** This value MAY be added when processing a Log Record which was originally transmitted as a string or equivalent data type AND the Body field of the Log Record does not contain the same value. (e.g. a syslog or a log record read from a file.) + +**[2]:** If an id is provided, other log records with the same id will be considered duplicates and can be removed safely. This means, that two distinguishable log records MUST have different values. The id MAY be an [Universally Unique Lexicographically Sortable Identifier (ULID)](https://github.com/ulid/spec), but other identifiers (e.g. UUID) may be used as needed. diff --git a/model/logs/general.yaml b/model/logs/general.yaml index b4afe2c16c..d4835d6982 100644 --- a/model/logs/general.yaml +++ b/model/logs/general.yaml @@ -6,3 +6,5 @@ groups: attributes: - ref: log.record.uid requirement_level: opt_in + - ref: log.record.original + requirement_level: opt_in diff --git a/model/registry/log.yaml b/model/registry/log.yaml index b26870a696..bc18100451 100644 --- a/model/registry/log.yaml +++ b/model/registry/log.yaml @@ -70,3 +70,14 @@ groups: The id MAY be an [Universally Unique Lexicographically Sortable Identifier (ULID)](https://github.com/ulid/spec), but other identifiers (e.g. UUID) may be used as needed. examples: ["01ARZ3NDEKTSV4RRFFQ69G5FAV"] + - id: original + type: string + stability: experimental + brief: > + The complete orignal Log Record. + note: > + This value MAY be added when processing a Log Record which was originally transmitted as a string or equivalent data type AND + the Body field of the Log Record does not contain the same value. (e.g. a syslog or a log record read from a file.) + examples: + - "77 <86>1 2015-08-06T21:58:59.694Z 192.168.2.133 inactive - - - Something happened" + - "[INFO] 8/3/24 12:34:56 Something happened"