Template types: define type-safe examples #1740

Open
jerbly opened this issue Jan 13, 2025 · 3 comments
Labels
schema Semantic Conventions schema definition tooling Regarding build, workflows, build-tools, ...

Comments

@jerbly
Contributor

jerbly commented Jan 13, 2025

Template types are already producing non-compliant output in real-world use. For example, http.request.header.KEY is defined with type: template[string[]], yet it is common to see a single string rather than an array of strings in the data sent out. Some examples already defined in the conventions also suggest that a mixed type for the dictionary values would be preferable.

Secondly, the definition of template type examples (with or without a mixed type) is not strongly typed. This makes the examples hard, if not impossible, to check with weaver, so their correctness cannot be ensured.

Here are some examples from the current semconv definitions:

Mismatched key (the example uses rpc.request.metadata rather than the id rpc.connect_rpc.request.metadata):

      - id: rpc.connect_rpc.request.metadata
        type: template[string[]]
        examples: ['rpc.request.metadata.my-custom-metadata-attribute=["1.2.3.4", "1.2.3.5"]']

Missing keys:

      - id: db.operation.parameter
        type: template[string]
        examples: ["someval", "55"]

Mixed types:

      - id: db.elasticsearch.path_parts
        type: template[string]
        examples:
          [
            "db.elasticsearch.path_parts.index=test-index",
            "db.elasticsearch.path_parts.doc_id=123",
          ]
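The first two problem classes above can be detected mechanically even with the current free-form strings. The following is a minimal sketch, not part of weaver; the function name `classify_legacy_example` and its return labels are hypothetical, and it assumes legacy examples are meant to take the form `<id>.<key>=<value>`.

```python
def classify_legacy_example(attr_id: str, example: str) -> str:
    """Classify a legacy free-form template example string.

    Returns "ok", "missing key", or "mismatched key".
    Assumes the intended shape is "<attr_id>.<key>=<value>".
    """
    name, sep, _value = example.partition("=")
    if not sep:
        # No "name=value" shape at all, so the template key is never stated.
        return "missing key"
    if not name.startswith(attr_id + "."):
        # The example names an attribute outside this template's namespace.
        return "mismatched key"
    return "ok"
```

Note that the third class (mixed types) cannot be caught this way: in "db.elasticsearch.path_parts.doc_id=123" the value is just text after the equals sign, so there is no way to tell whether 123 was meant as a string or an int.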

Suggestions:

Allow mixed type values within template types. Either remove the strict typing:

    template[string]
    template[int]
    template[boolean]
    template[double]
    template[string[]]
    template[int[]]
    template[boolean[]]
    template[double[]]

and replace them all with just template.

Or, add a mixed-type definition: template[any]

Make the type and correctness of examples checkable:

    groups:
      - id: rpc.connect_rpc.request.metadata
        type: template[string[]]
        # examples: ['rpc.request.metadata.my-custom-metadata-attribute=["1.2.3.4", "1.2.3.5"]']
        examples:
          - key: "my-custom-metadata-attribute"
            value: ["1.2.3.4", "1.2.3.5"]
      - id: db.elasticsearch.path_parts
        #type: template[string]
        type: template[any]
        # examples:
        #   [
        #     "db.elasticsearch.path_parts.index=test-index",
        #     "db.elasticsearch.path_parts.doc_id=123",
        #   ]
        examples:
          - key: "index"
            value: "test-index"
          - key: "doc_id"
            value: 123

Note: I've included a mixed-type case here.
The generated documentation can combine the key with the id, preventing mistakes like the one in the rpc example above.

@jerbly jerbly changed the title Proposal: Template types - allow mixed types and define examples in type-safe way Proposal: Template types - allow mixed types and define type-safe examples Jan 13, 2025
@lmolkova
Contributor

lmolkova commented Jan 13, 2025

Thanks for digging into this!

I do agree on the examples side: template examples are not strongly typed and are confusing.
A template attribute is effectively a flattened map, but I believe we do want the values to be strongly typed.

E.g. http.request.header.baggage can only be recorded as an array of strings (since HTTP supports multiple headers with the same name and baggage is one of those potentially-multi-value headers).

If someone reports Content-Length as a single value http.request.header.content-length="42" or as an array of ints http.request.header.content-length=[42], it's a bug on the instrumentation side, not in the semantic conventions.
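A minimal sketch of how such instrumentation bugs could be spotted in recorded data, assuming attributes are available as a plain dict of name to value. The function name `find_template_violations` is hypothetical, and the check is hard-coded to the string[] case used by http.request.header.

```python
def find_template_violations(attributes: dict, prefix: str) -> list:
    """Return names under `prefix.` whose value is not a list of strings."""
    bad = []
    for name, value in attributes.items():
        if not name.startswith(prefix + "."):
            continue  # not part of this template attribute
        is_string_array = isinstance(value, list) and all(
            isinstance(item, str) for item in value
        )
        if not is_string_array:
            bad.append(name)
    return bad
```

Both forms called out above, the bare string "42" and the int array [42], would be reported by this check, while ["42"] would pass.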

From a schema perspective, I wonder if we should tackle template attributes as part of #1669. We could just as well record them as

    - id: http.request.headers
      type: map[string, string[]]

and it would be exported as

    {
      ...
      "attributes": {
        "http.request.headers": {
            "foo": ["bar"],
            "baz": ["...", "..."]
        }
      }
      ...
    }
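The relationship between the flattened template form and this map form can be sketched as a simple fold. This is only an illustration under the assumption that attributes are held in a plain dict; `fold_template_attributes` is a hypothetical helper, not an existing API.

```python
def fold_template_attributes(attributes: dict, prefix: str, map_name: str) -> dict:
    """Fold flattened `prefix.KEY` attributes into one map-valued attribute."""
    folded = {}
    result = {}
    for name, value in attributes.items():
        if name.startswith(prefix + "."):
            # Strip "prefix." so only the template key remains.
            folded[name[len(prefix) + 1:]] = value
        else:
            result[name] = value
    if folded:
        result[map_name] = folded
    return result
```

For example, folding http.request.header.foo and http.request.header.baz with prefix "http.request.header" and map name "http.request.headers" would yield a single http.request.headers attribute with keys foo and baz, leaving unrelated attributes untouched.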

I believe we don't have full consensus on using complex attribute types on spans, so the actionable part seems to be improving the examples.

> However, it is common to see a single string rather than an array of strings in the data sent out.

Does this data have any indication of where it came from in the instrumentation scope? Maybe we can create issues for those instrumentations to fix it?

@lmolkova lmolkova added schema Semantic Conventions schema definition tooling Regarding build, workflows, build-tools, ... labels Jan 13, 2025
@jerbly
Contributor Author

jerbly commented Jan 14, 2025

@lmolkova - You are right, of course. It turns out the non-compliant examples I found for http.request.header.KEY were in telemetry generated by a common library within our company. 😊 I should have dug deeper before reaching that conclusion. I will get our own house in order!

However, there are examples of mixed types in the semconv examples today (non-stable). Also, in your example, http.request.header.content-length=["42"], it seems a shame that 42 is not an int. If it were a numeric type, downstream tooling could report on content lengths over or under a threshold, or compute summary statistics over a time period.

Anyway, as you say, the actionable part of this issue is the examples definitions. Does the key-value list look suitable?

@lmolkova lmolkova moved this to Clean up YAML schema in Semantic Conventions Tooling Jan 15, 2025
@jerbly jerbly changed the title Proposal: Template types - allow mixed types and define type-safe examples Template types: define type-safe examples Jan 24, 2025
@jerbly
Contributor Author

jerbly commented Jan 24, 2025

Changed the title of this issue to better represent the task following the discussion.

Projects
Status: Clean up YAML schema

2 participants