Enhance semantic conventions for HTTP #263

Merged: 38 commits, Oct 28, 2019
Changes from 9 commits

Commits (38):
dee825c
Update HTTP conventions.
Oberon00 Sep 27, 2019
6d4ed7f
Improve HTTP, fix references to peer.*.
Oberon00 Sep 27, 2019
b894d0b
Wording.
Oberon00 Sep 27, 2019
5621102
typo a/an http
Oberon00 Sep 27, 2019
417f488
host.name/host.port
Oberon00 Sep 27, 2019
b6e79ba
Clarify server_name.
Oberon00 Sep 27, 2019
a452423
Typo, missing 'has'.
Oberon00 Sep 27, 2019
ab163c9
Typo nginx link.
Oberon00 Sep 27, 2019
6be7dc1
Wording.
Oberon00 Sep 27, 2019
af08156
Typo in span name convention.
Oberon00 Sep 28, 2019
d362e39
Wording for common HTTP intro.
Oberon00 Sep 30, 2019
f4d69c2
Make http.flavor non-required.
Oberon00 Sep 30, 2019
673635c
Clarify client's http.url.
Oberon00 Sep 30, 2019
471aca3
HTTP server span name: reference `http.app_root`.
Oberon00 Sep 30, 2019
62361d2
Split "Definitions" from conventions, clarify app_root.
Oberon00 Sep 30, 2019
6ab5dc9
Qualify order of http server attr preferences.
Oberon00 Sep 30, 2019
1f796f3
Address review comments.
Oberon00 Oct 2, 2019
a9341a9
Typo.
Oberon00 Oct 2, 2019
62a326c
Make http.status_code conditionally required.
Oberon00 Oct 4, 2019
868eaf7
Move http.host,target,scheme; clarify empty host.
Oberon00 Oct 4, 2019
474ebaf
Fix misplaced paragraph.
Oberon00 Oct 4, 2019
8699757
Fix client host/port requirement.
Oberon00 Oct 4, 2019
fd56d2c
Fix HTTP status code OC incompat annotations.
Oberon00 Oct 7, 2019
7355684
Merge branch 'master' into httpconv
Oberon00 Oct 7, 2019
e8d4b81
Merge branch 'master' into httpconv
SergeyKanzhelev Oct 15, 2019
fcdaead
Fix incomplete sentence.
Oberon00 Oct 16, 2019
b51463f
Markdown syntax.
Oberon00 Oct 16, 2019
119509e
Update HTTP example (remove URL, add client).
Oberon00 Oct 21, 2019
5f5c4b6
Merge branch 'master' into httpconv
Oberon00 Oct 23, 2019
a432832
Fix markdownlint.
Oberon00 Oct 23, 2019
291ba38
Typo.
Oberon00 Oct 23, 2019
9364357
Remove http.app Span attribute.
Oberon00 Oct 24, 2019
280bd01
Fix lint.
Oberon00 Oct 24, 2019
c59d280
Merge branch 'master' into httpconv
SergeyKanzhelev Oct 26, 2019
8a25ef4
Add note about n:1 server_name + app_root => app.
Oberon00 Oct 28, 2019
3f055cb
Typo.
Oberon00 Oct 28, 2019
2e5c5eb
Merge branch 'master' into httpconv
SergeyKanzhelev Oct 28, 2019
a8685ca
Remove http.app_root. (#4)
Oberon00 Oct 28, 2019
106 changes: 80 additions & 26 deletions specification/data-semantic-conventions.md
@@ -14,49 +14,103 @@ This way, the operator will not need to learn specifics of a language and
telemetry collected from multi-language micro-service can still be easily
correlated and cross-analyzed.

## HTTP client

This span type represents an outbound HTTP request.

For a HTTP client span, `SpanKind` MUST be `Client`.
## HTTP

Given an [RFC 3986](https://www.ietf.org/rfc/rfc3986.txt) compliant URI of the form
`scheme:[//authority]path[?query][#fragment]`, the span name of the span SHOULD
be set to the URI path value.
These span types represent HTTP requests. They can be used for http and https
schemes and various HTTP versions like 1.1, 2 and SPDY.

If a framework can identify a value that represents the identity of the request
and has a lower cardinality than the URI path, this value MUST be used for the span name instead.
Given an [RFC 3986](https://tools.ietf.org/html/rfc3986) compliant URI of the form
`scheme:[//host[:port]]path[?query][#fragment]`, the span name of the span SHOULD
be set to the URI path value, unless another value that represents the identity
of the request and has a lower cardinality can be identified.
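
As a minimal sketch of the fallback naming rule above, the URI path can be extracted with Python's standard `urllib.parse.urlsplit`. The helper name `default_http_span_name` and the normalization of an empty path to `/` are assumptions made for illustration; neither is part of the conventions.

```python
from urllib.parse import urlsplit


def default_http_span_name(url: str) -> str:
    """Return the URI path, used as the span name when no
    lower-cardinality identifier (such as a route) is known."""
    path = urlsplit(url).path
    # Assumption of this sketch: an empty path (e.g. "https://example.com")
    # is reported as "/" rather than as an empty span name.
    return path or "/"


span_name = default_http_span_name("https://example.com:779/path/12314/?q=ddds#123")
```

Query string and fragment are deliberately dropped, since they would raise the cardinality of span names.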

| Attribute name | Notes and examples | Required? |
| :------------- | :----------------------------------------------------------- | --------- |
| `component` | Denotes the type of the span and needs to be `"http"`. | Yes |
| `http.method` | HTTP request method. E.g. `"GET"`. | Yes |
| `http.url` | HTTP URL of this request, represented as `scheme://host:port/path?query#fragment` E.g. `"https://example.com:779/path/12314/?q=ddds#123"`. | Yes |
| `http.status_code` | [HTTP response status code](https://tools.ietf.org/html/rfc7231). E.g. `200` (integer) | No |
| `http.status_text` | [HTTP reason phrase](https://www.ietf.org/rfc/rfc2616.txt). E.g. `"OK"` | No |
| `http.url` | Full HTTP request URL in the form `scheme://host[:port]/path?query[#fragment]`. Usually the fragment is not transmitted over HTTP, but if it is known, it should be included nevertheless. | Defined later. |
| `http.status_code` | [HTTP response status code][]. E.g. `200` (integer) | No |
| `http.status_text` | [HTTP reason phrase][]. E.g. `"OK"` | No |
| `http.flavor` | Kind of HTTP protocol used: `"1.0"`, `"1.1"`, `"2"`, `"SPDY"` or `"QUIC"`. | If not TCP-based (`QUIC`). |

## HTTP server
It is recommended to also use the `peer.*` attributes, especially `peer.ip*`.

This span type represents an inbound HTTP request.
[HTTP response status code]: https://tools.ietf.org/html/rfc7231#section-6
[HTTP reason phrase]: https://tools.ietf.org/html/rfc7230#section-3.1.2
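
The common attributes in the table above could be collected roughly as follows. This is only a sketch: `common_http_attributes` is a hypothetical helper that returns a plain dictionary, and it does not use any real tracing API. Rejecting unlisted `http.flavor` values is likewise an assumption of the sketch.

```python
from typing import Dict, Optional, Union

# Values listed for `http.flavor` in the table above.
KNOWN_FLAVORS = ("1.0", "1.1", "2", "SPDY", "QUIC")


def common_http_attributes(
    method: str,
    url: Optional[str] = None,
    status_code: Optional[int] = None,
    status_text: Optional[str] = None,
    flavor: Optional[str] = None,
) -> Dict[str, Union[str, int]]:
    """Build the common HTTP span attributes as a dict (hypothetical helper)."""
    if flavor is not None and flavor not in KNOWN_FLAVORS:
        raise ValueError(f"unknown http.flavor: {flavor!r}")

    # `component` and `http.method` are the two rows marked "Yes".
    attributes: Dict[str, Union[str, int]] = {
        "component": "http",
        "http.method": method,
    }
    optional = {
        "http.url": url,
        "http.status_code": status_code,  # integer, e.g. 200
        "http.status_text": status_text,
        "http.flavor": flavor,
    }
    # Only attributes whose values are actually known are set.
    attributes.update({k: v for k, v in optional.items() if v is not None})
    return attributes
```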

### HTTP client

This span type represents an outbound HTTP request.

For a HTTP server span, `SpanKind` MUST be `Server`.
For an HTTP client span, `SpanKind` MUST be `Client`.

Given an inbound request for a route (e.g. `"/users/:userID?"`), the `name`
attribute of the span SHOULD be set to this route.
`http.url` is required and represents the HTTP URL used to (initially) make this request.

If the route can not be determined, the `name` attribute MUST be set to the [RFC 3986 URI](https://www.ietf.org/rfc/rfc3986.txt) path value.
### HTTP server

If a framework can identify a value that represents the identity of the request
and has a lower cardinality than the URI path or route, this value MUST be used for the span name instead.
This span type represents an inbound HTTP request.

For an HTTP server span, `SpanKind` MUST be `Server`.

Given an inbound request for a route (e.g. `"/users/:userID?"`), the `name` attribute of the span SHOULD be set to this route. If the route does not include the application root path, the application root path SHOULD be prepended to the span name.

If the route cannot be determined, the `name` attribute MUST be set as defined in the general semantic conventions for HTTP.

| Attribute name | Notes and examples | Required? |
| :------------- | :----------------------------------------------------------- | --------- |
| `component` | Denotes the type of the span and needs to be `"http"`. | Yes |
| `http.method` | HTTP request method. E.g. `"GET"`. | Yes |
| `http.url` | HTTP URL of this request, represented as `scheme://host:port/path?query#fragment` E.g. `"https://example.com:779/path/12314/?q=ddds#123"`. | Yes |
| `http.route` | The matched route. E.g. `"/users/:userID?"`. | No |
| `http.status_code` | [HTTP response status code](https://tools.ietf.org/html/rfc7231). E.g. `200` (integer) | No |
| `http.status_text` | [HTTP reason phrase](https://www.ietf.org/rfc/rfc2616.txt). E.g. `"OK"` | No |
| `http.target` | The full request target as passed in an [HTTP request line][] or equivalent, e.g. `"/path/12314/?q=ddds#123"`. | [1] |
| `http.host` | The value of the [HTTP host header][]. Note that this might be empty or not present. | [1] |
| `http.scheme` | The URI scheme identifying the used protocol: `"http"` or `"https"` | [1] |
| `http.server_name` | The (primary) server name (usually not including a port). This should be obtained via configuration, e.g. the Apache [`ServerName`][ap-sn] or NGINX [`server_name`][nx-sn] directive. If no such configuration can be obtained, this attribute MUST NOT be set (`host.name` should be used instead). | [1] |
| `host.name` | Analogous to `peer.hostname` but for the host instead of the peer. | [1] |
| `host.port` | Local port. E.g., `80` (integer). Analogous to `peer.port`. | [1] |
| `http.route` | The matched route (path template). E.g. `"/users/:userID?"`. | No |
| `http.app` | An identifier for the whole HTTP application. E.g. Flask app name, `spring.application.name`, etc. | No |
Member

I'd suggest to keep it in resource API, not in individual requests

Member Author

As I understand resources, they are process-wide. However, a process can host multiple apps.

Member Author

I created #274 "Allow resources as span attributes"

Member

I am confused by these multiple "applications". What do you call an "application"?

Member Author

Maybe the "Definitions" section above is helpful. But you may also consider the examples. The details depend on the technology. To add another example: In Java EE, an app is the entity described by the "web.xml" file and is defined as follows:

> A Web application is a collection of servlets, HTML pages, classes, and other resources that make up a complete application on a Web server.

(quote from SRV.9 of https://jcp.org/aboutJava/communityprocess/maintenance/jsr053/index2.html)

Member Author

If we decided that only one http app per process is supported, then the app would probably be already covered by the proposed service.name resource (https://github.com/open-telemetry/opentelemetry-specification/pull/303/files). But note that at the time I wrote this PR, there were no semantic conventions on resources at all, and I still think that resources are very vaguely specified, so it's hard for me to say whether something could be a resource or not, when I don't really know what a resource actually is.

Member

I believe that Resource is not a process-wide concept. It's a per-Tracer concept. Here it says:

> When used with distributed tracing, a resource can be associated with the TracerSdk. When associated with TracerSdk, all Spans produced by the Tracer, that is implemented by this TracerSdk, will automatically be associated with this Resource.

So for the case of multiple apps inside the process, each app may initialize its own Tracer with the app-specific properties. The same properties may be interesting for the child spans, so ideally this Tracer needs to be shared with all libraries used in this app.

Contributor

(Don't forget that the Meter API will need equal treatment as far as resources go.)

Member Author

Thanks, I will look into that! This also seems like something that needs to be reworded with named tracers (probably the Resources would be per TracerFactory).

Member Author

I'll remove http.app for now; I created #335 to track that.

| `http.app_root` | The path prefix of the URL that identifies this `http.app`. Also known as "context root". If multiple roots exist, the one that was matched for this request should be used. | No |
| `http.client_ip` | The IP address of the original client behind all proxies, if known (e.g. from [X-Forwarded-For][]). | No |

[HTTP request line]: https://tools.ietf.org/html/rfc7230#section-3.1.1
[HTTP host header]: https://tools.ietf.org/html/rfc7230#section-5.4
[X-Forwarded-For]: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/X-Forwarded-For
[ap-sn]: https://httpd.apache.org/docs/2.4/mod/core.html#servername
[nx-sn]: http://nginx.org/en/docs/http/ngx_http_core_module.html#server_name
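
For `http.client_ip`, one possible way to read the original client address from an `X-Forwarded-For` header is sketched below. The function name is hypothetical. The sketch assumes the conventional `client, proxy1, proxy2` layout, in which the leftmost entry is the original client; the header is supplied by the client or intermediaries, so the value should only be trusted if the proxies in front of the server are.

```python
from typing import Optional


def client_ip_from_forwarded_for(header_value: Optional[str]) -> Optional[str]:
    """Return the leftmost X-Forwarded-For entry, or None if there is none."""
    if not header_value:
        return None
    # Entries are comma-separated; the first one is the original client.
    first = header_value.split(",")[0].strip()
    return first or None
```

When the function returns `None`, the attribute would simply not be set, matching the "if known" wording in the table.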

**[1]**: `http.url` is usually not readily available on the server side but would have to be assembled in a cumbersome and sometimes lossy process from other information (see e.g. <https://github.com/open-telemetry/opentelemetry-python/pull/148>).
It is thus preferred to supply the raw data that *is* available.
Namely, one of the following sets is required (in order of preference, all strings must be non-empty):

* `http.scheme`, `http.host`, `http.target`
* `http.scheme`, `http.server_name`, `host.port`, `http.target`
* `http.scheme`, `host.name`, `host.port`, `http.target`
* `http.url`

Of course, more than the required attributes can be supplied, but this is recommended only if they cannot be inferred from the sent ones.
For example, `http.server_name` has shown great value in practice, as bogus HTTP Host headers occur often in the wild.
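
The order of preference among the attribute sets in note [1] can be sketched as a simple lookup. `PREFERRED_SETS` and `first_satisfied_set` are hypothetical names chosen for this illustration, and attributes are modeled as a plain dictionary. Treating empty strings as missing follows the "all strings must be non-empty" wording.

```python
from typing import Mapping, Optional, Tuple, Union

# The four alternatives from note [1], in order of preference.
PREFERRED_SETS: Tuple[Tuple[str, ...], ...] = (
    ("http.scheme", "http.host", "http.target"),
    ("http.scheme", "http.server_name", "host.port", "http.target"),
    ("http.scheme", "host.name", "host.port", "http.target"),
    ("http.url",),
)


def _present(attributes: Mapping[str, Union[str, int, None]], key: str) -> bool:
    """An attribute counts as present if it is set and, for strings, non-empty."""
    value = attributes.get(key)
    if isinstance(value, str):
        return value != ""
    return value is not None


def first_satisfied_set(
    attributes: Mapping[str, Union[str, int, None]]
) -> Optional[Tuple[str, ...]]:
    """Return the most preferred attribute set that is fully present, if any."""
    for keys in PREFERRED_SETS:
        if all(_present(attributes, key) for key in keys):
            return keys
    return None
```

A result of `None` means that the span does not meet the requirement stated in note [1].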

It is strongly recommended to set at least one of `http.app` or `http.server_name` to allow associating requests with some logical app or server entity.

As an example, if a browser request for `https://example.com:8080/webshop/articles/4?s=1` is invoked, we may have:

Span name: `/webshop/articles/:article_id` (`app_root` + `route`).

| Attribute name | Value |
| :----------------- | :-------------------------------------------------------------------------------- |
| `component` | `"http"` |
| `http.method` | `"GET"` |
| `http.url` | `"https://example.com:8080/webshop/articles/4?s=1"` (or not set) |
| `http.target` | `"/webshop/articles/4?s=1"` |
Member

So this example is not what a span will actually look like, since you have either url or target, correct? Maybe it's worth splitting it into two examples to avoid confusion.

Member Author

Good point, I'll do that. Raises the priority of #311 though 😄

Member Author

I only included a remark about the http.url value below the attribute table and removed the attribute from the table. I also added a client-side example.

| `http.host` | `"example.com:8080"` |
| `http.server_name` | `"example.com"` |
| `host.port` | `8080` |
| `http.scheme` | `"https"` |
| `http.route` | `"/articles/:article_id"` (note that the `app_root` part is missing in this case) |
| `http.status_code` | `200` |
| `http.status_text` | `"OK"` |
| `http.app` | E.g., `"My cool WebShop"` or `"com.example.webshop"` |
| `http.app_root` | `"/webshop"` |
| `http.client_ip` | `"192.0.2.4"` |
| `peer.ip4` | `"192.0.2.5"` (the client goes through a proxy) |
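
To check how the values in the example table relate to the request URL, the URL can be split with Python's standard `urllib.parse.urlsplit`. This is an illustrative sketch only: `server_attributes_from_url` is a hypothetical helper, and a real server would normally read these values from the request line, the Host header, and the listening socket rather than from an assembled URL. The route and application root are hard-coded from the example because they cannot be derived from the URL.

```python
from urllib.parse import urlsplit


def server_attributes_from_url(url: str) -> dict:
    """Split a request URL into raw server-side attributes (hypothetical helper)."""
    parts = urlsplit(url)
    # The request target is the path plus the query string, if any.
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return {
        "http.scheme": parts.scheme,
        # For a URL without userinfo, netloc matches the Host header value.
        "http.host": parts.netloc,
        "host.port": parts.port,  # integer, or None if the URL has no port
        "http.target": target,
    }


attributes = server_attributes_from_url(
    "https://example.com:8080/webshop/articles/4?s=1"
)

# Taken from the example above, not derived from the URL.
app_root = "/webshop"
route = "/articles/:article_id"
span_name = app_root + route
```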

## Databases client calls
