[release-v2.7] [DOC] Fix technical debt and improve reading scores (#4594)

* Fix technical debt and improve reading scores

* Add linter skip rules for rel notes

* Update the parquet schema

(cherry picked from commit 033c536)

Co-authored-by: Kim Nylander <[email protected]>
github-actions[bot] and knylander-grafana authored Jan 22, 2025
1 parent d5f2073 commit 19c36d1
Showing 36 changed files with 409 additions and 280 deletions.
12 changes: 6 additions & 6 deletions docs/sources/tempo/api_docs/_index.md
@@ -595,7 +595,7 @@ If provided, the tag values returned by the API are filtered to only return valu
Queries can be incomplete: for example, `{ resource.cluster = }`.
Tempo extracts only the valid matchers and builds a valid query.
If an input is invalid, Tempo doesn't provide an error. Instead,
If an input is invalid, Tempo doesn't provide an error. Instead,
you'll see the whole list if parsing the input fails. This behavior helps with backwards compatibility.
Only queries with a single selector `{}` and AND `&&` operators are supported.
@@ -670,7 +670,7 @@ For example the following request computes the total number of failed spans over
{{< admonition type="note" >}}
Actual API parameters must be url-encoded. This example is left unencoded for readability.
{{% /admonition %}}
{{< /admonition >}}
```
GET /api/metrics/query?q={status=error}|count_over_time()by(resource.service.name)
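# For illustration only: the same request sent with curl, which URL-encodes the
# query parameter as the note above requires. The localhost host and default
# port 3200 are assumptions, not part of the original example.
curl -G http://localhost:3200/api/metrics/query \
  --data-urlencode 'q={status=error}|count_over_time()by(resource.service.name)'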
@@ -686,7 +686,7 @@ Returns status code 200 and body `echo` when the query frontend is up and ready
{{< admonition type="note" >}}
Meant to be used in a Query Visualization UI like Grafana to test that the Tempo data source is working.
{{% /admonition %}}
{{< /admonition >}}
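
As a quick sketch, assuming the `/api/echo` path and Tempo's default HTTP port `3200`, you can check readiness from the command line:

```
# Expect HTTP 200 with the body "echo" when the query frontend is ready.
curl -i http://localhost:3200/api/echo
```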
### Overrides API
@@ -717,13 +717,13 @@ ingester service.
{{< admonition type="note" >}}
This is usually used at the time of scaling down a cluster.
{{% /admonition %}}
{{< /admonition >}}
### Usage metrics
{{< admonition type="note" >}}
This endpoint is only available when one or more usage trackers are enabled in [the distributor]({{< relref "../configuration#distributor" >}}).
{{% /admonition %}}
{{< /admonition >}}
```
GET /usage_metrics
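# For illustration (localhost and the default port 3200 are assumptions):
# fetch the per-tenant usage counters exposed by the enabled usage trackers.
curl http://localhost:3200/usage_metrics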
@@ -747,7 +747,7 @@ tempo_usage_tracker_bytes_received_total{service="service-A",tenant="single-tena
{{< admonition type="note" >}}
This endpoint is only available when Tempo is configured with [the global override strategy]({{< relref "../configuration#overrides" >}}).
{{% /admonition %}}
{{< /admonition >}}
```
GET /distributor/ring
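# For illustration (localhost and the default port 3200 are assumptions):
# fetch the distributor ring status page, which can also be viewed in a browser.
curl http://localhost:3200/distributor/ring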
4 changes: 2 additions & 2 deletions docs/sources/tempo/api_docs/metrics-summary.md
@@ -12,7 +12,7 @@ weight: 600
{{< admonition type="warning" >}}
The metrics summary API is deprecated as of Tempo 2.7. Features powered by the metrics summary API, like the [Aggregate by table](https://grafana.com/docs/grafana/<GRAFANA_VERSION>/datasources/tempo/query-editor/traceql-search/#optional-use-aggregate-by), are also deprecated in Grafana Cloud and Grafana 11.3 and later.
It will be removed in a future release.
{{% /admonition %}}
{{< /admonition >}}

This document explains how to use the metrics summary API in Tempo.
This API returns RED metrics (span count, erroring span count, and latency information) for `kind=server` spans sent to Tempo in the last hour, grouped by a user-specified attribute.
@@ -122,7 +122,7 @@ The response is returned as JSON following [standard protobuf->JSON mapping rule

{{< admonition type="note" >}}
The `uint64` fields cannot be fully expressed by JSON numeric values so the fields are serialized as strings.
{{% /admonition %}}
{{< /admonition >}}

Example:

2 changes: 1 addition & 1 deletion docs/sources/tempo/configuration/network/ipv6.md
@@ -12,7 +12,7 @@ Tempo can be configured to communicate between the components using Internet Pro

{{< admonition type="note" >}}
The underlying infrastructure must support this address family. This configuration may be used in a single-stack IPv6 environment, or in a dual-stack environment where both IPv6 and IPv4 are present. In a dual-stack scenario, only one address family may be configured at a time, and all components must be configured for that address family.
{{% /admonition %}}
{{< /admonition >}}

## Protocol configuration

2 changes: 1 addition & 1 deletion docs/sources/tempo/configuration/network/tls.md
@@ -12,7 +12,7 @@ Tempo can be configured to communicate between the components using Transport La

{{< admonition type="note" >}}
The ciphers and TLS version shown here are for example purposes only. We are not recommending specific ciphers or TLS versions for production environments.
{{% /admonition %}}
{{< /admonition >}}

## Server configuration

2 changes: 1 addition & 1 deletion docs/sources/tempo/configuration/use-trace-data.md
@@ -16,7 +16,7 @@ If you are using Grafana on-prem, you need to [set up the Tempo data source](/do

{{< admonition type="tip" >}}
If you want to explore tracing data in Grafana, try the [Intro to Metrics, Logs, Traces, and Profiling example]({{< relref "../getting-started/docker-example" >}}).
{{% /admonition %}}
{{< /admonition >}}

This video explains how to add data sources, including Loki, Tempo, and Mimir, to Grafana and Grafana Cloud. Tempo data source setup starts at 4:58 in the video.

6 changes: 3 additions & 3 deletions docs/sources/tempo/getting-started/_index.md
@@ -32,7 +32,7 @@ create and offload spans.

{{< admonition type="note" >}}
To learn more about instrumentation, read the [Instrument for tracing]({{< relref "./instrumentation" >}}) documentation, which explains how to instrument your favorite language for distributed tracing.
{{% /admonition %}}
{{< /admonition >}}

## Pipeline (Grafana Alloy)

@@ -54,7 +54,7 @@ refer to [Grafana Alloy configuration for tracing]({{< relref "../configuration/
The [OpenTelemetry Collector](https://github.com/open-telemetry/opentelemetry-collector) / [Jaeger Agent](https://www.jaegertracing.io/docs/latest/deployment/) can also be used at the agent layer.
Refer to [this blog post](/blog/2021/04/13/how-to-send-traces-to-grafana-clouds-tempo-service-with-opentelemetry-collector/)
to see how the OpenTelemetry Collector can be used with Tempo.
{{% /admonition %}}
{{< /admonition >}}

## Backend (Tempo)

@@ -72,7 +72,7 @@ Tempo offers different deployment options, depending upon your needs. Refer to t
{{< admonition type="note" >}}
Grafana Alloy is already set up to use Tempo.
Refer to [Grafana Alloy configuration for tracing](https://grafana.com/docs/tempo/<TEMPO_VERSION>/configuration/grafana-alloy).
{{% /admonition %}}
{{< /admonition >}}

## Visualization (Grafana)

2 changes: 1 addition & 1 deletion docs/sources/tempo/getting-started/instrumentation.md
@@ -60,7 +60,7 @@ information from a client application with minimal manual instrumentation of the

{{< admonition type="note" >}}
Jaeger client libraries have been deprecated. For more information, refer to the [Deprecating Jaeger clients article](https://www.jaegertracing.io/docs/1.50/client-libraries/#deprecating-jaeger-clients). Jaeger now recommends using OpenTelemetry SDKs.
{{% /admonition %}}
{{< /admonition >}}

- [Jaeger Language Specific Instrumentation](https://www.jaegertracing.io/docs/latest/client-libraries/)

2 changes: 1 addition & 1 deletion docs/sources/tempo/getting-started/metrics-from-traces.md
@@ -26,7 +26,7 @@ Span metrics are of particular interest if your system is not monitored with met

{{< admonition type="note" >}}
Metrics generation is disabled by default. Contact Grafana Support to enable metrics generation in your organization.
{{% /admonition %}}
{{< /admonition >}}

After the metrics-generator is enabled in your organization, refer to [Metrics-generator configuration]({{< relref "../configuration" >}}) for information about metrics-generator options.

2 changes: 1 addition & 1 deletion docs/sources/tempo/getting-started/tempo-in-grafana.md
@@ -70,7 +70,7 @@ The JSON data can be downloaded via the Tempo API or the [Inspector panel](/docs

{{< admonition type="note" >}}
To perform this action on Grafana 10.1 or later, select a Tempo data source, select **Explore** from the main menu, and then select **Import trace**.
{{% /admonition %}}
{{< /admonition >}}

## Link tracing data with profiles

50 changes: 29 additions & 21 deletions docs/sources/tempo/introduction/_index.md
@@ -14,55 +14,63 @@ weight: 120
# Introduction

A trace represents the whole journey of a request or an action as it moves through all the nodes of a distributed system, especially containerized applications or microservices architectures.
This makes them the ideal observability signal for discovering bottlenecks and interconnection issues.
Traces are the ideal observability signal for discovering bottlenecks and interconnection issues.

Traces are composed of one or more spans.
A span is a unit of work within a trace that has a start time relative to the beginning of the trace, a duration and an operation name for the unit of work.
It usually has a reference to a parent span (unless it's the first span, the root span, in a trace).
A span is a unit of work within a trace that has a start time relative to the beginning of the trace, a duration, and an operation name for the unit of work.
It usually has a reference to a parent span, unless it's the first, or root, span in a trace.
It frequently includes key/value attributes that are relevant to the span itself, for example the HTTP method used in the request, as well as other metadata such as the service name, sub-span events, or links to other spans.

By definition, traces are never complete. You can always push a new batch of spans, even if days have passed since the last one.
By definition, traces are never complete.
You can always push another batch of spans, even if days have passed since the last one.
When receiving a query requesting a stored trace, tracing backends like Tempo find all the spans for that specific trace and collate them into a returned result.
For that reason, issues can arise on retrieval of the trace data if traces are extremely large.
Retrieving trace data can have issues if traces are extremely large.

<!-- Explanation of traces -->
{{< youtube id="ZirbR0ZJIOs" >}}

## Example of traces

Firstly, a user on your website enters their email address into a form to sign up for your mailing list. They click **Enter**. This initial transaction has a trace ID that's subsequently associated with every interaction in the chain of processes within the system.
Firstly, a user on your website enters their email address into a form to sign up for your mailing list.
They click **Enter**. This initial transaction has a trace ID that's subsequently associated with every interaction in the chain of processes within the system.

Next, the user's email address is data that flows through your system.
In a cloud computing world, it's possible that clicking that one button triggers many downstream processes on various microservices operating across many different nodes in your compute infrastructure.
In a cloud computing world, it's possible that clicking that one button triggers many downstream processes on various microservices operating across many different nodes in your compute infrastructure.

As a result, the email address might be sent to a microservice responsible for verification. If the email passes this check, it is then stored in a database.
As a result, the email address goes to a microservice responsible for verification. If the email passes this check, then the database stores the address.

Along the way, an anonymization microservice strips personally identifying data from the address and adds additional metadata before sending it along to a marketing qualifying microservice which determines whether the request was sent from a targeted part of the internet.
Along the way, an anonymizing microservice strips personally identifying data from the address and adds additional metadata before sending it along to a marketing qualifying microservice.
This microservice determines whether the request came from a targeted part of the internet.

Services respond and data flows back from each, sometimes triggering new events across the system. Along the way, logs are written to the nodes on which those services run with a time stamp showing when the info passed through.
Services respond and data flows back from each, sometimes triggering additional events across the system.
Along the way, the nodes on which those services run write logs, with a time stamp showing when the info passed through.

Finally, the request and response activity ends. No other spans are added to that TraceID.
Finally, the request and response activity end.
No other spans append to that trace ID.

## Traces and trace IDs

Setting up tracing adds an identifier, or trace ID, to all of these events.
The trace ID is generated when the request is initiated and that same trace ID is applied to every single span as the request and response generate activity across the system.
The trace ID generates when the request initiates.
That same trace ID applies to every span as the request and response generate activity across the system.

That trace ID enables one to trace, or follow, a request as it flows from node to node, service to microservice to lambda function to wherever it goes in your chaotic, cloud computing system and back again.
The trace ID lets you trace, or follow, a request as it flows from node to node, service to microservice to lambda function to wherever it goes in your chaotic, cloud computing system and back again.
This is recorded and displayed as spans.

Here's an example showing two pages in Grafana Cloud. The first, on the left (1), shows a query using the **Explore** feature.
In the query results you can see a **traceID** field that was added to an application. That field contains a **Tempo** trace ID.
The second page, on the right (2), uses the same Explore feature to perform a Tempo search using that **trace ID**.
Here's an example showing two pages in Grafana Cloud.
The first, numbered 1, shows a query using the **Explore** feature.
In the query results, you can see a **TraceID** field that was added to an application.
That field contains a **Tempo** trace ID.
The second page, numbered 2, uses the same **Explore** feature to perform a Tempo search using that **TraceID**.
It then shows a set of spans as horizontal bars, each bar denoting a different part of the system.

![Traces example with query results and spans](/static/img/docs/tempo/screenshot-trace-explore-spans-g10.png)

## What are traces used for?

Traces can help you find bottlenecks.
A trace can be visualized to give a graphic representation of how long it takes for each step in the data flow pathway to complete.
It can show where new requests are initiated and end, and how your system responds.
Applications like Grafana can visualize traces to give a graphic representation of how long it takes for each step in the data flow pathway to complete.
It can show where additional requests initiate and end, and how your system responds.
This data helps you locate problem areas, often in places you never would have anticipated or found without this ability to trace the request flow.

<!-- What traces provide that logs and metrics don't -->
@@ -72,6 +72,6 @@ This data helps you locate problem areas, often in places you never would have a

For more information about traces, refer to:

* [Traces and telemetry]({{< relref "./telemetry" >}})
* [User journeys: How tracing can help you]({{< relref "./solutions-with-traces" >}})
* [Glossary]({{< relref "./glossary" >}})
* [Traces and telemetry](./telemetry)
* [User journeys: How tracing can help you](./solutions-with-traces)
* [Glossary](./glossary)
13 changes: 6 additions & 7 deletions docs/sources/tempo/introduction/solutions-with-traces/_index.md
@@ -11,13 +11,12 @@ weight: 300

# Use traces to find solutions

Tracing is best used for analyzing the performance of your system, identifying bottlenecks, monitoring latency, and providing a complete picture of how requests are processed.

* Decrease MTTR/MTTI: Tracing helps reduce Mean Time To Repair (MTTR) and Mean Time To Identify (MTTI) by pinpointing exactly where errors or latency are occurring within a transaction across multiple services.
* Optimization of bottlenecks and long-running code: By visualizing the path and duration of requests, tracing can help identify bottleneck operations and long-running pieces of code that could benefit from optimization.
* Metrics generation and RED signals: Tracing can help generate useful metrics related to Request rate, Error rate, and Duration of requests (RED). You can set alerts against these high-level signals to detect problems when they arise.
* Seamless telemetry correlation: Using tracing in conjunction with logs and metrics can help give you a comprehensive view of events over time during an active incident or postmorterm analysis by showing relationships between services and dependencies.
* Monitor compliance with policies: Business policy adherence ensures that services are correctly isolated using generated metrics and generated service graphs.
Tracing is best used for analyzing the performance of your system, identifying bottlenecks, monitoring latency, and providing a complete picture of request processing.

* Decrease mean time to repair and mean time to identify an issue by pinpointing exactly where errors or latency are occurring within a transaction across multiple services.
* Optimize bottlenecks and long-running code by visualizing the path and duration of requests. Tracing can help identify bottleneck operations and long-running pieces of code that could benefit from optimization.
* Detect issues with generated metrics. Tracing generates metrics related to request rate, error rate, and duration of requests. You can set alerts against these high-level signals to detect problems.
* Seamless telemetry correlation. Use tracing in conjunction with logs and metrics for a comprehensive view of events over time, during an active incident, or for root-cause analysis. Tracing shows relationships between services and dependencies.
* Monitor compliance with policies. Business policy adherence ensures that services are correctly isolated using generated metrics and generated service graphs.

Each use case provides real-world examples, including the background of the use case and how tracing highlighted and helped resolve any issues.