contri svs #4

Closed
wants to merge 133 commits into from
133 commits
d8f2e04
Add maximum spans per span set (#4383)
carles-grafana Nov 27, 2024
d411704
add email and name when running dependabot serverless vendoring (#4385)
ie-pham Nov 27, 2024
03de0c7
ci: Don't run check drone signature workflow in forks (#4390)
electron0zero Nov 27, 2024
9275d07
[DOC] Fix links for mounting OSS content to GET (#4388)
knylander-grafana Nov 27, 2024
e9ecc3e
Handle invalid TraceQL query filter in tag values v2 disk cache (#4392)
electron0zero Nov 27, 2024
f2f9fc7
Add query-frontend limit for max length of query expression (#4397)
electron0zero Nov 28, 2024
60780f7
Fix Javi in CODEOWNERS (#4399)
electron0zero Nov 28, 2024
b0a06e8
More exemplar fixes (#4404)
mdisibio Dec 3, 2024
b9321f4
Fix metrics queries with unscoped attributes (#4409)
mdisibio Dec 4, 2024
8522bb8
docs: Added explore Traces plugin to Tempo Quick start tutorial (#4377)
Jayclifford345 Dec 4, 2024
df6fe37
distributor: return trace id length when it is invalid (#4407)
carles-grafana Dec 5, 2024
6c9dc98
feat: limit tags and tag values search (#4320)
javiermolinar Dec 5, 2024
7bf4b85
[DOC] Add tail-based sampling doc (#4414)
knylander-grafana Dec 5, 2024
32de56e
[DOC] Add overrides info to tempo-distributed doc (#4415)
knylander-grafana Dec 5, 2024
e617335
Ingester: Rate limit max trace logs and drop too many live traces (#4…
joe-elliott Dec 5, 2024
a170b4a
Fix Grpc streaming handling of Authorization header (#4419)
mdisibio Dec 6, 2024
2fff84a
Disable gRPC compression (#4429)
carles-grafana Dec 9, 2024
1169973
Exit early on autocomplete noops (#4431)
mdisibio Dec 10, 2024
914e8e5
[DOC] Add doc updates for query frontend and limits per spanset (#4421)
knylander-grafana Dec 10, 2024
fe8bb26
[DOC] Updated examples for filtered tag values (#4425)
knylander-grafana Dec 10, 2024
5d9e8a0
Add script to generate manifest with default config (#4430)
carles-grafana Dec 11, 2024
7117de4
Add golangci-lint cache (#4433)
carles-grafana Dec 11, 2024
c75e9f8
Halt compaction if job is lost (#4420)
joe-elliott Dec 11, 2024
84d77d8
Fix typo in README.md (#4439)
deejay1 Dec 11, 2024
b54844c
improve target info performance (#4408)
ie-pham Dec 12, 2024
e51596b
Bump golang.org/x/crypto from 0.27.0 to 0.31.0 in /tools (#4440)
dependabot[bot] Dec 12, 2024
fcd33c7
Bump golang.org/x/crypto in /cmd/tempo-serverless/lambda (#4442)
dependabot[bot] Dec 12, 2024
3eb471f
Bump github.com/minio/minio-go/v7 from 7.0.80 to 7.0.81 (#4402)
dependabot[bot] Dec 12, 2024
3f20e5d
Bump github.com/spf13/viper from 1.18.2 to 1.19.0 (#4308)
dependabot[bot] Dec 12, 2024
a816e7d
Bump google.golang.org/protobuf from 1.34.2 to 1.35.2 (#4337)
dependabot[bot] Dec 12, 2024
4e8a96f
Init memberlist codecs: append to preconfigured codecs if any (#4445)
yvrhdn Dec 13, 2024
90b2ac5
[TraceQL] Improve performance of select() queries (#4438)
mdisibio Dec 13, 2024
ead4ca8
Update Github Actions to Ubuntu 24.04 (#4441)
carles-grafana Dec 16, 2024
a63bb43
Fix comment in manifest doc (#4458)
carles-grafana Dec 16, 2024
f00ed6a
Blocklist update fix (#4446)
mdisibio Dec 17, 2024
e50f5d9
Migrate docker-ci-tools workflow to Github Actions (#4454)
carles-grafana Dec 18, 2024
06487cc
drone: remove ci tools pipelines after migration to GHA (#4461)
carles-grafana Dec 18, 2024
e0e6e6c
update azurite (#4464)
javiermolinar Dec 18, 2024
d02611a
Update tools version to trigger Docker image build (#4462)
carles-grafana Dec 18, 2024
a2039e9
Update tools image in the Makefile (#4468)
carles-grafana Dec 18, 2024
6ffe62c
drone: remove serverless pipeline (#4470)
carles-grafana Dec 19, 2024
3b1e083
update golang.org/x/crypto (#4474)
javiermolinar Dec 19, 2024
fa9b7ad
fix(drone-sig): add missing `contents: read` permission (#4472)
zzehring Dec 19, 2024
0aecee5
break the loop if context is cancelled (#4476)
joe-elliott Dec 19, 2024
2b00b4e
Distributor shim: add test verifying receiver works (including metric…
yvrhdn Dec 19, 2024
e2c7920
Nomad job example (#4469)
bcirh Dec 20, 2024
dbd7505
Update all open-telemetry packages to 0.116.0 (#4466)
yvrhdn Dec 20, 2024
e1a5ee3
Bump github.com/googleapis/gax-go/v2 from 2.13.0 to 2.14.1 (#4489)
dependabot[bot] Dec 24, 2024
deece3c
Bump github.com/alecthomas/kong from 0.8.0 to 1.6.0 (#4449)
dependabot[bot] Dec 24, 2024
e16215c
Replace `cespare/xxhash` with `cespare/xxhash/v2` (#4485)
Juneezee Dec 24, 2024
685a4a6
Update `make docs` procedure (#4490)
github-actions[bot] Dec 27, 2024
82de08f
Update `make docs` procedure (#4498)
github-actions[bot] Dec 30, 2024
51778c9
Remove doc-validator workflow (#4499)
jdbaldry Dec 31, 2024
472a16a
Remove pool goroutines from all components that don't need it (#4484)
joe-elliott Jan 2, 2025
b6bd537
Add Tempo Nomad job example in Monolithic Mode (#4495)
bcirh Jan 2, 2025
7917a8e
Bump github.com/Azure/azure-sdk-for-go/sdk/storage/azblob from 1.2.1 …
dependabot[bot] Jan 2, 2025
17de1e2
Bump github.com/minio/minio-go/v7 from 7.0.80 to 7.0.82 (#4488)
dependabot[bot] Jan 2, 2025
1b56102
Bump anchore/sbom-action from 0.17.8 to 0.17.9 (#4450)
dependabot[bot] Jan 2, 2025
d382b58
Enforce max span attribute size (#4335)
ie-pham Jan 2, 2025
c34cf3a
Bump alpine base image to 3.21 (#4504)
mdisibio Jan 2, 2025
1f40321
Add docs for multitenancy support in the metrics-generator (#4481)
mapno Jan 3, 2025
9e8f582
v2.7.0-rc.0 (#4508)
joe-elliott Jan 3, 2025
48290cb
Bump github.com/jedib0t/go-pretty/v6 from 6.2.4 to 6.6.5 (#4512)
dependabot[bot] Jan 6, 2025
b08688e
Migrate tempo components docker workflow from Drone to GHA (#4501)
carles-grafana Jan 7, 2025
f0c416f
Fix docker workflow (#4518)
carles-grafana Jan 7, 2025
a94680c
[DOC] Add zone-aware ingesters doc (#4486)
knylander-grafana Jan 7, 2025
efc288d
Fix cd-to-dev-env job (#4519)
carles-grafana Jan 7, 2025
0ec6d08
Update architecture.md (#4475)
xogoodnow Jan 7, 2025
39e8cf8
[DOC] Update examples to TraceQL doc (#4471)
knylander-grafana Jan 7, 2025
2f3b304
Fix trailing space in workflow command (#4522)
carles-grafana Jan 8, 2025
7e9ca2b
Fix cd-to-dev-env job volumes (#4526)
carles-grafana Jan 8, 2025
71e8531
Add docker manifest creation in the workflow (#4527)
carles-grafana Jan 8, 2025
e3c2848
Migrate release pipeline from Drone to GHA (#4503)
carles-grafana Jan 9, 2025
0112d87
Fix release workflow conditional so it doesn't run on PR (#4536)
carles-grafana Jan 10, 2025
12102dd
Add make targets for multi-arch tempo docker image (#4535)
mdisibio Jan 10, 2025
e709f8a
[rhythm] Introduce block-builder and kafka ingest path (#4533)
mapno Jan 10, 2025
b6a86e2
[rhythm] Fix ID generator copy bug (#4540)
mapno Jan 13, 2025
f8728d1
Changelog cleanup 2.7.0 (#4542)
joe-elliott Jan 13, 2025
00129a2
changelog 2 (#4545)
joe-elliott Jan 13, 2025
8d2eb8e
]DOC] Tempo 2.7 release notes (#4537)
knylander-grafana Jan 13, 2025
424274a
Bugfix: Default step for gRPC streaming query range queries (#4546)
joe-elliott Jan 13, 2025
52388dd
Delete remaining Drone files to finish migration to GHA (#4552)
carles-grafana Jan 14, 2025
949ad1d
Update _index.md (#4553)
savar Jan 14, 2025
fad4ff7
Update blockbuilder to periodically flush wals and sort traces (#4550)
mdisibio Jan 15, 2025
8d9ab43
Tempo: remove internal error reason for discarded spans (#4554)
joe-elliott Jan 15, 2025
61beae6
[DOC] Fix typo in Upgrade doc (#4555)
knylander-grafana Jan 15, 2025
b7fbc75
[DOC] Update upgrade considerations for 2.7 (#4558)
knylander-grafana Jan 15, 2025
094a9fd
Update tempo operational dashboard for block builder and v2 traces ap…
mdisibio Jan 15, 2025
e20401c
Add doc for max_span_attr_byte and restructure troubleshoot doc (#4551)
knylander-grafana Jan 15, 2025
1e0169f
Bump github.com/pierrec/lz4/v4 from 4.1.21 to 4.1.22 (#4451)
dependabot[bot] Jan 15, 2025
84c6c0e
Bump github.com/parquet-go/parquet-go from 0.23.1-0.20241011155651-64…
dependabot[bot] Jan 15, 2025
7cda43b
Bump github.com/opentracing-contrib/go-grpc from 0.0.0-20210225150812…
dependabot[bot] Jan 15, 2025
0eae105
[Rhythm] Add concurrency to block-builder wal conversion and flushing…
mdisibio Jan 16, 2025
9f224e5
feat: update minio to version 7.0.83 (#4568)
javiermolinar Jan 16, 2025
bf361e1
Fix TraceQL results caching bug for floats ending in .0 (#4539)
carles-grafana Jan 16, 2025
f6519dd
Bump google.golang.org/protobuf from 1.35.2 to 1.36.3 (#4563)
dependabot[bot] Jan 16, 2025
882ffbd
Bump github.com/alicebob/miniredis/v2 from 2.21.0 to 2.34.0 (#4496)
dependabot[bot] Jan 16, 2025
1615340
remove serverless gomod update in dependabot job (#4570)
ie-pham Jan 16, 2025
7bdb61d
Bump google.golang.org/api from 0.211.0 to 0.217.0 (#4562)
dependabot[bot] Jan 16, 2025
c4b5e7d
Fix typo (#4574)
dsotirakis Jan 17, 2025
1f8d337
[Ingester] Create one goroutine per tenant to flush traces to disk (#…
joe-elliott Jan 17, 2025
89b9f7e
[DOC] Add blog link; update instrumentation scope doc (#4569)
knylander-grafana Jan 17, 2025
9acc16d
[Frontend] Two fixes for gRPC query range streaming (#4576)
joe-elliott Jan 17, 2025
c5323bf
Bump github.com/alecthomas/kong from 1.6.0 to 1.6.1 (#4583)
dependabot[bot] Jan 20, 2025
bdff7cc
[rhythm] Handle commit partial errors (#4591)
mapno Jan 21, 2025
14efba0
Use distroless base image for tempo (#4556)
carles-grafana Jan 21, 2025
033c536
[DOC] Fix technical debt and improve reading scores (#4592)
knylander-grafana Jan 22, 2025
e680d6e
[Rhythm] Move group partition lag metric to ingest package, export fr…
mdisibio Jan 22, 2025
91cf82f
Remove variable value lookup based upon non-existent file (#4595)
jdbaldry Jan 22, 2025
7a3b497
remove serverless tests (#4597)
ie-pham Jan 23, 2025
592c984
Bump github.com/minio/minio-go/v7 from 7.0.83 to 7.0.84 (#4586)
dependabot[bot] Jan 23, 2025
00b37df
Bump github.com/prometheus/common from 0.61.0 to 0.62.0 (#4582)
dependabot[bot] Jan 23, 2025
b980aa7
[Rhythm] Block builder performance improvement (#4596)
mdisibio Jan 23, 2025
12ac95d
Bump go.opentelemetry.io/otel from 1.33.0 to 1.34.0 (#4588)
dependabot[bot] Jan 23, 2025
887c66c
Bump github.com/Azure/azure-sdk-for-go/sdk/azcore from 1.16.0 to 1.17…
dependabot[bot] Jan 23, 2025
55c71fa
Ops: Fix envoy writes dash (#4604)
joe-elliott Jan 23, 2025
c73269a
Bump the opentelemetry-collector group across 1 directory with 19 upd…
dependabot[bot] Jan 23, 2025
e22376f
Bump grafana/shared-workflows from dockerhub-login-v1.0.0 to 1.0 (#4579)
dependabot[bot] Jan 23, 2025
e0949df
Bump github.com/jaegertracing/jaeger from 1.63.0 to 1.65.0 (#4564)
dependabot[bot] Jan 23, 2025
4b5e866
Bump go.opentelemetry.io/proto/otlp from 1.4.0 to 1.5.0 (#4584)
dependabot[bot] Jan 23, 2025
f76a321
Bump the opentelemetry-otel group across 1 directory with 5 updates (…
dependabot[bot] Jan 23, 2025
111df90
Bump the opentelemetry-contrib group across 1 directory with 10 updat…
dependabot[bot] Jan 24, 2025
51aca06
Remove tempo serverless (#4599)
electron0zero Jan 24, 2025
362ed5b
vParquet4 wal avoid recomputing dedicated column mappings for every t…
mdisibio Jan 24, 2025
fc89a14
[TraceQL] Fix to put all conditions following a select clause into th…
mdisibio Jan 24, 2025
58b06e8
Fix local make calls (#4609)
carles-grafana Jan 24, 2025
da9b5ff
[DOC] Restructure operations and manage docs (#4598)
knylander-grafana Jan 27, 2025
c40aa48
Bump anchore/sbom-action from 0.17.9 to 0.18.0 (#4613)
dependabot[bot] Jan 27, 2025
b7ace09
Bump actions/stale from 9.0.0 to 9.1.0 (#4614)
dependabot[bot] Jan 27, 2025
c5ec452
Bump github.com/Azure/azure-sdk-for-go/sdk/azidentity from 1.8.0 to 1…
dependabot[bot] Jan 27, 2025
f895567
Bump google.golang.org/api from 0.217.0 to 0.218.0 (#4616)
dependabot[bot] Jan 27, 2025
261b9f3
Bump github.com/twmb/franz-go from 1.18.0 to 1.18.1 (#4615)
dependabot[bot] Jan 27, 2025
1f4edaf
docs: remove mention of serverless from CONTRIBUTING.md
electron0zero Jan 29, 2025
[DOC] Fix links for mounting OSS content to GET (grafana#4388)
* Fix links for mounting OSS content to GET

* Fix ref to refs in frontmatter

* Apply suggestions from code review

Co-authored-by: Jack Baldry <[email protected]>

* Update docs/sources/tempo/traceql/metrics-queries/functions.md

---------

Co-authored-by: Jack Baldry <[email protected]>
knylander-grafana and jdbaldry authored Nov 27, 2024
commit 9275d0788a0e1f3ef2633e2a8f4983ec67c523eb
Original file line number Diff line number Diff line change
@@ -7,6 +7,17 @@ keywords:
title: Identify bottlenecks and establish SLOs
menuTitle: Identify bottlenecks and establish SLOs
weight: 320
refs:
metrics-generator:
- pattern: /docs/tempo/
destination: https://grafana.com/docs/tempo/<TEMPO_VERSION>/metrics-generator/
- pattern: /docs/enterprise-traces/
destination: https://grafana.com/docs/enterprise-traces/<ENTERPRISE_TRACES_VERSION>/metrics-generator/
span-metrics:
- pattern: /docs/tempo/
destination: https://grafana.com/docs/tempo/<TEMPO_VERSION>/metrics-generator/span_metrics/
- pattern: /docs/enterprise-traces/
destination: https://grafana.com/docs/enterprise-traces/<ENTERPRISE_TRACES_VERSION>/metrics-generator/span_metrics/
---

# Identify bottlenecks and establish SLOs
@@ -19,34 +30,34 @@ Handy Site Corp, a fake website company, runs an ecommerce application that incl

### Define realistic SLOs

Handy Site’s engineers start by establishing service level objectives (SLOs) around latency ensure that customers have a good experience when trying to complete the checkout process.
Handy Site’s engineers start by establishing service level objectives (SLOs) around latency ensure that customers have a good experience when trying to complete the checkout process.
To do this, they use metrics generated from their span data.

Their service level objective should be a realistic target based on previous history during times of normal operation.
Once they've agreed upon their service level objective, they will set up alerts to warn them when they are at risk of failing to meet that objective.
Once they've agreed upon their service level objective, they will set up alerts to warn them when they are at risk of failing to meet that objective.

### Utilize span metrics to define your SLO and SLI

After evaluating options, they decide to use [span metrics](https://grafana.com/docs/tempo/latest/metrics-generator/span_metrics/) as a service level indicator (SLI) to measure SLO compliance.
After evaluating options, they decide to use [span metrics](ref:span-metrics) as a service level indicator (SLI) to measure SLO compliance.

![Metrics generator and exemplars](/media/docs/tempo/intro/traces-metrics-gen-exemplars.png)

Tempo can generate metrics using the [metrics-generator component](https://grafana.com/docs/tempo/latest/metrics-generator/).
Tempo can generate metrics using the [metrics-generator component](ref:metrics-generator).
These metrics are created based on spans from incoming traces and demonstrate immediate usefulness with respect to application flow and overview.
This includes rate, error, and duration (RED) metrics.


Span metrics also make it easy to use exemplars.
An [exemplar](https://grafana.com/docs/grafana/latest/basics/exemplars/) serves as a detailed example of one of the observations aggregated into a metric. An exemplar contains the observed value together with an optional timestamp and arbitrary trace IDs, which are typically used to reference a trace.
Since traces and metrics co-exist in the metrics-generator, exemplars can be automatically added to those metrics, allowing you to quickly jump from a metric showing aggregate latency over time into an individual trace that represents a low, medium, or high latency request. Similarly, you can quickly jump from a metric showing error rate over time into an individual erroring trace.
An [exemplar](https://grafana.com/docs/grafana/<GRAFANA_VERSION>/basics/exemplars/) serves as a detailed example of one of the observations aggregated into a metric. An exemplar contains the observed value together with an optional timestamp and arbitrary trace IDs, which are typically used to reference a trace.
Since traces and metrics co-exist in the metrics-generator, exemplars can be automatically added to those metrics, allowing you to quickly jump from a metric showing aggregate latency over time into an individual trace that represents a low, medium, or high latency request. Similarly, you can quickly jump from a metric showing error rate over time into an individual erroring trace.

### Monitor latency

Handy Site decides they're most interested in monitoring the latency of requests processed by their checkout service and want to set an objective that 99.5% of requests in a given month should complete within 2 seconds.
To define a service level indicator (SLI) that they can use to track their progress against their objective, they use the `traces_spanmetrics_latency` metric with the proper label selectors, such as `service name = checkoutservice`.
The metrics-generator adds a default set of labels to the metrics it generates, including `span_kind` and `status_code`. However, if they were interested in calculating checkout service latency per endpoint or per version of the software, they could change the configuration of the Tempo metrics-generator to add these custom dimensions as labels to their spanmetrics.
The metrics-generator adds a default set of labels to the metrics it generates, including `span_kind` and `status_code`. However, if they were interested in calculating checkout service latency per endpoint or per version of the software, they could change the configuration of the Tempo metrics-generator to add these custom dimensions as labels to their spanmetrics.

With all of this in place, Handy Site now opens the [Grafana SLO](https://grafana.com/docs/grafana-cloud/alerting-and-irm/slo/) application and follows the setup flow to establish an [SLI](https://grafana.com/docs/grafana-cloud/alerting-and-irm/slo/create/) for their checkout service around the `traces_spanmetrics_latency` metric..
With all of this in place, Handy Site now opens the [Grafana SLO](https://grafana.com/docs/grafana-cloud/alerting-and-irm/slo/) application and follows the setup flow to establish an [SLI](https://grafana.com/docs/grafana-cloud/alerting-and-irm/slo/create/) for their checkout service around the `traces_spanmetrics_latency` metric.
They can now be alerted to degradations in service quality that directly impact their end user experience. SLO-based alerting also ensures that they don't suffer from noisy alerts. Alerts are only triggered when the value of the SLI is such that the team is in danger of missing their SLO.
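
The latency objective described above (99.5% of requests completing within 2 seconds) can be illustrated with a minimal sketch. This isn't how Grafana SLO or the metrics-generator computes the SLI; it's a hypothetical calculation over a plain list of request durations, and the names `sli` and `meets_slo` are assumptions made for illustration.

```python
from typing import Sequence

SLO_TARGET = 0.995        # 99.5% of requests, taken from the text
THRESHOLD_SECONDS = 2.0   # latency threshold, taken from the text


def sli(durations_seconds: Sequence[float]) -> float:
    """Return the fraction of requests that completed within the threshold."""
    if not durations_seconds:
        raise ValueError("at least one request duration is required")
    good = sum(1 for d in durations_seconds if d <= THRESHOLD_SECONDS)
    return good / len(durations_seconds)


def meets_slo(durations_seconds: Sequence[float]) -> bool:
    """Return True when the measured SLI is at or above the SLO target."""
    return sli(durations_seconds) >= SLO_TARGET
```

In this sketch, a window with 999 fast requests and one slow request meets the objective, while a window where 2 out of 100 requests exceed 2 seconds doesn't.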

![Latency SLO dashboard](/media/docs/tempo/intro/traces-metrics-gen-SLO.png)
@@ -7,6 +7,12 @@ keywords:
title: Diagnose errors with traces
menuTitle: Diagnose errors with traces
weight: 400
refs:
traceql:
- pattern: /docs/tempo/
destination: https://grafana.com/docs/tempo/<TEMPO_VERSION>/traceql/
- pattern: /docs/enterprise-traces/
destination: https://grafana.com/docs/enterprise-traces/<ENTERPRISE_TRACES_VERSION>/traceql/
---

# Diagnose errors with traces
@@ -27,7 +33,7 @@ It’s imperative for the operations team at Handy Site to quickly troubleshoot

## Use TraceQL to query data

Tempo has a traces-first query language, [TraceQL](https://grafana.com/docs/tempo/latest/traceql/), that provides a unique toolset for selecting and searching tracing data. TraceQL can match traces based on span and resource attributes, duration, and ancestor<>descendant relationships. It also can compute aggregate statistics (e.g., `rate`) over a set of spans.
Tempo has a traces-first query language, [TraceQL](ref:traceql), that provides a unique toolset for selecting and searching tracing data. TraceQL can match traces based on span and resource attributes, duration, and ancestor<>descendant relationships. It also can compute aggregate statistics (e.g., `rate`) over a set of spans.

Handy Site’s services and applications are instrumented for tracing, so they can use TraceQL as a debugging tool. Using three TraceQL queries, the team identifies and validates the root cause of the issue.

@@ -50,7 +56,7 @@ Looking at the set of returned spans, the most concerning ones are those with th

The team decides to use structural operators to follow an error chain from the top-level `mythical-requester` service to any descendant spans that also have an error status.
Descendant spans can be any span that's descended from the parent span, such as a child or a further child at any depth.
Using this query, the team can pinpoint the downstream service that might be causing the issue. The query below says "Find me spans where `status = error` that that are descendants of spans from the `mythical-requester` service that have status code `500`."
Using this query, the team can pinpoint the downstream service that might be causing the issue. The query below says "Find me spans where `status = error` that that are descendants of spans from the `mythical-requester` service that have status code `500`."

```traceql
{ resource.service.name = "mythical-requester" && span.http.status_code = 500 } >> { status = error }
@@ -68,14 +74,14 @@ Specifically, the service is passing a `null` value for a column in a database t
After identifying the specific cause of this internal server error,
the team wants to know if there are errors in any database operations other than the `null` `INSERT` error found above.
Their updated query uses a negated regular expression to find any spans where the database statement either doesn’t exist, or doesn’t start with an `INSERT` clause.
This should expose any other issues causing an internal server error and filter out the class of issues that they already diagnosed.
This should expose any other issues causing an internal server error and filter out the class of issues that they already diagnosed.

```traceql
{ resource.service.name = "mythical-requester" && span.http.status_code = 500 } >> { status = error && span.db.statement !~ "INSERT.*" }
```
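
The filter in the second half of that query can be sketched in Python to show which statements it's meant to keep. This mirrors the description in the text (the statement either doesn't exist or doesn't start with `INSERT`) and isn't Tempo's regex implementation; the function name `is_unexplained_db_error` is hypothetical.

```python
import re
from typing import Optional

# Pattern copied from the query; re.match applies it at the start of the string.
INSERT_PATTERN = re.compile(r"INSERT.*")


def is_unexplained_db_error(db_statement: Optional[str]) -> bool:
    """Return True when the statement is missing or doesn't start with INSERT."""
    if db_statement is None:
        return True
    return INSERT_PATTERN.match(db_statement) is None
```

Under these assumptions, a span with no database statement or with a `SELECT` statement passes the filter, while any span whose statement starts with `INSERT` is excluded.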

This query yields no results, suggesting that the root cause of the issues the operations team are seeing is exclusively due to the failing database `INSERT` statement.
At this point, they can roll back to a known working version of the service, or deploy a fix to ensure that `null` data being passed to the service is rejected appropriately.
Once that is complete, the issue can be marked resolved and the Handy team's error rate SLI should return back to acceptable levels.
Once that is complete, the issue can be marked resolved and the Handy team's error rate SLI should return back to acceptable levels.

![Empty query results](/media/docs/tempo/intro/traceql-no-results-handy-site.png)
4 changes: 2 additions & 2 deletions docs/sources/tempo/metrics-generator/active-series.md
@@ -16,9 +16,9 @@ These capabilities rely on a set of generated span metrics and service metrics.

Any spans that are ingested by Tempo can create many metric series. However, this doesn't mean that every time a span is ingested that a new active series is created.

The number of active series generated depends on the label pairs generated from span data that are associated with the metrics, similar to other Prometheus-formated data.
The number of active series generated depends on the label pairs generated from span data that are associated with the metrics, similar to other Prometheus-formatted data.

For additional information, refer to the [Active series and DPM documentation](/docs/grafana-cloud/billing-and-usage/active-series-and-dpm/#active-series).
For additional information, refer to the [Active series and DPM documentation](https://grafana.com/docs/grafana-cloud/billing-and-usage/active-series-and-dpm/).

## Active series calculation

21 changes: 16 additions & 5 deletions docs/sources/tempo/metrics-generator/service-graph-view.md
@@ -5,6 +5,17 @@ description: Grafana's service graph view utilizes metrics generated by the metr
aliases:
- ./app-performance-mgmt # /docs/tempo/<TEMPO_VERSION>/metrics-generator/app-performance-mgmt
weight: 400
refs:
enable-service-graphs:
- pattern: /docs/tempo/
destination: https://grafana.com/docs/tempo/<TEMPO_VERSION>/metrics-generator/service_graphs/enable-service-graphs/
- pattern: /docs/enterprise-traces/
destination: https://grafana.com/docs/enterprise-traces/<ENTERPRISE_TRACES_VERSION>/metrics-generator/service_graphs/enable-service-graphs/
span-metrics:
- pattern: /docs/tempo/
destination: https://grafana.com/docs/tempo/<TEMPO_VERSION>/metrics-generator/span_metrics/
- pattern: /docs/enterprise-traces/
destination: https://grafana.com/docs/enterprise-traces/<ENTERPRISE_TRACES_VERSION>/metrics-generator/span_metrics/
---

# Service graph view
@@ -27,13 +38,13 @@ You have to enable span metrics and service graph generation on the Grafana back

To use the service graph view, you need:

* Tempo or Grafana Cloud Traces with either 1) the metrics generator enabled and configured or 2) Grafana Agent or Grafana Alloy enabled and configured to send data to a Prometheus-compatible metrics store
* [Services graphs]({{< relref "../metrics-generator/service_graphs/enable-service-graphs" >}}), which are enabled by default in Grafana
* [Span metrics]({{< relref "../metrics-generator/span_metrics#how-to-run" >}}) enabled in your Tempo data source configuration
* Tempo or Grafana Cloud Traces with either the metrics generator enabled and configured or Grafana Agent or Grafana Alloy enabled and configured to send data to a Prometheus-compatible metrics store
* [Services graphs](ref:enable-service-graphs), which are enabled by default in Grafana
* [Span metrics](ref:span-metrics) enabled in your Tempo data source configuration

The service graph view can be derived from metrics generated by either the metrics-generator or by Grafana Agent or Grafana Alloy.

For information on how to configure these features, refer to the [Grafana Tempo data sources documentation](/docs/grafana/latest/datasources/tempo/).
For information on how to configure these features, refer to the [Tempo data sources documentation](/docs/grafana/<GRAFANA_VERSION>/datasources/tempo/).

## What does the service graph view show?

@@ -46,7 +57,7 @@ The service graph view provides a span metrics visualization (table) and service
You can select any information in the table that has an underline to show more detailed information.
You can also select any node in the service graph to display additional information.

![Service graph with extended informaiton](/media/docs/grafana/data-sources/tempo/query-editor/tempo-ds-query-service-graph-prom.png)
![Service graph with extended information](/media/docs/grafana/data-sources/tempo/query-editor/tempo-ds-query-service-graph-prom.png)

### Error rate example

@@ -5,32 +5,38 @@ aliases:
title: Enable service graphs
description: Learn how to enable service graphs
weight: 200
refs:
cardinality:
- pattern: /docs/tempo/
destination: https://grafana.com/docs/tempo/<TEMPO_VERSION>/metrics-generator/cardinality/
- pattern: /docs/enterprise-traces/
destination: https://grafana.com/docs/enterprise-traces/<ENTERPRISE_TRACES_VERSION>/metrics-generator/cardinality/
---

## Enable service graphs

Service graphs are generated in Tempo and pushed to a metrics storage.
Then, they can be represented in Grafana as a graph.
You will need those components to fully use service graphs.
You need those components to fully use service graphs.

{{< admonition type="note" >}}
Cardinality can pose a problem when you have lots of services.
To learn more about cardinality and how to perform a dry run of the metrics generator, see the [Cardinality documentation]({{< relref "../cardinality" >}}).
{{% /admonition %}}
To learn more about cardinality and how to perform a dry run of the metrics-generator, refer to the [Cardinality documentation](ref:cardinality).
{{< /admonition >}}

### Enable service graphs in Tempo/GET

To enable service graphs in Tempo/GET, enable the metrics generator and add an overrides section which enables the `service-graphs` generator.
For more information, refer to the [configuration details]({{< relref "../../configuration#metrics-generator" >}}).
For more information, refer to the [configuration details](https://grafana.com/docs/tempo/<TEMPO_VERSION>/configuration#metrics-generator).

To enable service graphs when using Grafana Agent, refer to the [Grafana Agent and service graphs documentation]({{< relref "../../configuration/grafana-agent/service-graphs" >}}).
To enable service graphs when using Grafana Alloy, refer to the [Grafana Alloy and service graphs documentation](https://grafana.com/docs/tempo/<TEMPO_VERSION>/configuration/grafana-alloy/service-graphs/).

### Enable service graphs in Grafana

{{< admonition type="note" >}}
Since Grafana 9.0.4, service graphs have been enabled by default. Prior to Grafana 9.0.4, service graphs were hidden
under the [feature toggle](/docs/grafana/latest/setup-grafana/configure-grafana/#feature_toggles) `tempoServiceGraph`.
{{% /admonition %}}
Service graphs are enabled by default in Grafana. Prior to Grafana 9.0.4, service graphs were hidden
under the [feature toggle](/docs/grafana/latest/setup-grafana/configure-grafana) `tempoServiceGraph`.
{{< /admonition >}}

Configure a Tempo data source's service graphs by linking to the Prometheus backend where metrics are being sent:
