Autocomplete filtering optimisations #2957

mapno · 2023-09-22T07:23:21Z

Context

Autocomplete filtering is the ability of Tempo to suggest tag values to Grafana, and thus to the user, based on conditions already present in the search. For example, when typing the query { resource.environment = "prod" && resource.service = | }, Tempo returns only the service names that are present in production environments.

Previous work

This was originally discussed in #1868, which resulted in a number of PRs (#2253, #2433) that added a new query param q to tag value search. The param accepts a TraceQL query for filtering down values.
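As a rough illustration of the q param, the sketch below builds a tag value search URL that carries a TraceQL filter. The endpoint path /api/v2/search/tag/<tag>/values, the helper name tagValuesURL and the host are assumptions made for the example; check the Tempo API documentation for your version before relying on them.

```go
package main

import (
	"net/url"
	"strings"
)

// tagValuesURL builds a tag value search URL whose `q` param holds a
// TraceQL query. The endpoint path is an assumption for illustration,
// and tagValuesURL is a hypothetical helper, not part of Tempo.
func tagValuesURL(base, tag, traceQL string) (string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	// Append the (assumed) tag values path to whatever path the base has.
	u.Path = strings.TrimSuffix(u.Path, "/") + "/api/v2/search/tag/" + tag + "/values"

	// url.Values takes care of percent-encoding the TraceQL query.
	q := url.Values{}
	q.Set("q", traceQL)
	u.RawQuery = q.Encode()
	return u.String(), nil
}
```

Calling tagValuesURL("http://localhost:3200", "resource.service", `{ resource.environment = "prod" }`) yields a URL whose q param contains the encoded filter, so only values matching the condition should be returned.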

The implementation relies heavily on reusing the TraceQL code for building iterators and fetching results, and then collecting only the values for the desired attributes. While this approach works well, it doesn't scale as well as expected in big production clusters.

Design optimisations

As a result of reusing TraceQL code, autocomplete filtering does a lot more work than is necessary for the feature. The TraceQL engine is built around the concept of spansets, which require fetching enough data to build spans that are returned as results to the user. That same logic is used in autocomplete filtering, which then throws away most of the retrieved data.

The autocomplete filtering code path can be optimised to do only the work that's necessary, improving performance.

Create new collectors that gather values for the wanted attribute, instead of building high-level objects (ie. spans, spansets).
- Profiling indicates that a significant amount of time (up to ~40% of the autocomplete subroutine) is spent in functions like runtime.mapassign in batchCollector.KeepGroup() and in sync.(*Pool).Get, which is used for reusing spans in the spanCollector.
Simplify the iterator builder logic and create only the required iterators.
- Because of the need to build spansets, Tempo retrieves data (eg. span start/end time) that is of no use for autocomplete. For example, span.kind is fetched even when there are no span conditions, because spans still need to be built for the engine.
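A minimal sketch of the collector idea above, assuming string-valued attributes: it keeps only the distinct values of one attribute and never builds spans or spansets. valueCollector and its methods are hypothetical names for illustration, not Tempo's actual types.

```go
package main

// valueCollector gathers the distinct values seen for a single attribute.
// It is a hypothetical stand-in, not Tempo's collector implementation.
type valueCollector struct {
	seen   map[string]struct{} // set of values already collected
	values []string            // distinct values in first-seen order
}

func newValueCollector() *valueCollector {
	return &valueCollector{seen: make(map[string]struct{})}
}

// Collect records v if it has not been seen before and reports
// whether it was new. Duplicates cause no allocation.
func (c *valueCollector) Collect(v string) bool {
	if _, ok := c.seen[v]; ok {
		return false
	}
	c.seen[v] = struct{}{}
	c.values = append(c.values, v)
	return true
}

// Values returns the distinct values collected so far.
func (c *valueCollector) Values() []string {
	return c.values
}
```

In this sketch the only per-value state is one map entry and one slice element, and a repeated value touches neither, so memory grows with the number of distinct values rather than with the number of spans read.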

Other paths

Reduce the number of inspected blocks/data: a config parameter controls how many blocks should be inspected for tag value queries, reducing the amount of data read.
Stop search faster by calculating cardinality of collected values.
- Algorithms like HyperLogLog could be used to quit early if we're not getting new values after a while.
Return after a configured amount of time (eg. 2s) with whatever data has been fetched. In most use-cases, searching for longer is useless as the user won't wait for the results.
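The last two ideas can be sketched together as follows. This is only an illustration under stated assumptions: it uses an exact set of seen values rather than HyperLogLog, and collectUntilStale, next, done and maxStale are hypothetical names, not Tempo APIs.

```go
package main

// collectUntilStale pulls values from next until one of three things
// happens: done is closed (time budget spent), next reports no more
// values, or maxStale consecutive values were already seen.
// An exact set replaces a cardinality estimator such as HyperLogLog
// to keep the sketch short.
func collectUntilStale(done <-chan struct{}, next func() (string, bool), maxStale int) []string {
	seen := make(map[string]struct{})
	var out []string
	stale := 0 // consecutive values that added nothing new

	for {
		// Non-blocking check of the time budget; a nil channel never fires.
		select {
		case <-done:
			return out
		default:
		}

		v, ok := next()
		if !ok {
			return out // source exhausted
		}
		if _, dup := seen[v]; dup {
			stale++
			if stale >= maxStale {
				return out // no new values for a while, quit early
			}
			continue
		}
		seen[v] = struct{}{}
		out = append(out, v)
		stale = 0
	}
}
```

To enforce a time limit, a caller could create a context with context.WithTimeout (for example 2 seconds) and pass ctx.Done() as done. Either way the function returns whatever has been collected so far.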

Tasklist

- Refactor autocomplete filtering internals to read less data and allocate less memory (Tag value search improvements #2942)
- Tempo: Caching of tag values doesn't update with new conditions (grafana#77380; labels: area/tracing, datasource/Tempo, type/bug)

mapno · 2023-11-09T17:09:57Z

Follow-up: #3127

mapno mentioned this issue Sep 22, 2023

Tag value search improvements #2942

Merged

3 tasks

mapno added this to Tempo squad Sep 22, 2023

github-project-automation bot moved this to Todo in Tempo squad Sep 22, 2023

mapno self-assigned this Sep 22, 2023

mapno moved this from Todo to In Progress in Tempo squad Sep 22, 2023

mapno mentioned this issue Oct 3, 2023

TraceQL autocomplete #1868

Closed

16 tasks

mapno mentioned this issue Nov 9, 2023

Autocomplete filtering: Phase 2 #3127

Closed

glamcoder moved this from In Progress to In Review in Tempo squad Nov 21, 2023

mapno closed this as completed in #2942 Nov 23, 2023

github-project-automation bot moved this from In Review to Done in Tempo squad Nov 23, 2023
