Remove tempo serverless #4599

Merged
merged 11 commits into from
Jan 24, 2025
Remove serverless from tempo docs
electron0zero committed Jan 23, 2025
commit 6ac2178f148ce8edf2ffd07792dd06596879dda9
36 changes: 0 additions & 36 deletions docs/sources/tempo/configuration/_index.md
@@ -801,42 +801,6 @@ querier:
# Timeout for search requests
[query_timeout: <duration> | default = 30s]

# NOTE: The Tempo serverless feature is now deprecated and will be removed in an upcoming release.
# A list of external endpoints that the querier will use to offload backend search requests. They must
# take and return the same value as /api/search endpoint on the querier. This is intended to be
# used with serverless technologies for massive parallelization of the search path.
# The default value of "" disables this feature.
[external_endpoints: <list of strings> | default = <empty list>]

# If search_external_endpoints is set then the querier will primarily act as a proxy for whatever serverless backend
# you have configured. This setting allows the operator to have the querier prefer itself for a configurable
# number of subqueries. In the default case of 2 the querier will process up to 2 search requests subqueries before starting
# to reach out to search_external_endpoints.
# Setting this to 0 will disable this feature and the querier will proxy all search subqueries to search_external_endpoints.
[prefer_self: <int> | default = 10 ]

# If set to a non-zero value a second request will be issued at the provided duration. Recommended to
# be set to p99 of external search requests to reduce long tail latency.
# (default: 8s)
[external_hedge_requests_at: <duration>]

# The maximum number of requests to execute when hedging. Requires hedge_requests_at to be set.
# (default: 2)
[external_hedge_requests_up_to: <int>]

# The serverless backend to use. If external_backend is set, then authorization credentials will be provided
# when querying the external endpoints. "google_cloud_run" is the only value supported at this time.
# The default value of "" omits credentials when querying the external backend.
[external_backend: <string> | default = ""]

# Google Cloud Run configuration. Will be used only if the value of external_backend is "google_cloud_run".
google_cloud_run:
# A list of external endpoints that the querier will use to offload backend search requests. They must
# take and return the same value as /api/search endpoint on the querier. This is intended to be
# used with serverless technologies for massive parallelization of the search path.
# The default value of "" disables this feature.
[external_endpoints: <list of strings> | default = <empty list>]

# config of the worker that connects to the query frontend
frontend_worker:

6 changes: 0 additions & 6 deletions docs/sources/tempo/configuration/manifest.md
@@ -269,12 +269,6 @@ metrics_generator_client:
querier:
    search:
        query_timeout: 30s
        prefer_self: 10
        external_hedge_requests_at: 8s
        external_hedge_requests_up_to: 2
        external_backend: ""
        google_cloud_run: null
        external_endpoints: []
    trace_by_id:
        query_timeout: 10s
    metrics:
85 changes: 0 additions & 85 deletions docs/sources/tempo/operations/backend_search.md
@@ -111,34 +111,6 @@ querier:
max_concurrent_queries: 20
```

With serverless technologies:

{{< admonition type="caution" >}}
The Tempo serverless feature is now deprecated and will be removed in an upcoming release.
{{< /admonition >}}

{{< admonition type="note" >}}
Serverless can be a nice way to reduce cost by using it as spare query capacity.
However, serverless tends to have higher variance then simply allowing the queriers to perform the searches themselves.
{{< /admonition >}}

```yaml
querier:

  search:
    # A list of endpoints to query. Load will be spread evenly across
    # these multiple serverless functions.
    external_endpoints:
    - https://<serverless endpoint>

    # If set to a non-zero value a second request will be issued at the provided duration. Recommended to
    # be set to p99 of search requests to reduce long tail latency.
    external_hedge_requests_at: 8s

    # The maximum number of requests to execute when hedging. Requires hedge_requests_at to be set.
    external_hedge_requests_up_to: 2
```

### Query-frontend

[Query frontend]({{< relref "../configuration#query-frontend" >}}) lists all configuration
@@ -174,48 +146,6 @@ query_frontend:
target_bytes_per_job: 50_000_000
```

### Serverless environment

{{< admonition type="caution" >}}
The Tempo serverless feature is now deprecated and will be removed in an upcoming release.
{{< /admonition >}}

Serverless isn't required, but with larger loads, serverless can be used to reduce costs.
Tempo has support for Google Cloud Run and AWS Lambda.
In both cases, you can use the following
settings to configure Tempo to use a serverless environment:

```yaml
querier:
  search:
    # A list of external endpoints that the querier will use to offload backend search requests. They must
    # take and return the same value as /api/search endpoint on the querier. This is intended to be
    # used with serverless technologies for massive parallelization of the search path.
    # The default value of "" disables this feature.
    [external_endpoints: <list of strings> | default = <empty list>]

    # If external_endpoints is set then the querier will primarily act as a proxy for whatever serverless backend
    # you have configured. This setting allows the operator to have the querier prefer itself for a configurable
    # number of subqueries. In the default case of 2 the querier will process up to 2 search requests subqueries before starting
    # to reach out to external_endpoints.
    # Setting this to 0 will disable this feature and the querier will proxy all search subqueries to external_endpoints.
    [prefer_self: <int> | default = 2 ]

    # If set to a non-zero value a second request will be issued at the provided duration. Recommended to
    # be set to p99 of external search requests to reduce long tail latency.
    # (default: 4s)
    [external_hedge_requests_at: <duration>]

    # The maximum number of requests to execute when hedging. Requires hedge_requests_at to be set.
    # (default: 3)
    [external_hedge_requests_up_to: <int>]
```

For cloud-specific details:

- [AWS Lambda]({{< relref "./serverless_aws" >}})
- [Google Cloud Run]({{< relref "./serverless_gcp" >}})

## Settings that are safe to increase without major impact.

Scaling up queriers is a safe way to add more query capacity.
@@ -334,21 +264,6 @@ This option controls the upper limit on the size of a job, and can be used as a
* In testing at Grafana Labs, 100MB to 200MB is a good range for this configuration, and works across different sizes of clusters.
* We recommend keeping this fixed within the recommended range.

### `querier.search.prefer_self` parameter

{{< admonition type="note" >}}
This configuration only applies to `tempo-serverless`.
{{< /admonition >}}

This setting controls the number of job the querier will process before spilling over the `search_external_endpoints` (tempo-serverless).

#### Guidelines

* In testing at Grafana Labs, serverless suffered from cold starts problems. If your query load is predictable, serverless isn't recommended.
* Increase the value of `prefer_self` if you want to process more jobs in the querier and spill out in extreme cases.
* Setting this to a very big number is as good as turning it off because the querier tries to process all the jobs and it never spills over to serverless.
* If we set this to a low value, we spill more jobs to serverless, even when queriers have capacity to process the job, and due to cold start, query latency increases.

### `querier.frontend_worker.parallelism` parameter

Number of simultaneous queries to process per query-frontend or query-scheduler. This configuration controls the number of concurrent requests per query-frontend a querier process.
90 changes: 0 additions & 90 deletions docs/sources/tempo/operations/serverless_aws.md

This file was deleted.

116 changes: 0 additions & 116 deletions docs/sources/tempo/operations/serverless_gcp.md

This file was deleted.
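
For operators reviewing their own configuration against this change, the following is a minimal sketch (not part of the commit) that lists which of the `querier.search` settings removed from the docs above are still present in a configuration. It assumes the Tempo configuration has already been parsed into nested Python dicts, for example by a YAML parser; the function name `find_removed_serverless_keys` and the example endpoint are hypothetical and used only for illustration. The sketch makes no claim about how Tempo itself treats these keys after the removal.

```python
# Settings that this commit removes from the documented `querier.search` block.
REMOVED_SEARCH_KEYS = (
    "external_endpoints",
    "prefer_self",
    "external_hedge_requests_at",
    "external_hedge_requests_up_to",
    "external_backend",
    "google_cloud_run",
)


def find_removed_serverless_keys(config):
    """Return dotted paths of removed serverless settings found in `config`.

    `config` is assumed to be a Tempo configuration already parsed into
    nested dicts; loading and parsing the YAML file is out of scope here.
    """
    querier = config.get("querier") or {}
    search = querier.get("search") or {}
    return [
        f"querier.search.{key}"
        for key in REMOVED_SEARCH_KEYS
        if key in search
    ]


if __name__ == "__main__":
    # Hypothetical configuration used only to exercise the helper.
    example = {
        "querier": {
            "search": {
                "query_timeout": "30s",
                "prefer_self": 10,
                "external_endpoints": ["https://example.invalid/search"],
            }
        }
    }
    for path in find_removed_serverless_keys(example):
        print(f"documented as removed: {path}")
```

The helper only inspects `querier.search`, because that is the only block this commit touches; `query_timeout` and the other remaining settings are left alone.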