Make vParquet2 the default block format (#2526)
* Make vParquet2 the default encoding

* Update docs to mention vParquet2 as default

* Regenerate manifest.md

* Update CHANGELOG.md

* Use vParquet2 in tests
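
For clusters that need to stay on the previous format after this change, the version can be pinned explicitly. The snippet below is a minimal sketch rather than a tested configuration; it assumes the option sits at `storage.trace.block.version`, which is the nesting shown in the regenerated manifest and matched by `cfg.StorageConfig.Trace.Block.Version` in the updated test.

```yaml
storage:
  trace:
    block:
      # Options listed in the docs touched by this commit: v2, vParquet, vParquet2.
      # Leaving the option out now selects vParquet2, the new default.
      version: vParquet
```

Setting `version: v2` instead opts out of Parquet entirely, which according to the updated parquet.md disables all forms of search.
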
stoewer authored Jun 1, 2023
1 parent 767115d commit 05aad36
Showing 7 changed files with 223 additions and 134 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -1,5 +1,6 @@
## main / unreleased

+* [CHANGE] Make vParquet2 the default block format [#2526](https://github.com/grafana/tempo/pull/2526) (@stoewer)
* [CHANGE] Disable tempo-query by default in Jsonnet libs. [#2462](https://github.com/grafana/tempo/pull/2462) (@electron0zero)
* [ENHANCEMENT] Fill parent ID column and nested set columns [#2487](https://github.com/grafana/tempo/pull/2487) (@stoewer)
* [CHANGE] Prefix service graph extra dimensions labels with `server_` and `client_` if `enable_client_server_prefix` is enabled [#2335](https://github.com/grafana/tempo/pull/2335) (@domasx2)
4 changes: 2 additions & 2 deletions cmd/tempo/app/config_test.go
@@ -9,7 +9,7 @@ import (
"github.com/grafana/tempo/tempodb"
"github.com/grafana/tempo/tempodb/encoding/common"
v2 "github.com/grafana/tempo/tempodb/encoding/v2"
-"github.com/grafana/tempo/tempodb/encoding/vparquet"
+"github.com/grafana/tempo/tempodb/encoding/vparquet2"
"github.com/stretchr/testify/assert"
)

@@ -70,7 +70,7 @@ func TestConfig_CheckConfig(t *testing.T) {
name: "warnings for v2 settings when they drift from default",
config: func() *Config {
cfg := newDefaultConfig()
-cfg.StorageConfig.Trace.Block.Version = vparquet.VersionString
+cfg.StorageConfig.Trace.Block.Version = vparquet2.VersionString
cfg.StorageConfig.Trace.Block.IndexDownsampleBytes = 1
cfg.StorageConfig.Trace.Block.IndexPageSizeBytes = 1
cfg.Compactor.Compactor.ChunkSizeBytes = 1
2 changes: 1 addition & 1 deletion docs/sources/tempo/configuration/_index.md
@@ -1043,7 +1043,7 @@ storage:
# block configuration
block:
# block format version. options: v2, vParquet, vParquet2
-[version: <string> | default = vParquet]
+[version: <string> | default = vParquet2]

# bloom filter false positive rate. lower values create larger filters but fewer false positives
[bloom_filter_false_positive: <float> | default = 0.01]
79 changes: 71 additions & 8 deletions docs/sources/tempo/configuration/manifest.md
@@ -17,7 +17,7 @@ go run ./cmd/tempo --storage.trace.backend=local --storage.trace.local.path=/tmp
## Complete configuration

{{% admonition type="note" %}}
-This manifest was generated on 2023-04-28.
+This manifest was generated on 2023-06-01.
{{% /admonition %}}

```yaml
@@ -148,7 +148,7 @@ distributor:
mirror_timeout: 2s
heartbeat_period: 5s
heartbeat_timeout: 5m0s
-instance_id: Martins-MacBook-Pro.local
+instance_id: local-instance
instance_interface_names:
- eth0
- en0
@@ -299,12 +299,13 @@ compactor:
heartbeat_timeout: 1m0s
wait_stability_min_duration: 1m0s
wait_stability_max_duration: 5m0s
-instance_id: Martins-MacBook-Pro.local
+instance_id: local-instance
instance_interface_names:
- eth0
- en0
instance_port: 0
instance_addr: ""
+enable_inet6: false
wait_active_instance_timeout: 10m0s
compaction:
v2_in_buffer_bytes: 5242880
@@ -364,14 +365,15 @@ ingester:
min_ready_duration: 15s
interface_names:
- en0
+enable_inet6: false
final_sleep: 0s
tokens_file_path: ""
availability_zone: ""
unregister_on_shutdown: true
readiness_check_ring_health: true
address: 127.0.0.1
port: 0
-id: Martins-MacBook-Pro.local
+id: local-instance
concurrent_flushes: 4
flush_check_period: 10s
flush_op_timeout: 5m0s
@@ -414,12 +416,13 @@ metrics_generator:
mirror_timeout: 2s
heartbeat_period: 5s
heartbeat_timeout: 1m0s
-instance_id: Martins-MacBook-Pro.local
+instance_id: local-instance
instance_interface_names:
- eth0
- en0
instance_addr: 127.0.0.1
instance_port: 0
+enable_inet6: false
processor:
service_graphs:
wait: 10s
@@ -435,6 +438,10 @@ metrics_generator:
- 6.4
- 12.8
dimensions: []
+peer_attributes:
+- peer.service
+- db.name
+- db.system
span_multiplier_key: ""
span_metrics:
histogram_buckets:
@@ -458,11 +465,40 @@ metrics_generator:
span_kind: true
status_code: true
dimensions: []
+dimension_mappings: []
+enable_target_info: false
span_multiplier_key: ""
subprocessors:
0: true
1: true
2: true
+filter_policies: []
+local_blocks:
+block:
+bloom_filter_false_positive: 0.01
+bloom_filter_shard_size_bytes: 102400
+version: vParquet2
+search_encoding: snappy
+search_page_size_bytes: 1048576
+v2_index_downsample_bytes: 1048576
+v2_index_page_size_bytes: 256000
+v2_encoding: zstd
+parquet_row_group_size_bytes: 100000000
+search:
+chunk_size_bytes: 1000000
+prefetch_trace_count: 1000
+read_buffer_count: 32
+read_buffer_size_bytes: 1048576
+cache_control:
+footer: false
+column_index: false
+offset_index: false
+flush_check_period: 10s
+trace_idle_period: 10s
+max_block_duration: 1m0s
+max_block_bytes: 500000000
+complete_block_timeout: 1h0m0s
+max_live_traces: 0
registry:
collection_interval: 15s
stale_duration: 15m0s
@@ -479,7 +515,16 @@ metrics_generator:
max_wal_time: 14400000
no_lockfile: false
remote_write_flush_deadline: 1m0s
+traces_storage:
+path: ""
+completedfilepath: ""
+blocksfilepath: ""
+v2_encoding: none
+search_encoding: none
+ingestion_time_range_slack: 0s
+version: vParquet2
metrics_ingestion_time_range_slack: 30s
+query_timeout: 30s
storage:
trace:
pool:
@@ -492,11 +537,11 @@ storage:
v2_encoding: snappy
search_encoding: none
ingestion_time_range_slack: 2m0s
-version: vParquet
+version: vParquet2
block:
bloom_filter_false_positive: 0.01
bloom_filter_shard_size_bytes: 102400
-version: vParquet
+version: vParquet2
search_encoding: snappy
search_page_size_bytes: 1048576
v2_index_downsample_bytes: 1048576
@@ -523,6 +568,7 @@ storage:
path: /tmp/tempo/traces
gcs:
bucket_name: ""
+prefix: ""
chunk_buffer_size: 10485760
endpoint: ""
hedge_requests_at: 0s
@@ -531,6 +577,13 @@ storage:
object_cache_control: ""
object_metadata: {}
s3:
+tls_cert_path: ""
+tls_key_path: ""
+tls_ca_path: ""
+tls_server_name: ""
+tls_insecure_skip_verify: false
+tls_cipher_suites: ""
+tls_min_version: VersionTLS12
bucket: ""
prefix: ""
endpoint: ""
@@ -539,7 +592,6 @@ storage:
secret_key: ""
session_token: ""
insecure: false
-tls_insecure_skip_verify: false
part_size: 0
hedge_requests_at: 0s
hedge_requests_up_to: 2
@@ -556,6 +608,7 @@ storage:
use_federated_token: false
user_assigned_id: ""
container_name: ""
+prefix: ""
endpoint_suffix: blob.core.windows.net
max_buffers: 4
buffer_size: 3145728
@@ -585,9 +638,19 @@ overrides:
metrics_generator_forwarder_workers: 0
metrics_generator_processor_service_graphs_histogram_buckets: []
metrics_generator_processor_service_graphs_dimensions: []
+metrics_generator_processor_service_graphs_peer_attributes: []
metrics_generator_processor_span_metrics_histogram_buckets: []
metrics_generator_processor_span_metrics_dimensions: []
metrics_generator_processor_span_metrics_intrinsic_dimensions: {}
+metrics_generator_processor_span_metrics_filter_policies: []
+metrics_generator_processor_span_metrics_dimension_mappings: []
+metrics_generator_processor_span_metrics_enable_target_info: false
+metrics_generator_processor_local_blocks_max_live_traces: 0
+metrics_generator_processor_local_blocks_max_block_duration: 0s
+metrics_generator_processor_local_blocks_max_block_bytes: 0
+metrics_generator_processor_local_blocks_flush_check_period: 0s
+metrics_generator_processor_local_blocks_trace_idle_period: 0s
+metrics_generator_processor_local_blocks_complete_block_timeout: 0s
block_retention: 0s
max_bytes_per_tag_values_query: 5000000
max_blocks_per_tag_values_query: 0
19 changes: 6 additions & 13 deletions docs/sources/tempo/configuration/parquet.md
@@ -7,9 +7,7 @@ weight: 75
# Apache Parquet block format


-Tempo has a default columnar block format based on Apache Parquet. Parquet is required for tags-based search as well as [TraceQL]({{< relref "../traceql" >}}), the query language for traces.
-
-A columnar block format may result in improved search performance and also enables a large ecosystem of tools access to the underlying trace data.
+Tempo has a default columnar block format based on Apache Parquet. This format is required for tags-based search as well as [TraceQL]({{< relref "../traceql" >}}), the query language for traces. The columnar block format improves search performance and enables a large ecosystem of tools to access the underlying trace data.

For more information, refer to the [Parquet schema]({{< relref "../operations/schema" >}}) and the [Parquet design document](https://github.com/mdisibio/tempo/blob/design-proposal-parquet/docs/design-proposals/2022-04%20Parquet.md).

@@ -23,26 +21,21 @@ Block formats based on Parquet require more CPU and memory resources than the pr

## Choose a different block format

-It is possible to disable Parquet and use the previous `v2` block format. This disables all forms of search, but also reduces resource consumption, and may be desired for a high-throughput cluster that does not need these capabilities. Set the block version option to `v2` in the Storage section of the configuration file.
+The default block format is `vParquet2` which is the latest iteration of Tempo's Parquet based columnar block format. It is still possible to use the previous format `vParquet`. To enable it, set the block version option to `vParquet` in the Storage section of the configuration file.

```yaml
# block format version. options: v2, vParquet, vParquet2
-[version: v2]
+[version: vParquet]
```
-There is also a revised version of the Parquet base block format `vParquet2`. This version improves the interoperability with other tools based on Parquet. `vParquet2` is still experimental and not enabled by default yet. To enable it, set the block format version to `vParquet2` in the Storage section of the configuration file.
+It is possible to disable Parquet and use the previous `v2` block format. This disables all forms of search, but also reduces resource consumption, and may be desired for a high-throughput cluster that does not need these capabilities. Set the block version option to `v2` in the Storage section of the configuration file.

```yaml
# block format version. options: v2, vParquet, vParquet2
-[version: vParquet2]
+[version: v2]
```

-To re-enable Parquet, set the block version option to `vParquet` in the Storage section of the configuration file.
-
-```yaml
-# block format version. options: v2, vParquet, vParquet2
-[version: vParquet]
-```
+To re-enable the default `vParquet2` format, remove the block version option from the Storage section of the configuration file or set the option to `vParquet2`.

## Parquet configuration parameters
