Quorum Queue compaction behavior #13103

kamilzzz · 2025-01-20T14:06:28Z

kamilzzz
Jan 20, 2025

Community Support Policy

I have read RabbitMQ's Community Support Policy

RabbitMQ version used

4.0.3

How is RabbitMQ deployed?

RPM package

Steps to reproduce the behavior in question

RabbitMQ version - 4.0.5

We have 3 node cluster each having 20 GB of space available for RabbitMQ storage.
All queues are quorum queues with replication factor of 3.

During some tests we noticed we quite often run out of space due to segments not being compacted, even when queues are empty or almost empty. Sometimes one node compacts its segments but others do not which in the end leads to disk alarms which blocks all producers.

For example, current situation of one our queues: leader compacted its segments already several times (I was monitoring that) while followers did not compact in a "long" time (1 hour) meaning they are very close to disk alarm.

Status of quorum queue <redacted> on node rabbit@01 ...
lqqqqqqqqqqqqqqqqqqqqwqqqqqqqqqqqqwqqqqqqqqqqqqwqqqqqqqqqqqqqqqqwqqqqqqqqqqqqqqwqqqqqqqqqqqqqqwqqqqqqqqqqqqqqwqqqqqqqqqqqqqqqqwqqqqqqwqqqqqqqqqqqqqqqqqk
x Node Name          x Raft State x Membership x Last Log Index x Last Written x Last Applied x Commit Index x Snapshot Index x Term x Machine Version x
tqqqqqqqqqqqqqqqqqqqqnqqqqqqqqqqqqnqqqqqqqqqqqqnqqqqqqqqqqqqqqqqnqqqqqqqqqqqqqqnqqqqqqqqqqqqqqnqqqqqqqqqqqqqqnqqqqqqqqqqqqqqqqnqqqqqqnqqqqqqqqqqqqqqqqqu
x rabbit@LNX00003014 x leader     x voter      x 79849          x 79849        x 79849        x 79849        x 76465          x 6    x 5               x
tqqqqqqqqqqqqqqqqqqqqnqqqqqqqqqqqqnqqqqqqqqqqqqnqqqqqqqqqqqqqqqqnqqqqqqqqqqqqqqnqqqqqqqqqqqqqqnqqqqqqqqqqqqqqnqqqqqqqqqqqqqqqqnqqqqqqnqqqqqqqqqqqqqqqqqu
x rabbit@LNX00003015 x follower   x voter      x 79849          x 79849        x 79849        x 79849        x 51877          x 6    x 5               x
tqqqqqqqqqqqqqqqqqqqqnqqqqqqqqqqqqnqqqqqqqqqqqqnqqqqqqqqqqqqqqqqnqqqqqqqqqqqqqqnqqqqqqqqqqqqqqnqqqqqqqqqqqqqqnqqqqqqqqqqqqqqqqnqqqqqqnqqqqqqqqqqqqqqqqqu
x rabbit@LNX00003016 x follower   x voter      x 79849          x 79849        x 79849        x 79849        x 51273          x 6    x 5               x
mqqqqqqqqqqqqqqqqqqqqvqqqqqqqqqqqqvqqqqqqqqqqqqvqqqqqqqqqqqqqqqqvqqqqqqqqqqqqqqvqqqqqqqqqqqqqqvqqqqqqqqqqqqqqvqqqqqqqqqqqqqqqqvqqqqqqvqqqqqqqqqqqqqqqqqj

As you can see, Snapshot Index for leader is 76465, while for followers it is ~51000.
Majority of messages are quite large (~1 MB), so while 20k index difference between snapshot and last log index may not seem that much, this means we can have 10-20 GB of uncompacted data on some nodes.

Why is that leader compacted already several times while at the same time followers are not compacting their segments?
Are there any configuration options so we can tweak that? What are the "triggers" for compation?

I stopped producers/consumers processes to prevent triggering disk alarms, but queues are still not compacting after some time. Is it possible to force compaction via CLI?

Answered by michaelklishin

Jan 27, 2025

We have documented this suggestion for installations where average message size is (expected to be) large.

View full answer

michaelklishin · 2025-01-20T17:37:46Z

michaelklishin
Jan 20, 2025
Maintainer

@kamilzzz no, it is not possible to trigger compaction via CLI tools.

Start with

Resource Use in the QQ guide, plus the following section on redelivery loops
Consumer delivery timeout

0 replies

michaelklishin · 2025-01-20T18:04:39Z

michaelklishin
Jan 20, 2025
Maintainer

Our Raft implementation does not perform compaction per se, if "compaction" defined as the process of combining data from N segment files with holes (acknowedged deliveries) into a new file without holes.
Classic queues do that but not quorum queues.

The process is called "segment truncation" instead, and as the name suggests, it operates at the segment file level.

With multi-megabyte messages, using a smaller segment file message count is something to consider.

The only thing that stands out from the rabbitmq-queues quorum_status is the difference in snapshot index.

rabbitmq-diagnostics observer

will provide access a few more QQ metrics.

The general recommendation for large messages has always been "put them into a blob store and pass the relevant metadata around in messages".

0 replies

michaelklishin · 2025-01-20T18:11:19Z

michaelklishin
Jan 20, 2025
Maintainer

One hypothesis I have is that it is not the segment files that explain the differences, it is snapshot files. Snapshots are sent from the leader to the followers, never the other way around, when a follower recovers (e.g. after a node restart).

@kamilzzz see what files actually consume the disk space under the node's data directory, then under quorum, it will look something like this:

└── rabbit@{hostname}
    ├── 00000039.wal
    ├── 2F_QQ15E8ROQHB1PH1
    │   ├── 00000001.segment
    │   ├── checkpoints
    │   ├── config
    │   └── snapshots
    ├── meta.dets
    └── names.dets

I think that when the followers are up-to-date (as seen from the commit index), the snapshots can be safely deleted. Followers likely keep one of them around and usually no one even notices when messages are small.

@kjnilsson would know best.

0 replies

kjnilsson · 2025-01-20T20:45:08Z

kjnilsson
Jan 20, 2025
Maintainer

@kamilzzz please can you provide a description of the tests you've run, message sizes, throughput rates etc or even better provide a set of command lines for our perftest that I can use to reproduce it.

Snapshotting changed a lot in 4.0 so there may be scenarios we didn't manage to cover as well as others.

20GB is too small for 1MB QQ workloads really. I would go with something 100GB+.

0 replies

michaelklishin · 2025-01-21T16:13:48Z

michaelklishin
Jan 21, 2025
Maintainer

An earlier discussion that seems highly relevant because the behavior there was the "Snapshot index" discrepancy: #12147.

0 replies

kamilzzz · 2025-01-27T15:48:59Z

kamilzzz
Jan 27, 2025
Author

@michaelklishin @kjnilsson Thanks for responses.
I am sure actual segment files were occupying disk space and not snapshots.

We had raft.segment_max_entries set to 32768 (https://www.rabbitmq.com/docs/quorum-queues#tuning-raft-segment-file-entry-count).
After changing this setting to smaller value (currently set to 1024) as suggested by @michaelklishin, we didn't encounter any issues since a week running similar load as before.

2 replies

michaelklishin Jan 27, 2025
Maintainer

We have documented this suggestion for installations where average message size is (expected to be) large.

Answer selected by michaelklishin

michaelklishin Jan 27, 2025
Maintainer

Thank you for confirming the effectiveness of that recommendation.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Quorum Queue compaction behavior #13103

{{title}}

{{editor}}'s edit

{{editor}}'s edit

Replies: 6 comments 2 replies

{{title}}

{{title}}

{{title}}

{{editor}}'s edit

{{editor}}'s edit

{{title}}

{{title}}

{{title}}

{{title}}

{{editor}}'s edit

{{editor}}'s edit

{{title}}

Select a reply

Quorum Queue compaction behavior #13103

kamilzzz Jan 20, 2025

Community Support Policy

RabbitMQ version used

How is RabbitMQ deployed?

Steps to reproduce the behavior in question

Replies: 6 comments · 2 replies

michaelklishin Jan 20, 2025 Maintainer

michaelklishin Jan 20, 2025 Maintainer

michaelklishin Jan 20, 2025 Maintainer

kjnilsson Jan 20, 2025 Maintainer

michaelklishin Jan 21, 2025 Maintainer

kamilzzz Jan 27, 2025 Author

michaelklishin Jan 27, 2025 Maintainer

michaelklishin Jan 27, 2025 Maintainer

kamilzzz
Jan 20, 2025

Replies: 6 comments 2 replies

michaelklishin
Jan 20, 2025
Maintainer

michaelklishin
Jan 20, 2025
Maintainer

michaelklishin
Jan 20, 2025
Maintainer

kjnilsson
Jan 20, 2025
Maintainer

michaelklishin
Jan 21, 2025
Maintainer

kamilzzz
Jan 27, 2025
Author

michaelklishin Jan 27, 2025
Maintainer

michaelklishin Jan 27, 2025
Maintainer