Is your feature request related to a problem? Please describe.
In certain scenarios, most (in)famously on Kubernetes with a sequential podManagementPolicy, nodes can start but fail to sync their metadata store tables from peers.
This can affect all kinds of environments: #3837 and #13151, for example, are cases where the same scenario looks probable, even though we do not have conclusive evidence (or much information about the environment in general).
This can also be a matter of timing: a node does begin syncing its metadata store with peers upon restart, but that takes a certain amount of time (often only fractions of a second), and in the meantime a client connection or CLI tool may try to perform an operation that cannot possibly succeed without all metadata store tables being in place.
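To make the timing window concrete, here is a toy model in Python. It is not RabbitMQ code: the Node class, the TablesNotReady exception and all the numbers are hypothetical, and time is just a number of seconds since boot.

```python
from dataclasses import dataclass


class TablesNotReady(Exception):
    """Raised when an operation needs metadata tables that are not synced yet."""


@dataclass
class Node:
    # Hypothetical model of a booting node; both fields are seconds after boot.
    sync_done_at: float       # when the metadata store sync with peers completes
    listeners_open_at: float  # when client listeners start accepting connections

    def declare_queue(self, now: float) -> str:
        """Simulate a client operation attempted at time `now`."""
        if now < self.listeners_open_at:
            # Listener is not up yet: the client cannot connect and can retry.
            return "connection refused"
        if now < self.sync_done_at:
            # Listener is up but the tables are missing: the operation fails.
            raise TablesNotReady("metadata store tables are not in place yet")
        return "ok"
```

With Node(sync_done_at=0.4, listeners_open_at=0.1), an operation at 0.2 seconds hits the window and raises TablesNotReady. If the listeners open only after the sync is done, the same early attempt is refused at the connection level instead, which clients usually handle by retrying.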
Client listeners were moved to the very last step of the boot step graph, and even that does not
help in certain environments. Unless
Describe the solution you'd like
A while ago, probably in late 2023, @dumbbell and I discussed this and agreed on the following feature: a configurable startup delay for client listeners.
By default, the delay will be 0 so that it neither slows down node startup nor affects, for example, every integration test suite in both RabbitMQ itself and its client libraries, various tools that start a RabbitMQ node, and so on.
To introduce a non-zero delay, the user will have to opt in like so:
# a delay of ten seconds, applied once for all listeners on the node
listeners.startup_delay = 10
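For illustration only, the intended behavior of a listeners.startup_delay setting (in seconds, default 0, applied once for all listeners) could be modeled like this in Python. The real change would live in the node's boot steps, so the function names and the simplified key = value parsing below are assumptions, not RabbitMQ's implementation.

```python
import time
from typing import Callable


def parse_startup_delay(conf_text: str) -> int:
    """Return the listeners.startup_delay value in seconds, or 0 if it is not set."""
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if key == "listeners.startup_delay":
            delay = int(value)
            if delay < 0:
                raise ValueError("startup delay must not be negative")
            return delay
    return 0  # default: no delay, so startup time is unaffected


def start_listeners(conf_text: str,
                    start: Callable[[], None],
                    sleep: Callable[[float], None] = time.sleep) -> int:
    """Wait once for the configured delay, then start all listeners."""
    delay = parse_startup_delay(conf_text)
    if delay > 0:
        sleep(delay)  # a single wait, not one per listener
    start()
    return delay
```

The sleep function is injectable so the behavior can be checked without actually waiting. With the default of 0 no sleep happens at all, which is what keeps test suites and tooling unaffected.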
Describe alternatives you've considered
There aren't that many alternatives.
We could consider checking for metadata table/tree presence before starting the listeners, but the risk of false positives will still exist.
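As a rough sketch of what such a presence check might look like (in Python, with hypothetical function and table names, not an existing RabbitMQ API):

```python
from typing import Callable, Iterable, Set


def tables_present(required: Iterable[str], existing: Set[str]) -> bool:
    """True if every required table name appears in the set of existing tables."""
    return all(name in existing for name in required)


def wait_for_tables(required: Iterable[str],
                    list_tables: Callable[[], Set[str]],
                    attempts: int,
                    pause: Callable[[], None]) -> bool:
    """Poll up to `attempts` times; return True once all required tables exist."""
    required = list(required)  # allow repeated iteration over the names
    for _ in range(attempts):
        if tables_present(required, list_tables()):
            return True
        pause()  # wait before the next poll
    return False
```

Note that a check like this only establishes that the tables exist. It says nothing about whether their contents have caught up with peers, so it can still report "ready" too early.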
An opt-in delay is very easy to reason about, as we have seen in other parts of RabbitMQ (peer discovery, the Raft algorithm's randomized delay during leader elections).
Additional context
No response