New health checks for metadata store initialization (backport #13167) #13169

Merged: 3 commits into v4.0.x from mergify/bp/v4.0.x/pr-13167 on Jan 29, 2025

@mergify mergify bot commented Jan 29, 2025

This introduces two new health checks:

  • rabbitmq-diagnostics check_if_metadata_store_is_initialized
  • rabbitmq-diagnostics check_if_metadata_store_is_initialized_with_data

and their HTTP API counterparts:

  • GET {prefix}/health/checks/metadata-store/initialized
  • GET {prefix}/health/checks/metadata-store/initialized/with-data
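
As an illustration, the HTTP API checks could be probed from a monitoring script roughly as follows. This is a minimal sketch, not part of this PR: it assumes the management plugin listens on `http://localhost:15672`, that `{prefix}` is `/api`, that HTTP basic authentication is in use, and that a passing check answers with `200` while a failing one answers with a non-200 status such as `503`, as other RabbitMQ health check endpoints do. The helper names (`check_url`, `is_healthy`, `run_check`) are hypothetical.

```python
import base64
import urllib.error
import urllib.request

# Paths taken from the PR description; everything else here is an assumption.
CHECK_PATHS = {
    "initialized": "/health/checks/metadata-store/initialized",
    "initialized_with_data": "/health/checks/metadata-store/initialized/with-data",
}


def check_url(base_url, check, prefix="/api"):
    """Build the full URL for one of the two metadata store checks.

    `prefix` stands in for `{prefix}`; "/api" is an assumed default.
    """
    return base_url.rstrip("/") + prefix + CHECK_PATHS[check]


def is_healthy(status_code):
    """Treat 200 as a passing check and anything else as failing (assumed)."""
    return status_code == 200


def run_check(base_url, check, username, password, timeout=5.0):
    """Perform the GET request and return True only if the check passed."""
    request = urllib.request.Request(check_url(base_url, check))
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    request.add_header("Authorization", f"Basic {token}")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return is_healthy(response.status)
    except urllib.error.HTTPError as err:
        # Non-2xx responses are raised as HTTPError by urllib.
        return is_healthy(err.code)
    except (urllib.error.URLError, OSError):
        # Node unreachable or timed out: report the check as not passing.
        return False
```

For example, `run_check("http://localhost:15672", "initialized_with_data", user, password)` would return `True` only once the node reports that the metadata store contains data.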

The first one relies on rabbit_db:is_init_completed/0, which is already used by a few
code paths, namely peer discovery, to detect when the metadata store initialization has
completed.

The second check is more opinionated: it assumes that a cluster will always have
at least one virtual host. Technically you can delete the only virtual host in the system
but then the cluster would not be practically useful.

So let's use this clue, the presence of at least one virtual host in the metadata store,
as a good enough indication that the metadata store has synced "just enough" for client
connections to have a chance of succeeding.
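
One way a deployment script might use this signal is to hold back client start-up until the second check passes. The sketch below is illustrative only: it assumes `rabbitmq-diagnostics` is on the `PATH` and exits with a non-zero code when the check fails, and the names `cli_probe` and `wait_until_ready`, as well as the retry counts, are hypothetical.

```python
import subprocess
import time


def cli_probe():
    """Run the CLI check once; True if it exited with code 0 (assumed success)."""
    try:
        result = subprocess.run(
            ["rabbitmq-diagnostics", "check_if_metadata_store_is_initialized_with_data"],
            capture_output=True,
            timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        # CLI tool missing or the command hung: treat as "not ready yet".
        return False
    return result.returncode == 0


def wait_until_ready(probe=cli_probe, attempts=30, delay=2.0, sleep=time.sleep):
    """Poll `probe` until it passes or `attempts` runs out.

    Returns True as soon as one probe succeeds, False otherwise.
    """
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            sleep(delay)  # back off before the next attempt
    return False
```

Because the probe is passed in as a function, the same loop could be pointed at the HTTP API counterpart instead of the CLI command.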

Note that these checks cannot be 100% accurate in the case of Mnesia because some data may
still be in flight, but they should be pretty accurate in the case of Khepri, which is
the future of RabbitMQ.

In any case, we currently do not provide a health check like this, which makes the problem
outlined in #13153 harder to spot and comprehend.

References #13153.


This is an automatic backport of pull request #13167 done by Mergify.

@michaelklishin michaelklishin added this to the 4.0.6 milestone Jan 29, 2025
@michaelklishin michaelklishin merged commit 9409baf into v4.0.x Jan 29, 2025
268 of 270 checks passed
@michaelklishin michaelklishin deleted the mergify/bp/v4.0.x/pr-13167 branch January 29, 2025 04:54