Clickhouse cluster config generation optimization #6909

Closed
andrewjstone opened this issue Oct 21, 2024 · 1 comment
@andrewjstone
Contributor

When the reconfiguration executor runs, it pushes the latest configuration settings for clickhouse-server and clickhouse-keeper to their corresponding admin servers. The admin servers generate the XML configuration files, which the servers and keepers automatically reload.

The latest configuration settings are pushed on every execution, and execution happens periodically (every 30s?). We don't want to rewrite the config file and reload it every time if nothing has changed: doing so burns flash lifetime and leaves the door open to failures both when saving the file and when reloading the config.

Instead, we should cache the configuration settings, most importantly the generation number of the configuration, in a file in the persistent dataset of each clickhouse server and keeper. Then on every execution we can check whether the pushed configuration's generation matches the persisted generation. Only if the pushed configuration from the executor has a newer generation do we overwrite the cached settings and rewrite the XML configuration.

While we could persist only the generation number, we choose to persist all of the pushed settings in the cache file; this is purely for debugging purposes.
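
For illustration, here is a minimal sketch of the check described above. It is hypothetical (not the actual clickhouse-admin code) and assumes the cached settings are serialized as JSON in the persistent dataset:

```rust
use std::fs;
use std::path::Path;

use serde::{Deserialize, Serialize};

#[derive(Serialize, Deserialize)]
struct CachedSettings {
    generation: u64,
    // ... the rest of the pushed clickhouse-server / keeper settings,
    // persisted alongside the generation purely as a debugging aid.
}

/// Decide whether the pushed settings should be written out and the XML
/// regenerated: only when their generation is strictly newer than the
/// one persisted in the cache file.
fn should_rewrite(cache_path: &Path, pushed_generation: u64) -> bool {
    let cached: Option<CachedSettings> = fs::read_to_string(cache_path)
        .ok()
        .and_then(|s| serde_json::from_str(&s).ok());

    match cached {
        Some(cached) => pushed_generation > cached.generation,
        // No readable cache yet: write the config unconditionally.
        None => true,
    }
}
```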

@andrewjstone
Contributor Author

I've spent a long time thinking about this and decided we aren't going to implement it as described. After discussing it with @bcantrill, he agrees.

The problem with only writing the configuration file when the generation changes is that if a keeper membership change fails, the membership change will not be attempted again. The reason is that it's the writing of the keeper configuration that triggers the keeper on the leader node to detect that the proposed configuration differs from the actual configuration and to initiate the raft membership change. We want to be able to do that over and over. Note that the keeper will only attempt this if a reconfiguration is not currently in progress, so it is safe to rewrite the configuration and reload during a membership change.

We could work around this by polling the keeper configuration and re-applying the configuration change whenever the actual raft membership differs from the proposed membership in the config file. But that is brittle and more complicated. The existing solution, without the gating, has been working for some time now, and it seems silly to add more code and possibly make it less robust.

However, there is still one issue we want to prevent: we have N Nexus instances operating concurrently, and one executor may be running behind and trying to submit an older configuration. We don't want that older configuration to overwrite the newer one. Therefore, rather than skipping the rewrite when we are at version X and receive a configuration at version X, we'll only skip the rewrite when we are at version X and the received configuration's version is less than X. This is a different proposal and is actually needed for correctness; it's not just an optimization. I have opened #7137 instead.
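
To make the contrast with the original proposal concrete, here is a tiny sketch of the gating described above (a hypothetical function, not the actual implementation tracked in #7137):

```rust
/// Accept (and rewrite/reload) any pushed configuration whose version is
/// greater than or equal to the one we currently hold; reject only
/// strictly older ones coming from a lagging executor.
fn accept_pushed_config(current: u64, pushed: u64) -> bool {
    pushed >= current
}

fn main() {
    // Same version: still rewritten, so a failed keeper membership change
    // can be retried on the next execution.
    assert!(accept_pushed_config(3, 3));
    // Older version from a Nexus running behind: rejected.
    assert!(!accept_pushed_config(3, 2));
    // Newer version: accepted as usual.
    assert!(accept_pushed_config(3, 4));
}
```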
