Clickhouse cluster config generation optimization #6909

Closed
andrewjstone opened this issue Oct 21, 2024 · 1 comment
@andrewjstone
Contributor

When the reconfiguration executor runs, it pushes the latest configuration settings for clickhouse-server and clickhouse-keeper to their corresponding admin servers. The admin servers generate the XML configuration files, which the servers and keepers automatically reload.

The latest configuration settings are pushed on every execution, and execution happens periodically (every 30s?). We don't want to rewrite the config file and reload it every time if nothing has changed: doing so burns flash lifetime and leaves the door open to failures both when saving the file and when reloading the config.

Instead, we should cache the configuration settings, most importantly the generation number of the configuration, in a file in the persistent dataset of each clickhouse server and keeper. Then on every execution we can check whether the pushed configuration's generation matches the persisted generation. Only if the pushed configuration from the executor has a newer generation do we overwrite the cached settings and rewrite the XML configuration.

While we could persist only the generation number, we choose to persist all of the pushed settings in the cache file; this is purely for debugging purposes.
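
For illustration, here is a minimal sketch of the check described above. It is hypothetical (not the actual clickhouse-admin code) and assumes the cached settings are serialized as JSON in the persistent dataset:

```rust
use std::fs;
use std::path::Path;

use serde::{Deserialize, Serialize};

#[derive(Serialize, Deserialize)]
struct CachedSettings {
    generation: u64,
    // ... the rest of the pushed clickhouse-server / keeper settings,
    // persisted alongside the generation purely as a debugging aid.
}

/// Decide whether the pushed settings should be written out and the XML
/// regenerated: only when their generation is strictly newer than the
/// one persisted in the cache file.
fn should_rewrite(cache_path: &Path, pushed_generation: u64) -> bool {
    let cached: Option<CachedSettings> = fs::read_to_string(cache_path)
        .ok()
        .and_then(|s| serde_json::from_str(&s).ok());

    match cached {
        Some(cached) => pushed_generation > cached.generation,
        // No readable cache yet: write the config unconditionally.
        None => true,
    }
}
```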

@andrewjstone
Contributor Author

I've spent a long time thinking about this and decided we aren't going to implement it as described. After discussing it with @bcantrill, he agrees.

The problem with only writing the configuration file when the generation changes is that if a keeper membership change fails, the membership change will not be attempted again. The reason is that it's the writing of the keeper configuration that triggers the keeper on the leader node to detect that the proposed configuration differs from the actual configuration and to initiate the raft membership change. We want to be able to do that over and over. Note that the keeper will only attempt this if a reconfiguration is not currently in progress, so it is safe to rewrite the configuration and reload during a membership change.

We could work around this by polling the keeper configuration and re-applying the configuration change whenever the actual raft membership differs from the proposed membership in the config file. But that is brittle and more complicated. The existing solution, without the gating, has been working for some time now, and it seems silly to add more code and possibly make it less robust.

However, there is still one issue we want to prevent: we have N Nexus instances operating concurrently, and one executor may be running behind and trying to submit an older configuration. We don't want that older configuration to overwrite the newer one. Therefore, rather than skipping the rewrite when we are at version X and receive a configuration at version X, we'll only skip the rewrite when we are at version X and the received configuration's version is less than X. This is a different proposal and is actually needed for correctness; it's not just an optimization. I have opened #7137 instead.
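
To make the contrast with the original proposal concrete, here is a tiny sketch of the gating described above (a hypothetical function, not the actual implementation tracked in #7137):

```rust
/// Accept (and rewrite/reload) any pushed configuration whose version is
/// greater than or equal to the one we currently hold; reject only
/// strictly older ones coming from a lagging executor.
fn accept_pushed_config(current: u64, pushed: u64) -> bool {
    pushed >= current
}

fn main() {
    // Same version: still rewritten, so a failed keeper membership change
    // can be retried on the next execution.
    assert!(accept_pushed_config(3, 3));
    // Older version from a Nexus running behind: rejected.
    assert!(!accept_pushed_config(3, 2));
    // Newer version: accepted as usual.
    assert!(accept_pushed_config(3, 4));
}
```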
