Manager controller recreates clusters when manager cluster ID is missing from status #1902
Labels
kind/bug
Categorizes issue or PR as related to a bug.
priority/important-soon
Must be staffed and worked on either currently, or very soon, ideally in time for the next release.
triage/accepted
Indicates an issue or PR is ready to be actively worked on.
What happened?
Currently, the manager cluster ID is saved in ScyllaCluster's status on cluster creation. If the controller fails to update ScyllaCluster's status, if the ID is otherwise lost, or if an older generation of the object is reconciled, the controller deletes the existing cluster from the manager state and creates it again.
The issue and its root cause are similar to #1752.
This not only adds superfluous work, but may also introduce incorrect behavior, e.g. around task retention.
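To make the failure mode concrete, here is a minimal sketch of the behavior described above. It is not the operator's code: `managerState`, `create`, and `syncByStatusID` are hypothetical names, the manager is replaced by an in-memory map, and matching the existing cluster by name is an assumption made for illustration.

```go
package main

import "fmt"

// managerState is a hypothetical in-memory stand-in for the manager's
// cluster registry. It is not the real Scylla Manager client.
type managerState struct {
	nextID   int
	clusters map[string]string // manager cluster ID -> cluster name
}

// create registers a cluster and returns a freshly generated ID.
func (m *managerState) create(name string) string {
	m.nextID++
	id := fmt.Sprintf("id-%d", m.nextID)
	m.clusters[id] = name
	return id
}

// syncByStatusID mimics the problematic flow: the ID stored in status is the
// only link to manager state, so when it is missing the cluster is deleted
// and created again. Matching by name is an assumption of this sketch.
func syncByStatusID(m *managerState, name, statusID string) string {
	if statusID != "" {
		if _, ok := m.clusters[statusID]; ok {
			return statusID // ID is known and still valid: nothing to do
		}
	}
	// ID missing or unknown: drop any same-named cluster and register anew.
	for id, n := range m.clusters {
		if n == name {
			delete(m.clusters, id)
		}
	}
	return m.create(name)
}

func main() {
	m := &managerState{clusters: map[string]string{}}
	first := syncByStatusID(m, "default/scylla", "")
	// Simulate a failed status update: the ID never reached the status,
	// so the next reconciliation starts without it.
	second := syncByStatusID(m, "default/scylla", "")
	fmt.Println("recreated:", first != second, "clusters:", len(m.clusters))
}
```

In this sketch the second call returns a different ID than the first, so anything the manager had attached to the old ID would be gone.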
/priority important-soon
/assign
What did you expect to happen?
Clusters in manager state should not be deleted once they've been created successfully.
How can we reproduce it (as minimally and precisely as possible)?
n/a
Scylla Operator version
master
Kubernetes platform name and version
n/a
Please attach the must-gather archive.
n/a
Anything else we need to know?
Unfortunately, we currently have no reliable way of telling whether a cluster existing in manager state corresponds to a Kubernetes object if we don't have its ID.
This should be easy to fix with scylladb/scylla-manager#3219, since we'll be able to save metadata in manager state, and so we'll be able to "reclaim" the cluster despite not having its ID.
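A rough sketch of what such "reclaiming" could look like, assuming the saved metadata takes the form of key/value labels on the manager cluster. The `managerCluster` type, the `ownerUIDLabel` key, and the `findOwned` and `resolveID` helpers are all hypothetical; none of them come from the operator or from Scylla Manager.

```go
package main

import "fmt"

// managerCluster is a hypothetical, simplified view of a cluster in manager
// state. The Labels field stands in for whatever metadata the manager stores.
type managerCluster struct {
	ID     string
	Name   string
	Labels map[string]string
}

// ownerUIDLabel is a made-up label key holding the UID of the owning
// Kubernetes object.
const ownerUIDLabel = "example.invalid/owner-uid"

// findOwned returns the cluster whose owner label matches ownerUID, if any.
func findOwned(clusters []managerCluster, ownerUID string) (managerCluster, bool) {
	if ownerUID == "" {
		return managerCluster{}, false // never match unlabeled clusters
	}
	for _, c := range clusters {
		if c.Labels[ownerUIDLabel] == ownerUID {
			return c, true
		}
	}
	return managerCluster{}, false
}

// resolveID decides which manager cluster ID to use. It prefers the ID from
// status, falls back to reclaiming by label, and only asks for a new cluster
// when neither is available.
func resolveID(clusters []managerCluster, statusID, ownerUID string) (id string, needsCreate bool) {
	if statusID != "" {
		return statusID, false
	}
	if c, ok := findOwned(clusters, ownerUID); ok {
		return c.ID, false // reclaimed: no delete and recreate needed
	}
	return "", true
}

func main() {
	state := []managerCluster{
		{ID: "id-1", Name: "default/scylla", Labels: map[string]string{ownerUIDLabel: "uid-a"}},
	}
	// The status lost its ID, but the label still identifies the cluster.
	id, needsCreate := resolveID(state, "", "uid-a")
	fmt.Println(id, needsCreate)
}
```

With a lookup like this, a missing ID in status no longer forces deletion: the controller would write the reclaimed ID back to status and continue.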