[postgresql-ha] delete pods in ha break cluster #2805
Comments
Hi @borgez, when you mention this:
Do you mean running a command such as "helm upgrade"? Note that if you do that, the secrets are regenerated, and if you had a persistent volume, there will be a mismatch between the regenerated secrets and the credentials already stored in the existing data, meaning that authentication fails. To fix that, you have a couple of options:
IMPORTANT: These options need to be specified both when creating (helm install) and when upgrading (helm upgrade) the chart. If you don't have the values from the previous deployment, either delete the existing PVC and lose any previous data (the simpler option) or reset the password to match the existing secret (the harder option). Finally, re-create the deployment specifying the secrets/password. |
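To confirm whether an upgrade actually regenerated the passwords, one option is to compare a hash of the secret value before and after the upgrade. The sketch below is only an illustration: the helper name `secret_fingerprint` is hypothetical, it assumes `kubectl` access to the namespace and GNU `sha256sum` on the workstation, and the secret and key names in the usage comments are the ones shown in the example later in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the chart): print a SHA-256 fingerprint of
# one key of a Kubernetes secret, so two values can be compared without
# printing the password itself. It hashes the base64-encoded value as stored.
secret_fingerprint() {
  local namespace="$1" secret="$2" key="$3"
  kubectl get secret --namespace "$namespace" "$secret" \
    -o jsonpath="{.data.${key}}" | sha256sum | cut -d' ' -f1
}

# Usage (commented out so the snippet has no side effects when sourced):
#   before=$(secret_fingerprint default tauro-postgresql-ha-postgresql repmgr-password)
#   helm upgrade ...
#   after=$(secret_fingerprint default tauro-postgresql-ha-postgresql repmgr-password)
#   [ "$before" = "$after" ] || echo "repmgr-password changed during the upgrade"
```

A missing secret also produces a fingerprint (the hash of empty input), so it is worth checking that the secret exists before relying on the comparison.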
I tried to break the cluster in different ways to check its fault tolerance.
It breaks the cluster.
I did not find this in the docs.
That fixes it for me, yes. But this is not the expected behavior. We have secrets, and I expect that nothing will break on an upgrade and that the existing secrets will be reused by default. |
Or maybe add a Helm parameter such as global.rollPasswords=true, but not enabled by default. |
And I think that if the chart forces a password roll, it needs to notify the payload about it and change the passwords in the payload (postgresql, repmgr, etc.). As it stands, this breaks the Kubernetes philosophy. |
Hi @borgez By default, if no password is provided by the end user, the chart will generate and use a random one instead. When performing an upgrade (and not specifying a password) this behaviour is preserved. By providing the same credentials (or an |
Hi @joancafom Yes, if the secrets already exist, the random generation stage should be skipped, but it is not... |
I recorded a screencast: https://nimb.ws/MkaKpZ |
Hi @borgez I will try to illustrate my point with a running example. Let's say I want to deploy a new release of my chart using helm:
$ helm install tauro bitnami/postgresql-ha
NAME: tauro
LAST DEPLOYED: Tue Jun 16 09:43:02 2020
NAMESPACE: default
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
** Please be patient while the chart is being deployed **
...
To get the password for "postgres" run:
export POSTGRES_PASSWORD=$(kubectl get secret --namespace default tauro-postgresql-ha-postgresql -o jsonpath="{.data.postgresql-password}" | base64 --decode)
To get the password for "repmgr" run:
export REPMGR_PASSWORD=$(kubectl get secret --namespace default tauro-postgresql-ha-postgresql -o jsonpath="{.data.repmgr-password}" | base64 --decode)
...
Once I give the pods a reasonable amount of time to come up, I can see them all running:
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
tauro-postgresql-ha-pgpool-5c48ff85f8-hkdwd 1/1 Running 0 3m52s
tauro-postgresql-ha-postgresql-0 1/1 Running 0 3m52s
tauro-postgresql-ha-postgresql-1 1/1 Running 0 3m44s
This release also came with some secrets that were populated with random passwords, as I did not specify any at installation time:
$ kubectl get secrets
NAME TYPE DATA AGE
tauro-postgresql-ha-pgpool Opaque 1 14m
tauro-postgresql-ha-postgresql Opaque 2 14m
Now imagine that I want to increase the number of replicas by one. In order to preserve the same passwords and prevent the chart from failing on the upgrade, I must provide the old passwords. We can create two variables containing them, using the instructions printed by the first command:
$ export POSTGRES_PASSWORD=$(kubectl get secret --namespace default tauro-postgresql-ha-postgresql -o jsonpath="{.data.postgresql-password}" | base64 --decode)
$ export REPMGR_PASSWORD=$(kubectl get secret --namespace default tauro-postgresql-ha-postgresql -o jsonpath="{.data.repmgr-password}" | base64 --decode)
I can safely perform the upgrade now:
$ helm upgrade tauro --set postgresql.password="$POSTGRES_PASSWORD" \
    --set postgresql.repmgrPassword="$REPMGR_PASSWORD" \
    --set postgresql.replicaCount=3 \
    bitnami/postgresql-ha
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
tauro-postgresql-ha-pgpool-64b79f5659-l869k 1/1 Running 0 94s
tauro-postgresql-ha-postgresql-0 1/1 Running 1 32s
tauro-postgresql-ha-postgresql-1 1/1 Running 0 62s
tauro-postgresql-ha-postgresql-2 1/1 Running 2 94s
If the passwords are not provided at the upgrade stage, the chart creates a new set of randomly generated credentials, leading to a failure. We are currently considering how to improve this behaviour. I hope this running example helps, thanks! |
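The manual steps above can be wrapped in a small script so the current passwords are always read back from the secret before an upgrade. This is a minimal sketch, not something shipped with the chart: the function names are hypothetical, and the secret name pattern `<release>-postgresql-ha-postgresql` is an assumption based on the `tauro` release shown above, so it may differ for other release names.

```shell
#!/usr/bin/env bash
# Hypothetical helpers: reuse the passwords stored in the release secret.

# Print the decoded value of one key of a Kubernetes secret.
get_secret_value() {
  local namespace="$1" secret="$2" key="$3"
  kubectl get secret --namespace "$namespace" "$secret" \
    -o jsonpath="{.data.${key}}" | base64 --decode
}

# Upgrade a postgresql-ha release, passing the existing passwords back in.
# Any extra arguments are forwarded to "helm upgrade" unchanged.
upgrade_keeping_passwords() {
  local release="$1" namespace="$2"
  shift 2
  # Assumption: secret name follows the pattern seen for the "tauro" release.
  local secret="${release}-postgresql-ha-postgresql"
  local pg_pass repmgr_pass
  pg_pass="$(get_secret_value "$namespace" "$secret" postgresql-password)"
  repmgr_pass="$(get_secret_value "$namespace" "$secret" repmgr-password)"
  # Refuse to upgrade if either password could not be read.
  if [ -z "$pg_pass" ] || [ -z "$repmgr_pass" ]; then
    echo "could not read existing passwords from secret ${secret}" >&2
    return 1
  fi
  helm upgrade "$release" bitnami/postgresql-ha --namespace "$namespace" \
    --set-string "postgresql.password=${pg_pass}" \
    --set-string "postgresql.repmgrPassword=${repmgr_pass}" \
    "$@"
}

# Usage (commented out so the snippet has no side effects when sourced):
#   upgrade_keeping_passwords tauro default --set postgresql.replicaCount=3
```

`--set-string` is used instead of `--set` so that a purely numeric password is not interpreted as a number. The empty-value check keeps the script from silently falling back to newly generated credentials.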
This Issue has been automatically marked as "stale" because it has not had recent activity (for 15 days). It will be closed if no further activity occurs. Thanks for the feedback. |
@joancafom Hi, do you have a plan to fix it? |
Hi @borgez We are evaluating different approaches to remove the need to provide this additional information when performing an upgrade, but I am afraid I can't give you an ETA. In the meantime, Thanks again! |
Hi again, We are currently making improvements to prevent this issue from happening. You can check this PR #3150 for further details 😄, but we plan to log some errors when some required fields were not provided in the upgrade. Thanks for your issue! |
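Until such a check is available in the chart, a similar guard can be approximated on the client side. The following is a hypothetical wrapper and not the implementation from the PR: the function name is made up, and it only verifies that both password values appear somewhere in the arguments intended for `helm upgrade`.

```shell
#!/usr/bin/env bash
# Hypothetical client-side guard (not the chart's own validation): return
# non-zero unless both password values are present in the given arguments.
require_password_values() {
  local args=" $* " key missing=0
  for key in postgresql.password postgresql.repmgrPassword; do
    case "$args" in
      *"${key}="*) ;;  # value supplied, nothing to report
      *) echo "missing required value: ${key}" >&2; missing=1 ;;
    esac
  done
  return "$missing"
}

# Usage (commented out so the snippet has no side effects when sourced):
#   args=(--set postgresql.password="$POSTGRES_PASSWORD" \
#         --set postgresql.repmgrPassword="$REPMGR_PASSWORD")
#   require_password_values "${args[@]}" && \
#     helm upgrade tauro bitnami/postgresql-ha "${args[@]}"
```

The check is deliberately simple: it does not look inside values files passed with `-f`, so passwords supplied that way would be reported as missing.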
@joancafom Great, thank you! |
I just ran into this problem too. I've manually set |
Hi , |
Thank you ❤️️ I'll test on our next blue/green cluster swap, but might not be for a while :/ |
I will leave this as |
Unfortunately, this issue was created a long time ago, and although there is an internal task to fix it, it was not prioritized as something to address in the short or mid term. The reason is not technical but related to capacity, since we're a small team. That being said, contributions via PRs are more than welcome in both repositories (containers and charts), in case you would like to contribute. Since this issue was opened, there have been several releases of this asset, and it's possible the issue has been resolved as part of other changes. If that's not the case and you are still experiencing this issue, please feel free to reopen it and we will re-evaluate it. |
Which chart:
postgresql-ha:3.2.9
Describe the bug
postgresql-0, postgresql-1 and postgresql-2 cannot start again because the repmgr passwords change when the pods are recreated
To Reproduce
Expected behavior