[Bug] Automatic database migration from 0.23.0 to 0.24.0 does not work with postgres #2351
Comments
This is unfortunate. I will not have time to look into this until next week, but in the meantime, if someone has a Postgres backup that can reproduce this issue, I would appreciate getting it. As often mentioned, we don't have the time to do extensive testing for Postgres, as we have so many other things to fix. A personal rant; |
I do have pg_dumpall backups of the entire postgres server on that machine as part of the daily system backups (useful for restoring after a storage failure, less so for restoring individual tables in the heat of the moment). I could get you a copy of the headscale DB from my server from the night before I attempted the upgrade, though I'll need to find some free time next week to spin up a test instance of postgres which I can load the backup into first in order to extract the headscale parts (and censor things like IP addresses). What would be the best way to send you the DB dump? |
Great, email in my GitHub would be sufficient |
Fixes juanfont#2351 Signed-off-by: Kristoffer Dalby <[email protected]>
* fix postgres migration issue with 0.24 Fixes juanfont#2351 Signed-off-by: Kristoffer Dalby <[email protected]> * add postgres migration test for 2351 Signed-off-by: Kristoffer Dalby <[email protected]> * update changelog Signed-off-by: Kristoffer Dalby <[email protected]> --------- Signed-off-by: Kristoffer Dalby <[email protected]>
Given you've found a fix already -- do you still need a copy of my headscale database out of my backups? |
No, thank you, I got one from another user and wrote a test based on that. |
Hi @kradalby Just a heads up, somehow the migration in the fix PR has no effect and produces the same error. Upgrading from 0.22.3 -> 0.24.2 with Postgres. If you want to investigate further, I could send you my SQL dump from v0.22.3. |
Yes please, it worked from 0.23, but maybe the step from 0.22 was different, 0.23 is the time we introduced migrations so it would not surprise me. Email is in my profile |
@kradalby |
@kradalby I've sent my Postgres DB dump to your email. Let me know if you need other info. |
I've (finally) upgraded from 0.21.0 by starting up every release along the way in case it was needed, but haven't been able to get releases past 0.24.2 working.

On 0.23.0 I got this on first startup after migrations:

2025-02-25T21:50:20Z FTL home/runner/work/headscale/headscale/cmd/headscale/cli/serve.go:29 > Headscale ran into an error and had to shut down. error="failed to load ACL policy: loading nodes from database to validate policy: ERROR: cached plan must not change result type (SQLSTATE 0A000)"

then it worked next time I started the service:

From 0.24.0 onward I get this:

2025-02-25T21:51:43Z FTL Migration failed: ERROR: constraint "uni_users_name" of relation "users" does not exist (SQLSTATE 42704) error="ERROR: constraint \"uni_users_name\" of relation \"users\" does not exist (SQLSTATE 42704)"

If I manually add the constraint via psql, I can run 0.24.0 through 0.24.2:

alter table users add constraint uni_users_name unique (name);

But if I try to upgrade to 0.24.3 or newer, I get this error on first startup:

2025-02-25T22:11:47Z FTL Migration failed: automigrating types.Node: ERROR: constraint "uni_users_name" of relation "users" does not exist (SQLSTATE 42704) error="automigrating types.Node: ERROR: constraint \"uni_users_name\" of relation \"users\" does not exist (SQLSTATE 42704)"

and it removes the |
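For anyone hitting the same SQLSTATE 42704 error, it may help to check what the uniqueness constraint on users is actually called before adding or renaming anything. The queries below are only a sketch, not something taken from the headscale documentation. They assume the users table resolves through the current search_path, and they use the standard PostgreSQL catalogs pg_constraint and pg_indexes.

```sql
-- List unique constraints on the users table together with their definitions.
-- Assumption: "users" is reachable via the current search_path.
SELECT conname, pg_get_constraintdef(oid) AS definition
FROM pg_constraint
WHERE conrelid = 'users'::regclass
  AND contype = 'u';

-- Uniqueness may also exist only as a unique index without a named
-- constraint, so list the indexes on the table as well.
SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'users';
```

If the first query returns a name other than uni_users_name (for example users_name_key, as reported in the issue description), that mismatch would be consistent with the migration error shown above.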
I haven’t had a close look, but starting every release would be counterproductive. Since there are migration fixes in the fix releases, you should jump to the latest release or at least the latest fix release. We have not had a database that old in a bit, so if going from 0.21 straight to 0.24.3 or 0.25.1 doesn’t work I would appreciate a scrubbed copy I can write a test against. |
For me, simply spamming the Alter table in Postgres manually to add the |
Is this a support request?
Is there an existing issue for this?
Current Behavior
I tried to update my headscale instance, which uses postgres as the database, from 0.23.0 to 0.24.0 using the Debian packages provided as part of the releases on GitHub. However, after installing the new package, headscale failed to start due to problems with the database migration, with the following message in the logs:
I was also left unable to blindly downgrade to 0.23.0, as the new version had already executed a migration successfully before encountering the error, leaving my database in an inconsistent state that would not have been supported by the old version.
Expected Behavior
Headscale executes the database migration without errors, and then proceeds to function normally.
Steps To Reproduce
Environment
Runtime environment
Anything else?
I manually inspected the migrations table in the database and compared this with the code in 0.24.0, which indicates that this was a problem with migration 202407191627. This is an automatic migration executed by gorm to update the schema of the users table -- I'm not sure if there have been changes in gorm, but I'm not familiar with the library so I didn't investigate that further.

As I mentioned above, my database was left in an inconsistent state which prevented me from downgrading (in fairness I should have backed up my database before performing the upgrade...). However, my database did have a uniqueness constraint for users.name as implied by the log message above, but in my database the constraint was called users_name_key instead of uni_users_name.

I used the following SQL to rename the constraint on the users table, and with this change the migration which had been causing me problems then executed correctly:

My headscale instance now appears to be working correctly, though I haven't tried adding new users or nodes to the Tailnet yet.
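The SQL used for the rename did not survive in this copy of the issue. As a hedged sketch of the kind of statement described, and not necessarily the exact one that was run, a rename from users_name_key to uni_users_name could look like the block below. It assumes headscale is stopped, a backup has been taken, and the users table resolves through the current search_path. The guard around the rename is an addition for safety, so that running the block twice does nothing the second time.

```sql
BEGIN;

-- Rename the constraint only when the old name exists and the new name
-- does not, so that re-running this block is harmless.
DO $$
BEGIN
    IF EXISTS (
        SELECT 1 FROM pg_constraint
        WHERE conrelid = 'users'::regclass
          AND conname = 'users_name_key'
    ) AND NOT EXISTS (
        SELECT 1 FROM pg_constraint
        WHERE conrelid = 'users'::regclass
          AND conname = 'uni_users_name'
    ) THEN
        ALTER TABLE users RENAME CONSTRAINT users_name_key TO uni_users_name;
    END IF;
END
$$;

COMMIT;
```

Renaming keeps the existing unique index in place, whereas adding a second constraint with the expected name would leave two unique constraints on users.name.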