This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Document how to back up a synapse server #2046

Open
richvdh opened this issue Mar 22, 2017 · 20 comments
Labels: A-Docs (things relating to the documentation), T-Task (refactoring, removal, replacement, enabling or disabling functionality, other engineering tasks), Z-Help-Wanted (we know exactly how to fix this issue, and would be grateful for any contribution)

@richvdh
Member

richvdh commented Mar 22, 2017

We should give users some guidance on what they need to do to effectively back up and restore a Synapse server.

Off the top of my head:

  • database
  • media repo
  • homeserver.yaml
  • log config, where present
  • signing key (maybe, but it's fine just to use a new one)
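
A minimal sketch of what backing those items up might look like, assuming a Postgres database and using hypothetical paths (locations vary between installations, and pg_dump will also need credentials, e.g. via ~/.pgpass or PGPASSWORD):

```python
#!/usr/bin/env python3
"""Rough sketch of a full Synapse backup: database dump plus config,
signing key and media store. Paths are hypothetical and depend on how
Synapse was installed."""
import subprocess
import tarfile
from datetime import datetime
from pathlib import Path

# Hypothetical locations; adjust to your installation.
CONFIG_ITEMS = [
    "/etc/matrix-synapse/homeserver.yaml",
    "/etc/matrix-synapse/log.yaml",
    "/etc/matrix-synapse/example.com.signing.key",
]
MEDIA_STORE = "/var/lib/matrix-synapse/media"
BACKUP_DIR = Path("/var/backups/synapse")

def main() -> None:
    BACKUP_DIR.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")

    # 1. Dump the database. pg_dump produces a consistent snapshot even
    #    while Synapse keeps writing, so Synapse can stay up.
    with (BACKUP_DIR / f"synapse-{stamp}.sql").open("wb") as out:
        subprocess.run(["pg_dump", "-U", "synapse_user", "synapse"],
                       stdout=out, check=True)

    # 2. Archive config, signing key and the media store.
    with tarfile.open(BACKUP_DIR / f"synapse-files-{stamp}.tar.gz", "w:gz") as tar:
        for item in CONFIG_ITEMS + [MEDIA_STORE]:
            tar.add(item)

if __name__ == "__main__":
    main()
```
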
@seanenck
Contributor

Definitely interested in this; we're currently doing the things you mention (well, a little more 'verbose' in that I'm pulling all of /etc/synapse/*).

@nordurljosahvida

Absolutely agree. Also interesting is Discourse's self-backup function, which simply asks for your S3 credentials and does everything by itself. That would be perfect. Thanks for the great work.

@neilisfragile added the A-Docs label on Mar 20, 2018
@ghost

ghost commented Apr 4, 2020

@richvdh any updates?

@richvdh
Member Author

richvdh commented Apr 6, 2020

PRs welcome...

@kpfleming

I'm about to do this: moving a Synapse installation from a FreeBSD jail to a Linux container (same CPU architecture, so the data should be compatible). The app configuration and logging configuration are already managed by Ansible, so that part is easy, as are the NGINX proxy in front of it and the TLS configuration.

That leaves the database, media repository, and any keys for the server itself. Has anyone done this?

@DamianoP

DamianoP commented Jul 8, 2020

This is a very interesting question...

@krystiancha

Has anyone done this?

Hey guys, I just moved my synapse instance and everything seems to work including message history and images uploaded in the past.

I transferred:

  • config: /etc/synapse/homeserver.yaml and /etc/synapse/log_config.yaml
  • database: /var/lib/postgresql
  • signing key: /etc/synapse/foo.bar.signing.key
  • media repo: /var/lib/synapse
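
For the record, transferring /var/lib/postgresql as plain files like this only gives a consistent copy if Postgres is stopped while the files are read. A rough sketch of such a cold copy, using the paths from the list above; the systemd unit names are assumptions:

```python
#!/usr/bin/env python3
"""Sketch of a cold copy of the paths listed above. Copying the raw Postgres
data directory (/var/lib/postgresql) is only consistent if Postgres is stopped
first; the systemd unit names here are assumptions. Run as root."""
import subprocess
import tarfile

PATHS = [
    "/etc/synapse/homeserver.yaml",
    "/etc/synapse/log_config.yaml",
    "/etc/synapse/foo.bar.signing.key",
    "/var/lib/postgresql",
    "/var/lib/synapse",
]

# Stop Synapse and Postgres so nothing changes underneath the copy.
subprocess.run(["systemctl", "stop", "synapse", "postgresql"], check=True)
try:
    with tarfile.open("/var/backups/synapse-cold-copy.tar.gz", "w:gz") as tar:
        for path in PATHS:
            tar.add(path)
finally:
    # Bring everything back up even if the archive step failed.
    subprocess.run(["systemctl", "start", "postgresql", "synapse"], check=True)
```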

@reivilibre self-assigned this on Aug 2, 2021
@reivilibre added the T-Task label on Aug 3, 2021
@nicolamori

Hi, is there any progress on this? I'm setting up Synapse + Postgres with docker-compose, and I'm not sure how to create self-consistent, live, automated backups. My understanding is that, to obtain consistent backups, the Synapse server should be put into read-only mode or stopped while the backup is taken, so that no files change while it is in progress. Is this correct? If so, how can that be done in a docker-compose-based setup? I cannot run backup scripts on the host machine and must do everything from within the container.
Sorry if this is a dumb question, but I'm a newcomer and I can't find any clear guidance or examples for this.

@reivilibre
Contributor

That's not correct, actually: you don't need to turn off Synapse to make a consistent backup.

If you use pg_dump, you'll note that its manual (https://www.postgresql.org/docs/12/app-pgdump.html) says:

pg_dump is a utility for backing up a PostgreSQL database. It makes consistent backups even if the database is being used concurrently. pg_dump does not block other users accessing the database (readers or writers).

Basically it runs the entire backup in a single transaction, so Postgres gives it a consistent view the entire time.
(However I do recommend restoring the backups offline and into a fresh, empty database with the correct locale settings. Be very careful not to restore into a database that already has tables present as this has led to issues in the past.)
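
To make the "fresh, empty database" advice concrete, a restore might look roughly like the sketch below. The database name, user and dump path are placeholders; the UTF8 encoding and C locale are, as far as I know, what Synapse's Postgres setup documentation asks for:

```python
#!/usr/bin/env python3
"""Sketch of restoring a plain-format pg_dump into a brand-new database.
Names and paths are placeholders; run this offline, and never against a
database that already contains tables."""
import subprocess

DB_NAME = "synapse"
DB_USER = "synapse_user"
DUMP_FILE = "/var/backups/synapse/synapse-dump.sql"

# Create an empty database with the settings Synapse expects:
# UTF8 encoding and the C locale, created from template0.
subprocess.run(
    ["createdb", "--encoding=UTF8", "--locale=C", "--template=template0",
     f"--owner={DB_USER}", DB_NAME],
    check=True,
)

# Replay the SQL dump into the fresh database.
with open(DUMP_FILE, "rb") as dump:
    subprocess.run(["psql", "-U", DB_USER, DB_NAME], stdin=dump, check=True)
```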

If you're operating at a large scale, then making SQL dumps of your database is probably inefficient and too slow to restore, so you would probably be considering replication for your Postgres server (including having a hot standby). I can't really advise there myself as I'm not a database expert :-).

I cannot run backup scripts on the host machine and must do everything from within the container.

Curious; why not?

At some level you're going to need to be able to pg_dump your database and make some copies of your media store (and then probably put those backups somewhere so that you're not going to get messed up by a disk failure).
I don't run databases in Docker so I'm not really sure, but I imagine the Docker way here is to have a container whose job is to run pg_dump and save the output somewhere.
Maybe someone can chime in with how they do this in their docker-compose setup? Or perhaps you can find some example online; backing up a Postgres database is not Synapse-specific.
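
As one possible shape for that "container whose job is to run pg_dump": a small script run in a sidecar container that dumps the database over the compose network and writes dated dumps to a mounted volume. The service name, credentials and paths below are assumptions, not anything Synapse-specific:

```python
#!/usr/bin/env python3
"""Sketch of a backup sidecar: connects to a 'postgres' service defined in
docker-compose and writes dated dumps to a bind-mounted volume. The service
name, credentials and paths are assumptions."""
import os
import subprocess
from datetime import datetime
from pathlib import Path

BACKUP_DIR = Path("/backups")          # bind-mounted volume (assumption)
BACKUP_DIR.mkdir(parents=True, exist_ok=True)

# Pass the database password to pg_dump via the standard PGPASSWORD variable.
env = dict(os.environ, PGPASSWORD=os.environ["POSTGRES_PASSWORD"])
dump_path = BACKUP_DIR / f"synapse-{datetime.now():%Y%m%d-%H%M%S}.sql"

with dump_path.open("wb") as out:
    subprocess.run(
        ["pg_dump", "-h", "postgres", "-U", "synapse_user", "synapse"],
        stdout=out, env=env, check=True,
    )
```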

@nicolamori

@reivilibre thanks for the quick and very detailed answer. Let me add some points and clarify some others:

  • I know about pg_dump, but what about e.g. the media files? Suppose that for a full backup I first run pg_dump and then make a tarball of the media folder, and a user uploads an image from their client in the middle. The backup will then contain the media file but not the database entry (I assume that uploaded images are registered in the DB). Would restoring from this backup lead to an inconsistent Synapse state?
  • Thanks for the tip about the possible DB dump inefficiency; I'll start simple and revisit the procedure should it become too slow at my scale.
  • Strictly speaking it's not that I cannot run scripts on the host VM; it's just that I prefer a fully-dockerized deployment so I can quickly migrate it to another "lightly configured" VM (i.e. one with just Docker installed and no Matrix-specific configuration).
  • Currently I run a modified postgres image with a cron job executing regular pg_dump runs and uploading the dumps to S3 via rclone (sketched below). All of this happens inside the postgres container, so there's no need to touch the host.
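
For illustration, the kind of script such a cron job might run (the rclone remote name and bucket are made up and would need to exist in rclone.conf):

```python
#!/usr/bin/env python3
"""Sketch of a dump-and-upload script for a cron job inside the postgres
container: dump the database, push the dump to S3 with rclone, then clean up.
The rclone remote name ('s3remote') and bucket path are assumptions."""
import subprocess
from datetime import datetime
from pathlib import Path

dump_path = Path(f"/tmp/synapse-{datetime.now():%Y%m%d-%H%M%S}.sql")

# Dump the database to a local file.
with dump_path.open("wb") as out:
    subprocess.run(["pg_dump", "-U", "synapse_user", "synapse"],
                   stdout=out, check=True)

# Upload the dump, then remove the local copy.
subprocess.run(["rclone", "copy", str(dump_path), "s3remote:my-synapse-backups"],
               check=True)
dump_path.unlink()
```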

@Iruwen

Iruwen commented Jun 15, 2022

My two cents: if you want to avoid most inconsistencies while keeping the server running, you probably need to configure replication and/or snapshots for both the volume holding the media and the Postgres database. I.e. take a filesystem snapshot (supported by e.g. btrfs), back that up, then keep or throw away the snapshot; set up replication for Postgres and take a backup of the replica at the exact same time (with replication stopped, obviously, or just stop the replica and take a filesystem snapshot). Different cloud platforms have different ways to aid with the process (e.g. Amazon Fargate, RDS).

One could also think about using something like https://github.com/matrix-org/synapse-s3-storage-provider with S3 or something compatible, e.g. a min.io cluster, to achieve maximum data availability and integrity. There's a plethora of ways to solve the problem to different degrees, all of which are out of scope for Synapse itself. Even if it all works, there's still a probability that some buffered/cached data hasn't been written or replicated yet. The question when it comes to backups is "what is good enough". Ideally you avoid ever needing a backup to begin with, which would require HA capabilities, which Synapse doesn't have (yet).
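
If you go the snapshot route described above, the general shape (using btrfs as an example, with made-up paths) is: take a read-only snapshot, archive it, then delete it. A sketch:

```python
#!/usr/bin/env python3
"""Sketch of the snapshot-then-archive approach for the media volume,
assuming a btrfs subvolume at a made-up path. Run as root."""
import subprocess
import tarfile
from datetime import datetime

MEDIA_SUBVOL = "/var/lib/synapse/media"        # btrfs subvolume (assumption)
SNAPSHOT = "/var/lib/synapse/.media-snapshot"
ARCHIVE = f"/var/backups/media-{datetime.now():%Y%m%d}.tar.gz"

# Take a read-only snapshot so the backup sees a frozen view of the media.
subprocess.run(["btrfs", "subvolume", "snapshot", "-r", MEDIA_SUBVOL, SNAPSHOT],
               check=True)
try:
    with tarfile.open(ARCHIVE, "w:gz") as tar:
        tar.add(SNAPSHOT, arcname="media")
finally:
    # Throw the snapshot away once the archive exists (or the attempt failed).
    subprocess.run(["btrfs", "subvolume", "delete", SNAPSHOT], check=True)
```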

@reivilibre
Contributor

@nicolamori

The backup will then contain the media file but not the database entry (I assume that uploaded images are registered in the DB). Would restoring from this backup lead to an inconsistent Synapse state?

This is true, but it's not a big deal — the only cost there is the wasted disk space if you restore from this backup and don't clean it out.
If you back up your database first and then 'rsync' your media directory somewhere, your database will be consistent and Synapse may never even notice.
If you do it the other way around, you might lose some media files that still have DB entries, but that may not be a big deal for your use case.

You'll probably find it much easier to keep it simple. Even if you lose some media that are tracked in the database, it's not going to be the end of the world — you might get errors downloading that piece of media but other than that, nothing too bad will happen.

@Iruwen makes some good points but I'd argue these are probably a lot more fiddly and complicated than many 'home users' care about — e.g. a loss of a few hours' worth of data isn't likely a big problem to me personally, so frequent pg_dumps are fine for me and I haven't bothered with database replication or storing media on a redundant cluster like minio.
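
To make the ordering concrete, a sketch of "database first, media second" with placeholder paths and destination; the database side stays consistent, at the cost of possibly capturing a few media files the dump never references:

```python
#!/usr/bin/env python3
"""Sketch of the 'database first, then media' ordering described above.
All paths and the destination host are placeholders."""
import subprocess

# 1. Dump the database first, so every media row in the dump refers to a file
#    that already existed when the dump started.
with open("/var/backups/synapse.sql", "wb") as out:
    subprocess.run(["pg_dump", "-U", "synapse_user", "synapse"],
                   stdout=out, check=True)

# 2. Then copy the media store; anything uploaded in the meantime just wastes
#    a little space in the backup and is never referenced by the dump.
subprocess.run(
    ["rsync", "-a", "/var/lib/synapse/media/", "backup-host:/srv/backups/media/"],
    check=True,
)
```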

@nicolamori

@reivilibre thanks for the insights. I also understand and appreciate @Iruwen's point of view, but I'd definitely keep it simple unless that risks badly breaking things. Losing some media is not an issue for me, so I'll go with the plain pgsql dump (media are on Minio, so I don't explicitly back them up).

@Iruwen

Iruwen commented Jun 16, 2022

One should maybe note that the system doesn't fall apart when there are inconsistencies between media and its references stored in the database; the media will just be missing. Otherwise things like event/media retention policies would be a much bigger issue.

PS: replication is not a backup method - if you face any kind of data corruption, you'll end up with a distributed mess.

@richvdh added the Z-Help-Wanted label on Jul 29, 2022
@gwire

gwire commented Aug 2, 2022

I'm currently backing up /etc/matrix-synapse/, a dump of the database, and the media directories.

The disk requirements are growing faster than I'd anticipated, so I was looking for documentation to tell me:

  • is it safe to skip backing up url_cache_thumbnails and url_cache? (the 'cache' in the names suggests so) Will these be repopulated when requested by clients?

  • is it safe to skip backing up *_thumbnails? If absent, will the server recalculate these files on demand?

  • is it safe to skip backing up remote_content? If absent, will the server repopulate these files on demand?

(I appreciate that remote resources can be withdrawn at any time, but I'm more interested in making sure that the resources spent on backups go towards being able to re-establish the local service.)

Does the database similarly contain remote server content, and if so is there a way to take a selective dump of local content in such a way that remote content would be repopulated on demand?
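
For reference only: if those directories do turn out to be safe to skip, excluding them from an rsync-based media backup could look like the sketch below. The directory names are taken from the questions above, and whether skipping them is actually safe is exactly the open question here:

```python
#!/usr/bin/env python3
"""Sketch of excluding the cache-like directories from a media backup with
rsync. This only shows the mechanics; whether skipping url_cache,
url_cache_thumbnails, *_thumbnails and remote_content is safe is the open
question in this thread. Paths are placeholders."""
import subprocess

subprocess.run(
    [
        "rsync", "-a",
        "--exclude=url_cache", "--exclude=url_cache_thumbnails",
        "--exclude=*_thumbnails", "--exclude=remote_content",
        "/var/lib/synapse/media/",            # source (placeholder path)
        "backup-host:/srv/backups/media/",    # destination (placeholder)
    ],
    check=True,
)
```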

@youphyun

youphyun commented Aug 31, 2022

I am still new to the topic. I simply started backing up the relevant files listed above, including a full Postgres dump using pg_dumpall. The whole process, including which locations to back up, will differ depending on how Synapse is installed (from a distribution package, in Docker, or in a virtualenv). I am not sure if and when I will need to restore the backups; I am afraid that will involve quite a bit of manual work. I found these pages with some useful details: https://www.gibiris.org/eo-blog/posts/2022/01/21_containterise-synapse-postgres.html and https://ems-docs.element.io/books/element-cloud-documentation/page/import-database-and-media-dump
One additional question about the media repo: will simply restoring the /media_store folder work, or should it rather be done using the Synapse export and import API calls?

@FarisZR

FarisZR commented Oct 14, 2023

is it safe to skip backing up remote_content? If absent, will the server repopulate these files on demand?

@gwire have you found an answer yet? remote_content is crazy large on my server, and it doesn't make sense at all to back it up.

