Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't write WAL to local disk before it's streamed to a synchronous standby #250

Open
hlinnaka opened this issue Jun 8, 2021 · 4 comments
Labels
c/PostgreSQL Component: PostgreSQL features and bugs

Comments

@hlinnaka
Copy link
Contributor

hlinnaka commented Jun 8, 2021

Currently, WAL is always flushed to local disk first, and only then streamed to standby servers. If synchronous replication is used, we just don't acknowledge the commit to the client until the commit record has been streamed. If you kill the backend, the commit goes ahead, the locks are freed, and its effects become visible to other backends, even though it hasn't been streamed out yet. That's a bit sketchy.

@hlinnaka hlinnaka added the c/PostgreSQL Component: PostgreSQL features and bugs label Jun 8, 2021
@knizhnik
Copy link
Contributor

knizhnik commented Feb 15, 2022

Sorry, I do not understand how postponing write WAL to the local disk can help to solve the problem with "unapproved" commits. At the moment when we wait for sync replicas, transaction is already marked as committed in CLOG and releasing locks caused by backend termination will make it visible for everybody. How it relates with WAL writing?

We can postpone or even completely disable writing WAL to local disks. Instead of it, we can stream it directly to safekeepers. I have done such prototype in PgPro. But performance benefits were not so large. And amount of changes in Postgres core - not so small. And there will be much stronger dependency on speed of consuming WAL by safekeeper/pageserver. So this optimization is non-trivial and not obviously lead to better performance.

May be the easiest way to prevent described problem is to prohibit pg_terminate_backend. Looks like it is the only way to terminate one backend without restarting the whole cluster.

@hlinnaka
Copy link
Contributor Author

Yeah, you're right, I mixed up two slightly different issues:

  1. If you use pg_terminate_backend to kill a backend that has already written the commit record and is waiting for synchronous standby to acknowledge it, the transaction becomes visible to others even though it hasn't been replicated yet.

  2. If you kill the server, e.g. with SIGQUIT, and restart it, when a transaction is in that same state, it becomes visible to others after startup, even though it hasn't been replicated yet.

Prohibiting pg_terminate_backend() fixes 1. Postponing the local WAL write fixes 2. We might need to do both.

@knizhnik
Copy link
Contributor

Right now we are always restarting server from scratch, aren't we?
There is no way to restart server (using CLI) without loosing its local data.
What will happen if server is crashed or killed?
First of all it may not restart at all because local pg_wal contains invalid first segment.
Right now local restart is possible only in case of normal server termination - when checkpoint is completed.
If no checkpoint was performed since compute node start, then server will not be able to restart.
It seems to be the primary problem, which make problem 2) described in this ticket non-relevant.

So from my point of view the actual problem is how to detect server crashes and prevent local restarts rather than restart done by console. May be I am missing something, because I am not quite familiar with how console works.

@hlinnaka
Copy link
Contributor Author

Yeah, this issue was purely about PostgreSQL. It's not a problem with zenith currently, because as you said, we always rebuild the server from scratch.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c/PostgreSQL Component: PostgreSQL features and bugs
Projects
None yet
Development

No branches or pull requests

2 participants