Don't write WAL to local disk before it's streamed to a synchronous standby #250
Comments
Sorry, I do not understand how postponing the WAL write to local disk helps with the problem of "unapproved" commits. By the time we wait for synchronous replicas, the transaction is already marked as committed in CLOG, and the lock release caused by backend termination makes it visible to everybody. How does that relate to WAL writing? We can postpone or even completely disable writing WAL to local disk and instead stream it directly to the safekeepers. I built such a prototype in PgPro, but the performance benefit was not large, and the amount of changes to Postgres core was not small. There would also be a much stronger dependency on how fast the safekeeper/pageserver consumes WAL. So this optimization is non-trivial and does not obviously lead to better performance. Maybe the easiest way to prevent the described problem is to prohibit pg_terminate_backend. It looks like it is the only way to terminate a single backend without restarting the whole cluster.
Yeah, you're right, I mixed up two slightly different issues:
Prohibiting pg_terminate_backend() fixes 1. Postponing the local WAL write fixes 2. We might need to do both.
Right now we always restart the server from scratch, don't we? So from my point of view, the actual problem is how to detect server crashes and prevent local restarts, rather than restarts done by the console. Maybe I am missing something, because I am not very familiar with how the console works.
Yeah, this issue was purely about PostgreSQL. It's not a problem with zenith currently, because as you said, we always rebuild the server from scratch.
Currently, WAL is always flushed to local disk first, and only then streamed to standby servers. If synchronous replication is used, we just don't acknowledge the commit to the client until the commit record has been streamed. If you kill the backend, the commit goes ahead, the locks are freed, and its effects become visible to other backends, even though it hasn't been streamed out yet. That's a bit sketchy.
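To make the ordering concrete, here is a minimal toy model in Python of the sequence described above. It is only a sketch, not PostgreSQL code: the class `ToyPrimary`, its fields, and the `killed_while_waiting` flag are all hypothetical names invented for illustration, and the real commit path has many more steps.

```python
from dataclasses import dataclass, field


@dataclass
class ToyPrimary:
    """Hypothetical toy model of the commit ordering; not PostgreSQL code."""
    local_wal: list = field(default_factory=list)    # commit records flushed to local disk
    clog_committed: set = field(default_factory=set) # transactions marked committed
    standby_wal: list = field(default_factory=list)  # records streamed to the sync standby
    acked: set = field(default_factory=set)          # commits acknowledged to the client
    visible: set = field(default_factory=set)        # commits visible to other backends

    def commit(self, xid, killed_while_waiting=False):
        # 1. The commit record is flushed to local disk first.
        self.local_wal.append(xid)
        # 2. The transaction is marked as committed before the wait starts.
        self.clog_committed.add(xid)
        # 3. The backend waits for the synchronous standby. If it is killed
        #    during this wait, the record has not been streamed and the
        #    client never gets an acknowledgement.
        if not killed_while_waiting:
            self.standby_wal.append(xid)
            self.acked.add(xid)
        # 4. Either way the locks are released, so the effects become visible.
        self.visible.add(xid)

    def visible_but_unreplicated(self):
        # Commits other backends can see that the standby does not have.
        return self.visible - set(self.standby_wal)


primary = ToyPrimary()
primary.commit(1)                             # normal synchronous commit
primary.commit(2, killed_while_waiting=True)  # backend killed during the wait
print(primary.visible_but_unreplicated())
```

In this model, transaction 2 ends up visible and present in the local WAL, yet it is absent from the standby and was never acknowledged to the client, which is the window the issue calls sketchy.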