Synchronize two Playground instances (#727)
## Description

Synchronizes two Playground instances.

This is the technical foundation needed to fork entire sites, make changes, merge them back, rebase, undo etc. It's like git for WordPress.

https://github.com/WordPress/wordpress-playground/assets/205419/c07795b1-11f9-4221-b638-62db3ea4b017

## Wait, what? How does it work?

We journal local changes, send them over to a remote peer, and replay them there. The following changes are supported:

* SQL queries (INSERT, DELETE, ALTER TABLE etc)
* Filesystem changes (create, delete, rename, etc)

## If that's so simple, why doesn't WordPress already support it?

This type of sync was never possible before. The secret ingredient here is Playground. We can only keep track of all actions because we have full control over the filesystem and the database.

## What about conflicting autoincrement IDs?

We're sharding IDs to avoid conflicts. For example, peer 1 could start all autoincrement sequences at `12345000001`, while peer 2 could start at `54321000001`. This gives both peers a lot of space to create records without assigning the same IDs.

In some ways, this is similar to [ID sharding once described on Instagram's engineering blog](https://instagram-engineering.com/sharding-ids-at-instagram-1cf5a71e5a5c?gi=2f1ad5d97db2).

## What if we run out of space to assign new IDs?

Currently, that would create a conflict and cause the two peers to diverge forever.

In the future, we could rewrite these high IDs to reclaim the space. Here's how it could work:

1. Alice assigns a high autoincrement ID (e.g. `1234500001`) and marks it as "dirty"
2. Alice sends the change to Bob
3. Bob finds the next available low ID (e.g. `35`) and establishes a mapping between `1234500001` and `35`
4. Bob starts rewriting all occurrences of `1234500001` to `35` in all SQL queries received from Alice
5. Bob rewrites and applies the received query
6. Bob sends a confirmation to Alice that the record was "committed" with ID `35`
7. Alice rewrites all local instances of `1234500001` with `35` like Bob did
8. Alice sends a confirmation to Bob that she reconciled `1234500001` as `35` and Bob may stop rewriting it

The rewriting is needed because sometimes IDs are stored inside serialized data such as JSON or PHP's `serialize()` output. It's an imperfect heuristic that would occasionally rewrite data that happens to match our ID but has a different meaning, but perhaps it wouldn't happen that often. That's the best we can do anyway. There's no way to reason about the meaning of arbitrary serialized data as it can come from any WordPress plugin.

## Time traveling

Wouldn't it be handy to undo a mistake that messed up your site? Well, now you can.

The journal is a recipe for getting from a vanilla WordPress to the site you have now. We can replay that recipe on a fresh Playground, stop half-way through, and recover the site you had a few minutes ago.

This opens the door to a WordPress-wide undo button. It's quite similar to what Redux devtools provide.

The proof of concept can be accessed at http://localhost:5400/website-server/demos/time-traveling.html:

<img width="800" alt="CleanShot 2023-11-04 at 20 43 36@2x" src="https://github.com/WordPress/wordpress-playground/assets/205419/1941fbf4-be63-4546-8965-ebde9d902b2a">

<img width="800" alt="CleanShot 2023-11-04 at 20 43 20@2x" src="https://github.com/WordPress/wordpress-playground/assets/205419/3543aedb-eaf6-4e7d-8a99-e90fcd4ee39c">

## Testing instructions

1. Run `nx dev`
2. Go to http://localhost:5400/website-server/demos/sync.html
3. Make some changes in either Playground window and confirm that within 5 seconds they're reflected in the other window

### Follow up work

- [ ] Sync changes over the network. For now the only transport uses the local `iframe.postMessage`.
- [ ] Merge the SQLite translator changes to the upstream `sqlite-database-integration` repo: WordPress/sqlite-database-integration#56
- [ ] Improve the test coverage. Test recording and replaying SQL queries. Test the ID sharding offset instrumentation to ensure it isn't easily derailed.
- [ ] Do not send all the files eagerly. Save transfer by computing the hash and only sending what the other peer doesn't already have (like git).
- [ ] Negotiate the ID offset used by different peers. This is to avoid sharding collisions when two peers randomly choose similar offsets.
- [ ] Explore ID rewriting to reclaim the ID sharding space (as outlined above).
- [x] Implement `normalizeFilesystemOperations()` to be able to transmit files that are created and instantly renamed (see the comment in fs.ts for more details)
- [x] Time traveling support by restoring the initial state, removing parts of the journal, and replaying what's left

cc @dmsnell
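
For illustration only, here is a minimal sketch of two ideas from the description: sharded autoincrement IDs and replaying a journal up to a chosen point (time traveling). It models a site as an in-memory map of files plus a list of applied queries. `ShardedIdAllocator`, `JournalEntry`, `Site`, `replay`, the shard capacity, and the file path are all hypothetical and chosen for this example. They are not the names or data structures used in this commit.

```typescript
// Hypothetical journal entry shape: either a SQL query or a filesystem change.
type JournalEntry =
  | { kind: 'sql'; query: string }
  | { kind: 'fs'; op: 'write' | 'delete'; path: string; data?: string };

// Each peer hands out IDs from its own range so two peers never collide.
// Assumption: a fixed capacity per shard; running out is reported as an error.
class ShardedIdAllocator {
  private next: number;

  constructor(
    private readonly offset: number,
    private readonly capacity: number
  ) {
    this.next = offset + 1;
  }

  allocate(): number {
    if (this.next > this.offset + this.capacity) {
      throw new Error('ID shard exhausted');
    }
    return this.next++;
  }
}

// Toy stand-in for a Playground site: files plus the queries applied so far.
interface Site {
  files: Map<string, string>;
  appliedQueries: string[];
}

function freshSite(): Site {
  return { files: new Map<string, string>(), appliedQueries: [] };
}

// Applies a single journal entry to the toy site.
function applyEntry(site: Site, entry: JournalEntry): void {
  if (entry.kind === 'sql') {
    site.appliedQueries.push(entry.query);
  } else if (entry.op === 'write') {
    site.files.set(entry.path, entry.data ?? '');
  } else {
    site.files.delete(entry.path);
  }
}

// Replays the first `upTo` entries on a fresh site.
// Passing a smaller `upTo` recovers an earlier state of the site.
function replay(
  journal: readonly JournalEntry[],
  upTo: number = journal.length
): Site {
  const site = freshSite();
  for (const entry of journal.slice(0, upTo)) {
    applyEntry(site, entry);
  }
  return site;
}

// Usage: peer 1 records changes; a remote peer replays them.
const peer1Ids = new ShardedIdAllocator(12345000000, 1000000);
const examplePath = '/wordpress/wp-content/uploads/note.txt';
const exampleJournal: JournalEntry[] = [
  { kind: 'sql', query: `INSERT INTO wp_posts (ID) VALUES (${peer1Ids.allocate()})` },
  { kind: 'fs', op: 'write', path: examplePath, data: 'hello' },
  { kind: 'fs', op: 'delete', path: examplePath },
];

const syncedSite = replay(exampleJournal); // full replay: the file is gone
const earlierSite = replay(exampleJournal, 2); // stop early: the file still exists
```

Because each replay starts from a fresh site and a prefix of the journal, "undo" in this sketch never needs inverse operations. That matches the recipe framing above, though a real implementation would have to restore an actual initial WordPress state first.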