Only trim logs to min persisted lsn across all known nodes #1781

Closed
Tracked by #1675
tillrohrmann opened this issue Aug 2, 2024 · 0 comments · Fixed by #1783
tillrohrmann commented Aug 2, 2024

Until we have support for creating a state snapshot and making this snapshot accessible to all nodes, we must not trim the log if one of the known nodes lags behind. If we wanted to support new nodes joining the cluster at any point in time, then we must never trim the log, because a new node has to replay the log from the beginning for a given partition processor.
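
A minimal sketch of the trimming rule this implies, not Restate's actual implementation: the `safe_trim_point` helper and the simplified `NodeId`/`Lsn` aliases below are hypothetical. The safe trim point is the minimum persisted LSN across all known nodes, and no trimming happens at all if any known node has not reported a persisted LSN (e.g. because it is currently unreachable).

```rust
// Sketch only: hypothetical types and helper, assuming each node reports the
// LSN up to which it has persisted its partition processor state.
use std::collections::HashMap;

type NodeId = u64; // simplified stand-in for Restate's node id type
type Lsn = u64;    // simplified stand-in for Restate's log sequence number type

/// Returns the LSN up to which the log may safely be trimmed, or `None` if any
/// known node has not reported a persisted LSN, in which case we must not trim.
fn safe_trim_point(
    known_nodes: &[NodeId],
    persisted_lsns: &HashMap<NodeId, Lsn>,
) -> Option<Lsn> {
    known_nodes
        .iter()
        .map(|node| persisted_lsns.get(node).copied())
        // Collecting into Option<Vec<_>> short-circuits to None if any node is missing.
        .collect::<Option<Vec<Lsn>>>()?
        .into_iter()
        .min()
}

fn main() {
    let known_nodes = vec![1, 2, 3];
    let mut persisted_lsns = HashMap::new();
    persisted_lsns.insert(1, 42);
    persisted_lsns.insert(2, 17);
    persisted_lsns.insert(3, 58);

    // All known nodes reported: trim only up to the slowest node's LSN.
    assert_eq!(safe_trim_point(&known_nodes, &persisted_lsns), Some(17));

    // Node 3 has no reported persisted LSN: do not trim at all.
    persisted_lsns.remove(&3);
    assert_eq!(safe_trim_point(&known_nodes, &persisted_lsns), None);
}
```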

tillrohrmann self-assigned this Aug 2, 2024
tillrohrmann added a commit to tillrohrmann/restate that referenced this issue Aug 2, 2024
Until we can share partition processor snapshots between Restate nodes (e.g.
by fetching them from S3), we can only trim the log if all known nodes have
reached the trim point. Otherwise, we risk that a currently unavailable node
still needs log entries that have already been trimmed. One crucial assumption
is that no new nodes will join the cluster once the first log trimming has
happened. For new nodes to be able to join later, we also need the sharing of
partition processor snapshots.

This fixes restatedev#1781.
tillrohrmann added a commit to tillrohrmann/restate that referenced this issue Aug 2, 2024
tillrohrmann added a commit to tillrohrmann/restate that referenced this issue Aug 6, 2024