Summary
We investigated whether the Lamport clock entries of detached clients can be removed from Version Vectors (VV) to reduce VV size. Currently, a VV grows without bound as more clients participate in editing a document. Although removing detached client entries would shrink the VV, we concluded that it is better to keep them for now because of several technical challenges, particularly in handling concurrent operations.
Background
VV was introduced in v0.5.3 to solve GC (Garbage Collection) problems that couldn't be resolved with Lamport clock alone
Lamport clock provides total ordering but cannot determine concurrency
If a->b then L(a)<L(b)
However, if L(a)<L(b), it could mean either a->b or a||b (concurrent)
VV contains editing information of all clients participating in the document
Current implementation maintains detached client information in VV to ensure proper GC and concurrent operation handling
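The difference between the two clocks can be made concrete with a small sketch. The Python below is illustrative only: the dict-based VersionVector type, the helper names, and the choice to treat a missing actor as 0 are assumptions made for the example, not Yorkie's implementation.

```python
from typing import Dict

# Hypothetical representation: actor ID -> highest Lamport timestamp seen from that actor.
VersionVector = Dict[str, int]


def _leq(a: VersionVector, b: VersionVector) -> bool:
    """True if every entry of a is <= the matching entry of b (missing entries count as 0)."""
    return all(a.get(actor, 0) <= b.get(actor, 0) for actor in set(a) | set(b))


def happened_before(a: VersionVector, b: VersionVector) -> bool:
    """a -> b: a is component-wise <= b and the two vectors differ."""
    return _leq(a, b) and not _leq(b, a)


def concurrent(a: VersionVector, b: VersionVector) -> bool:
    """a || b: neither vector dominates the other."""
    return not _leq(a, b) and not _leq(b, a)


# Two operations whose Lamport timestamps are ordered, although neither saw the other.
lamport_a, vv_a = 3, {"A": 3, "B": 1}
lamport_b, vv_b = 4, {"A": 1, "B": 4}
# A later operation that has seen everything vv_a has seen.
vv_c = {"A": 3, "B": 5}

print(lamport_a < lamport_b)          # Lamport order alone suggests a before b
print(concurrent(vv_a, vv_b))         # the version vectors show they are concurrent
print(happened_before(vv_a, vv_c))    # here a real happened-before relation holds
```

Treating a missing actor as 0 is convenient in this sketch, but it is exactly the shortcut that stops being safe once entries can be removed from a VV.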
3. Bug Discovery
Problem: Premature GC of nodes deleted by detached clients
Key findings:
GC should only occur when all clients are aware of node deletion
Current implementation filters minVV immediately upon client detachment
This leads to premature filtering before other clients receive the detach change
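A minimal sketch of how early filtering can cause premature GC. It combines the two GC conditions recorded under Implementation History in one plausible way. The combination itself, the Ticket type, the helper names, and the sample clock values are all assumptions for illustration, not the actual server code.

```python
from typing import Dict, List, NamedTuple

VersionVector = Dict[str, int]   # hypothetical: actor ID -> Lamport timestamp
INT64_MAX = 2**63 - 1


class Ticket(NamedTuple):
    actor: str
    lamport: int


def min_version_vector(vvs: List[VersionVector]) -> VersionVector:
    """Component-wise minimum; an actor missing from a vector counts as 0 there."""
    actors = set().union(*vvs) if vvs else set()
    return {actor: min(vv.get(actor, 0) for vv in vvs) for actor in actors}


def min_lamport(min_vv: VersionVector) -> int:
    """Smallest Lamport value in the vector, or INT64_MAX when it is empty."""
    return min(min_vv.values(), default=INT64_MAX)


def can_gc(removed_at: Ticket, min_vv: VersionVector) -> bool:
    if removed_at.actor in min_vv:
        # Condition quoted for v0.5.3.
        return removed_at.lamport <= min_vv[removed_at.actor]
    # Assumed fallback for actors no longer present in minVV (condition quoted for v0.5.6).
    return removed_at.lamport < min_lamport(min_vv)


# Client C deleted a node at 5@C and then detached.
removed_at = Ticket(actor="C", lamport=5)
vv_of_a = {"A": 8, "B": 7, "C": 5}   # A has received C's deletion
vv_of_b = {"A": 6, "B": 7}           # B has not received anything from C yet

min_vv = min_version_vector([vv_of_a, vv_of_b])
print(can_gc(removed_at, min_vv))    # B is unaware of the deletion, so GC is blocked

# Filtering C out of minVV immediately on detach discards that guard.
filtered = {actor: lamport for actor, lamport in min_vv.items() if actor != "C"}
print(can_gc(removed_at, filtered))  # the node becomes collectable although B never saw the deletion
```

Under these assumptions the node is collected as soon as C's entry disappears, which matches the finding that filtering must wait until every client has received the detach change.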
4. Final Implementation (v0.5.7)
Current process:
On detach, remove the client's entry only from ColVersionVectors
Keep detached client information in minVV
Allow GC for detached client nodes while maintaining VV size
Proposed Solutions and Limitations
1. Conditional Removal:
Remove from minVV
Store detached user data separately with {actorID:lamport} format (detachedClientLamport)
Remove from minVV only when: minVV[detachedClientID] === detachedClientLamport
This ensures all users have received the detach change
Remove from Local document VV
If we remove a detached client from the VV, that client's lifecycle in Presence and in the VV should be identical.
Therefore, instead of filtering the VV down to attached clients through minVV as before, we can delete a client from the VV when we receive its presence clear change.
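The conditional removal rule can be sketched as follows. This is a hedged illustration: prune_detached and its arguments are hypothetical names, and only the equality check minVV[detachedClientID] === detachedClientLamport comes from the proposal itself.

```python
from typing import Dict, Tuple

VersionVector = Dict[str, int]   # hypothetical: actor ID -> Lamport timestamp


def prune_detached(
    min_vv: VersionVector,
    detached_client_lamport: Dict[str, int],
) -> Tuple[VersionVector, Dict[str, int]]:
    """Drop a detached client from minVV only once minVV has caught up with it.

    Returns the pruned minVV and the detached clients that still have to wait.
    """
    pruned = dict(min_vv)
    pending: Dict[str, int] = {}
    for actor, lamport in detached_client_lamport.items():
        if pruned.get(actor) == lamport:
            # Every remaining client has seen the detached client's last change.
            del pruned[actor]
        else:
            pending[actor] = lamport
    return pruned, pending


detached = {"C": 9}   # C detached with Lamport timestamp 9

# Some client has only seen C up to 7, so C must stay in minVV.
early_vv, early_pending = prune_detached({"A": 12, "B": 10, "C": 7}, detached)

# Once minVV["C"] reaches 9, C can be removed.
late_vv, late_pending = prune_detached({"A": 14, "B": 13, "C": 9}, detached)

print(early_vv, early_pending)
print(late_vv, late_pending)
```

Keeping the pending map separate from minVV means the detach record survives until the condition is met, at the cost of one extra piece of state per detached client.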
2. Limitation: Concurrent Operation Detection
Main challenge: Cannot reliably determine if missing VV information means:
Client is detached
The client's operations haven't been received yet
Example scenario:
Client A's delete Op VV: [7@A, 4@B]
Client B state: includes nodes 8@C, 9@C
Problem: Cannot determine if C's absence in VV is due to detachment or lack of information
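The scenario can be expressed as a small check. This is a sketch under stated assumptions: delete_covers is a hypothetical helper, IDs such as 8@C are read as lamport@actor, and a delete is taken to cover only nodes that the deleting client had already seen.

```python
from typing import Dict, Optional

VersionVector = Dict[str, int]   # hypothetical: actor ID -> Lamport timestamp


def delete_covers(
    op_vv: VersionVector, created_actor: str, created_lamport: int
) -> Optional[bool]:
    """Decide whether a delete op causally covers a node.

    Returns True or False when the op's VV has an entry for the node's creator,
    and None when it has no entry, because the answer cannot be determined.
    """
    seen = op_vv.get(created_actor)
    if seen is None:
        # The creator was detached and pruned, or its ops never arrived:
        # the VV alone cannot tell these two cases apart.
        return None
    return created_lamport <= seen


delete_op_vv = {"A": 7, "B": 4}           # Client A's delete op: [7@A, 4@B]

print(delete_covers(delete_op_vv, "B", 3))   # node 3@B was seen by A before the delete
print(delete_covers(delete_op_vv, "B", 5))   # node 5@B is concurrent with the delete
print(delete_covers(delete_op_vv, "C", 8))   # node 8@C: undecidable from the VV alone
print(delete_covers(delete_op_vv, "C", 9))   # node 9@C: undecidable from the VV alone
```

If detached entries are never removed, the None branch can only mean that the operations were not received, so the check stays decidable.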
3. Future Considerations:
DAG (Directed Acyclic Graph) structure could help by:
Tracking operation parent information
Verifying if client information was previously received
Decision
We decided to maintain detached client information in VV for now because:
Concurrent operation handling requires complete version history
The current architecture lacks a reliable way to distinguish between detached and unknown clients
Removing information could lead to incorrect operation application
Future Work
Implement performance benchmarks to measure VV size impact
Consider DAG implementation for better operation history tracking
Implementation History
1. Introduction of Version Vector (v0.5.3)
removedAt.lamport <= minVersionVector[removedAt.actor]
Initial Attempt to Remove Detached Client's VV
2. GC Enhancement for Detached Client Nodes (v0.5.6)
int64max when minVersionVector.minLamport() has no value
removedAt.lamport < minVersionVector.minLamport()