Storing timelines in the state store #138
How should we store the timeline? We will need to preserve the ordering we receive from the server, so we can't use a `sled::Tree` (at least not directly).
An additional mapping/tree from an index (`u128`) to the event ID would work, no? Perhaps @DevinR528 has opinions about this. Though that's an implementation detail of the store; e.g. a SQL-based store would just use a
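For illustration, the sled variant of that idea could look like the following (a minimal sketch; the tree name and helper functions are made up, not part of any existing store). sled orders keys as raw bytes, so encoding the `u128` index big-endian makes byte order match numeric order, which preserves the server's ordering:

```rust
use sled::Db;

/// Store an event ID under a monotonically increasing index. Big-endian
/// encoding makes sled's lexicographic key order match numeric order.
fn store_event(db: &Db, index: u128, event_id: &str) -> sled::Result<()> {
    let tree = db.open_tree("timeline_order")?;
    tree.insert(index.to_be_bytes(), event_id.as_bytes())?;
    Ok(())
}

/// Read the event IDs back in the order they were received.
fn events_in_order(db: &Db) -> sled::Result<Vec<String>> {
    let tree = db.open_tree("timeline_order")?;
    tree.iter()
        .values()
        .map(|res| res.map(|bytes| String::from_utf8_lossy(&bytes).into_owned()))
        .collect()
}
```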
So my thinking was to store the timeline slices in order (and since the events are ordered in the slice, depending on the direction of the request):

```rust
pub struct SledStore {
    // ...
    /// Order the chunks by their `prev_batch`.
    ///
    /// We now have slices of the timeline ordered by position in the timeline. Since the events
    /// come in a predictable order, once we find the prev_batch ID's location we know the
    /// location of its events.
    id_prevbatch: Tree,
    /// Since the `prev_batch` token can get long and we want an orderable key, use a `u64`.
    /// This is also known as the prev_batch ID.
    prevbatch_id: Tree,
    /// This `Tree` consists of keys of `prev_batch ID + count` and values of `EventId` bytes.
    ///
    /// `count` is the index of the `MessageEvent` in the original chunk or sync response.
    prevbatchid_eventid: Tree,
    eventid_prevbatchid: Tree,
    /// This `Tree` is `RoomId + EventId -> Event bytes`.
    timeline: Tree,
}
```

This would make it so we can fill in `/messages` responses by looking up the `prev_batch` token.
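As a rough illustration of that lookup (a sketch only; it reuses the `SledStore` fields above and assumes `count` is encoded big-endian so the prefix scan yields events in order):

```rust
use sled::Tree;

/// Return the raw event bytes for the chunk identified by a `prev_batch`
/// token, in the order they appeared in the chunk.
fn events_for_token(
    prevbatch_id: &Tree,
    prevbatchid_eventid: &Tree,
    prev_batch: &str,
) -> sled::Result<Vec<Vec<u8>>> {
    // Resolve the long token to its short, orderable u64 ID.
    let id = match prevbatch_id.get(prev_batch.as_bytes())? {
        Some(id) => id,
        None => return Ok(Vec::new()),
    };
    // Keys are `prev_batch ID + count`, so scanning the ID prefix walks the
    // chunk's events in `count` order.
    prevbatchid_eventid
        .scan_prefix(&id)
        .values()
        .map(|res| res.map(|ivec| ivec.to_vec()))
        .collect()
}
```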
We would still have to guard against duplicate events and overlapping or underlapping chunks, which I'm not entirely sure how to handle.
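One simple guard for the duplicate part (a sketch; overlapping and underlapping chunks need more thought than this):

```rust
use std::collections::HashSet;

/// Drop events whose IDs have already been stored. This stops plain
/// duplicates but does not resolve overlapping or underlapping chunks.
fn dedup_events(
    seen: &mut HashSet<String>,
    incoming: Vec<(String, Vec<u8>)>,
) -> Vec<(String, Vec<u8>)> {
    incoming
        .into_iter()
        .filter(|(event_id, _)| seen.insert(event_id.clone()))
        .collect()
}
```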
This is looking a bit too much into what sled should be doing. Let's first agree on a more abstract level what we need; implementation details for specific store implementations can come later. Or in other words, let's add a bit of structure to those strings and numbers.

Looking at the sync spec, when we sync we get a slice of the timeline that exists between two tokens:

```rust
struct TimelineSlice {
    start: String,
    end: String,
    events: Vec<EventId>,
}
```

The same thing applies when we call the `/messages` endpoint. Storing the slice like this isn't ideal, due to the nature of the tokens, so let's give the slices and events orderable indices:

```rust
struct SliceId(u128);
struct EventIndex(u128);

struct TimelineSlice {
    slice_id: SliceId,
    start: String,
    end: String,
}

struct EventOwnerMap {
    slice_map: BTreeMap<SliceId, BTreeMap<EventIndex, EventId>>,
    event_map: BTreeMap<EventId, SliceId>,
}

struct Timeline {
    slices: BTreeMap<SliceId, TimelineSlice>,
    // This should probably be the prev_batch token so we
    // don't need two maps and in the general case where
    // we walk backwards in the timeline, we won't need to
    // fetch the neighboring slice
    token_slice_map: BTreeMap<String, SliceId>,
    events: EventOwnerMap,
}
```

So now we can get the slices in order, and events belonging to a slice in order. When we put a new slice from a sync into the timeline, we get the last slice and check if the tokens match; if they do, we populate the existing slice with events and update the end token. The same kind of merge applies when we, on the other hand, receive events from the `/messages` endpoint.

Another thing we might need/want to store is the token -> slice -> event index position, e.g. if someone stores a sync token and we need to know from which position to continue returning events.

Hopefully this makes any sense, and somewhat also points to how sled should store stuff. Of course, if this has some flaw I'm missing, please let me know.
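A self-contained sketch of that merge rule, assuming monotonically assigned `SliceId`s and folding events directly into the slice (the `EventOwnerMap` bookkeeping is omitted for brevity):

```rust
use std::collections::BTreeMap;

type EventId = String;

#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct SliceId(u128);

struct TimelineSlice {
    start: String,
    end: String,
    events: Vec<EventId>,
}

#[derive(Default)]
struct Timeline {
    slices: BTreeMap<SliceId, TimelineSlice>,
    next_id: u128,
}

impl Timeline {
    /// Insert a slice received from sync. If it continues the newest stored
    /// slice (its `start` token equals that slice's `end` token), extend the
    /// existing slice and move its end token; otherwise store a new slice,
    /// which leaves a gap until a later response closes it.
    fn add_sync_slice(&mut self, start: String, end: String, events: Vec<EventId>) {
        if let Some(last) = self.slices.values_mut().last() {
            if last.end == start {
                last.events.extend(events);
                last.end = end;
                return;
            }
        }
        let id = SliceId(self.next_id);
        self.next_id += 1;
        self.slices.insert(id, TimelineSlice { start, end, events });
    }
}
```

When the tokens don't line up, the new slice implicitly records a gap, which matches the `limited` gap handling discussed in the issue.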
These would be structs that are the

```rust
struct EventOwnerMap {
    slice_map: BTreeMap<SliceId, BTreeMap<EventIndex, EventId>>,
    event_map: BTreeMap<EventId, SliceId>,
    // Now we can go from an EventId to a sub-slice of its parent SliceId, I don't think we could before?
    event_index: BTreeMap<EventId, EventIndex>,
}
```

Just to make sure, the `EventId`s would point to the chunk (`SliceId`) they came from? I think I'm following along pretty well. We'll see soon enough 😄!
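For illustration, a lookup over those maps could recover an event's origin (a sketch; it assumes `SliceId` and `EventIndex` derive `Copy`, which the snippets above don't spell out):

```rust
/// From an EventId, recover which slice the event came from and its
/// position inside that slice.
fn locate(owner: &EventOwnerMap, event: &EventId) -> Option<(SliceId, EventIndex)> {
    let slice = *owner.event_map.get(event)?;
    let index = *owner.event_index.get(event)?;
    Some((slice, index))
}
```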
I think that depends on the store; the memory store will probably keep those as-is. I'm not completely sure yet if the store should create those and the equivalent of
Yeah, that sounds sensible. It might need to store the position inside the slice as well, since you otherwise don't know where to continue returning events from. Sorry, I forgot to respond to this.
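A sketch of what such a stored position could look like (all names here are illustrative, not an existing API):

```rust
/// Where to resume returning events from after a restart.
struct TimelineCursor {
    /// The sync token the client stored.
    token: String,
    /// The slice that token maps to.
    slice: SliceId,
    /// The position inside that slice to continue from.
    position: EventIndex,
}
```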
As a side note: there is also https://matrix.org/docs/spec/client_server/r0.6.1#get-matrix-client-r0-rooms-roomid-context-eventid which is similar to `/messages`.
It will certainly be needed: if clients want to have a search experience similar to Element's, E2EE search will use this.
Closed by #486.
The state store used to be able to store a limited number of events from the timeline. While the feature seemed to be a bit flaky and, due to the snapshot-based state storage, quite limited, it was generally useful.
We should reintroduce this feature in the new state store.
Since the new state store keeps stuff out of memory, we can store the whole timeline in it without a bad conscience. The state store trait should be expanded to allow this.
For some basic functionality, we'll need:

- To figure out how gaps in the timeline should be handled. Each sync response will contain a slice of a timeline; the timeline might be a continuation of the previous sync response, in which case the `limited` parameter will be set to `false`. If the timeline doesn't contain all events that happened between two syncs, the `limited` parameter will be set to `true` and a gap exists in the timeline. More info can be found over here.
- To remember which event belongs to which slice, and to figure out how to merge slices when we request historic events using the `/rooms/$room_id/messages` endpoint.
- To make `room_messages()` and similar methods that fetch events from the server use events from the store if they are available and fill in from the server if not (a rough sketch of this follows below).
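The issue doesn't spell out that store-first flow; as one possible reading, a sketch could look like this (the `Store` and `Server` types and every method on them are hypothetical stand-ins, not matrix-sdk's actual API):

```rust
/// Illustrative only: `Store`, `Server`, and `Event` stand in for the state
/// store, the HTTP client, and a timeline event; none of these names exist
/// in matrix-sdk.
fn room_messages(
    store: &mut Store,
    server: &Server,
    room_id: &str,
    from: &str,
    limit: usize,
) -> Vec<Event> {
    // Serve from the store when the requested range is already known.
    let mut events = store.timeline_events(room_id, from, limit);
    if events.len() < limit {
        // Fill the remainder from the server and persist it, so the next
        // request for this range can be answered locally.
        let fetched = server.fetch_messages(room_id, from, limit - events.len());
        store.save_timeline_events(room_id, &fetched);
        events.extend(fetched);
    }
    events
}
```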