-
Notifications
You must be signed in to change notification settings - Fork 96
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
New group consistency algorithm #6404
Conversation
c10da92
to
c66590b
Compare
c66590b
to
2c16e9b
Compare
4c7102e
to
6cfccd5
Compare
39143a1
to
f03bf1d
Compare
6eb0571
to
d7fe3d3
Compare
Sorry for the uninformed input, I just find algorithm questions interesting. What about a timestamp, where if the backup is older than 24 hours then it might prompt other group members via an automated mail to try to get the updated member list? (This should probably be done in some way to avoid that being done repeatedly when the system time is wrong, however.) |
The problem is that in the given algorithm it's not possible to find out whose state for the given membership is actual because timestamps are forgotten after 60 days. We can't store timestamps forever, that wouldn't scale and isn't good for privacy (there should be some time limit after which it shouldn't be known for new members when a particular member was added or removed in the past). As for automated mails in general, currently there are not many such ones, we should avoid them to not reveal the user's network state, whether they restore a backup (as in your question) etc. |
cab88e7
to
71ca2e7
Compare
a306d64
to
165e283
Compare
In the backup the timestamp would still be known, wouldn't it? So if it's super old, I think the natural intuitive thing to do would be to query somebody if they have a state that is newer than 60 days back. Perhaps an alternative would be to mark your own state as possibly faulty and wait for the next message to update it (and until then avoid actions that would propagate your own faulty state to others). I hope this input is useful. |
@ell1e We have a solution idea at #6401 (comment), this is what I plan to implement unless some critical flaw is discovered in it.
I generally don't expect this kind of solution that adds another message to the protocol to work in distributed systems, it is too similar to adding an "acknowledgment to acknowledgment" while trying to solve two generals problem. Such message itself may be lost, delayed, reordered, so we get even more edge cases: 1. Nobody has a new state because the group was inactive for 60 days so nobody replies. 2. Someone replies but super late or the message arrives late. 3. You restore backup and read such reply from IMAP archive which was new at that time but is outdated by now. 4. You get multiple replies but they are reordered so you get older reply first. 5. and so on. |
For those curious, "two generals" refers to https://en.m.wikipedia.org/wiki/Two_Generals'_Problem ... A fundamental consideration about synchronisation in distributed computing. Basically "adding another acknowledgement or request message" typically does not solve problems. There is an equivalent in human communication when people try to synchronize on an appointment clock time and place. Messages are read later or not at all, and there is then uncertainty on both sides because they fail to be sure in the sense that each participant is sure themselves and sure that the other side is sure about the appointment. What is helpful is a) a phone call, or other very low delay communication b) an opportunistic leap from one or the other side ("I'll be at the cafe at 4pm and stick around for an hour or two" can opportunically create synchronization in a single message at the risk of enjoying a cafe visit alone or meeting an interesting stranger or reading a book :) In DC developer discussions it's usually enough to utter "two generals" when countering a network synchronisation protocol suggestion ... A related utterance is "synchronization is hard and expensive". Ten years ago "synchronization architectures" were on vogue also because of Blockchain-mania. A child from those times, Matrix messaging is modeled on top of synchronization while delta uses actor-networking (send and receive of messages). Delta generally uses a P2P networking model with stateless super-dumb transports whereas most other messengers, including supposedly P2P messengers like Briar, use a "centralized sync" model for group messaging. We also looked at several state of the art research papers and other messenger implementations and were surprised we are basically dealing with an unsolved problem after two decades of P2P hype, money and r&d in that space. |
Thanks for explaining! I still feel like such a message could have some use if it's not the only mechanism relied upon (as in only used if it's really recent and useful), but I guess if it works without that and perhaps even better so, fair enough. And the argument that it may signal a backup restore to somebody watching seems somewhat compelling. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm almost done reviewing, but have to leave for today
I tried to mark my comments as "blocking" = I think that this needs to be changed and "non-blocking" = I think that it would be nice to change this but if you disagree, it's fine.
An off-topic thought I just had: Only the bigger number of add_timestamp and remove_timestamp is actually important - it never makes any difference what value the lower number has. So, an equivalent algorithm would be to have a column change_timestamp
(or just timestamp
or last_change
) and a column is_member
.
src/chat.rs
Outdated
/// Removes a contact from the chat | ||
/// my updating the `remove_timestamp`. | ||
pub(crate) async fn remove_from_chat_contacts_table( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Idea(non-blocking): What about putting the algorithm (i.e. the content of #6401) into a document and linking to it from remove_from_chat_contacts_table()
, apply_group_changes()
, etc? Alternatively, just linking to the issue?
+1
Anyway we need |
44ac0c8
to
94d81d1
Compare
28e198b
to
3522b2a
Compare
Having multiple additions/removals in one message would be nice, but is not compatible to existing DC clients, so definitely not something for this and follow-up PRs. I also think it should not be done by including the same header multiple times. Including the same header several times is not good for header protection, current header protection draft does not expect this and general assumption is that headers other than |
6fa1f87
to
e2616c7
Compare
src/chat.rs
Outdated
FROM chats_contacts cc | ||
LEFT JOIN contacts c | ||
ON c.id=cc.contact_id | ||
WHERE cc.chat_id=? AND cc.add_timestamp < cc.remove_timestamp | ||
ORDER BY c.id=1, c.last_seen DESC, c.id DESC;", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe better sort by remove_timestamp
? But then we shouldn't bump it unnecessarily in other places
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't matter much, in fact, we could just not sort it at all, whatever - we can discuss about this if and when we decide to show this info in the UI.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True, but we shouldn't bump remove_timestamp
just so, otherwise we need to wait for 60 days (timestamp expiration period) so that the sorting is correct if we decide to sort by remove_timestamp
. Reminding about this just in case
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where do we bump remove_timestamp
unnecessarily?
If we get another timestamp that says someone was removed recently, we need to update remove_timestamp
so it is more effective.
I want to add 60-day expiration after this PR so will look into corner cases anyway, but if you already see some then comments are helpful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where do we bump
remove_timestamp
unnecessarily?
It seems that only in update_chat_contacts_table()
, but AFAIU this function is only for compatibility with old Delta Chat and MUA, so maybe that's not critical.
@@ -5,4 +5,5 @@ Msg#11: info (Contact#Contact#Info): Member [email protected] added. [NOTICED][IN | |||
Msg#12: info (Contact#Contact#Info): Member fiona ([email protected]) removed. [NOTICED][INFO] | |||
Msg#13: bob (Contact#Contact#11): Member [email protected] added by bob ([email protected]). [FRESH][INFO] | |||
Msg#14: Me (Contact#Contact#Self): You added member fiona ([email protected]). [INFO] o | |||
Msg#15: bob (Contact#Contact#11): Member fiona ([email protected]) removed by bob ([email protected]). [FRESH][INFO] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks not good, maybe add such no-op messages as hidden=1
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the same question as the removal of test_no_op_member_added_is_trash
, see the answer above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree, but if fiona remains in the group, such a message looks like a false statement, maybe we should sort such "delayed" info messages above? We can use the removal timestamp instead of the "Date" header for sorting. For this we can e.g. return from apply_group_chages()
not only the "better message" text, but also its timestamp.
@@ -3470,7 +3481,7 @@ pub async fn create_group_chat( | |||
|
|||
let chat_id = ChatId::new(u32::try_from(row_id)?); | |||
if !is_contact_in_chat(context, chat_id, ContactId::SELF).await? { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Btw, what is the purpose of this check? Does the result depend on protect
somehow?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks unnecessary, could be removed I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exciting, looking forward to seeing how this works in the real world!! I tested it manually on my phone with a very simple group of 3 members, and it worked.
@@ -3618,17 +3637,21 @@ pub(crate) async fn add_to_chat_contacts_table( | |||
Ok(()) | |||
} | |||
|
|||
/// remove a contact from the chats_contact table | |||
/// Removes a contact from the chat | |||
/// my updating the `remove_timestamp`. | |||
pub(crate) async fn remove_from_chat_contacts_table( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, yes, of course I meant remove_from_chat_members
, remove_from_chat_in_db
, or similar
This implements new group consistency algorithm described in <#6401> New `Chat-Group-Member-Timestamps` header is added to send timestamps of member additions and removals. Member is part of the chat if its addition timestamp is greater or equal to the removal timestamp.
a11f958
to
de63527
Compare
This implements the basic group consistency algorithm described in #6401
It is already ready for review because all the tests pass. Removing tombstones after 60 days and a test for it will be another PR. Important note about handling inactive groups to avoid-readding members if someone restores old backup is at #6401 (comment)
There are some attempts to workaround the case when "member added" (
0a0e715) or "member removed" (1855f84) messages fail to be sent. I am removing them here, going to update
add_timestamp
orremove_timestamp
locally and then do an attempt to send the message out. If the message is not sent because of missing peerstate or because the app is killed, this is the same as having the network lose the message. Especially in the case of removing multiple members by making multiple parallel JSON-RPC or FFI calls it seems better to update the timestamp locally and then send out local state than sending out multiple "member removed" messages which each remove one member but not all the others, not sure if it even worked properly in case the bot or the UI requests multiple member removal in parallel.