This repository has been archived by the owner on Apr 26, 2024. It is now read-only.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Clarify
(room_id, event_id)
global uniqueness #13701Clarify
(room_id, event_id)
global uniqueness #13701Changes from 4 commits
25fa1c4
14c1dd5
4fb453f
5dd24f3
893d4c5
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this seems rather sad considering they are very much not and it's not very hard to cause a conflict :/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can hope for MSC2848 🙌
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually think hash collisions from sheer probability are not much of a justifiable problem (maybe SHA256 will get defeated one day I suppose...?)
SHA256 is 256-bit, so only once you have 2^128 events would you have 0.5 probability of having a collision. That's way more events than I think anyone will ever store.
I expect the main problem is probably intentional collisions (esp in v1 rooms), where namespacing events by room means that we don't let a bad actor interfere with any rooms they're not in.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤷 Probably, maybe
The rest of my reply is just linking stuff from my own curiosity:
SHA-1 attack, https://github.blog/2017-03-20-sha-1-collision-detection-on-github-com/
Other reading:
https://spec.matrix.org/v1.1/rooms/v3/#event-ids
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a bit confused about where we've ended up on this thread: are hash collisions (feasibly) possible or not?
Room v1 and v2 have bigger problems than event-id clashes between rooms. The solution to that is to stop using v1 and v2 rooms, not to arrange the entire database schema and matrix API around a half-assed fix to it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably not feasible but I wouldn't rule it out one day.
I'm confused by this. Do we prefer
(room_id, event_id)
or not?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's the entire discussion here, and I don't think we have a clear conclusion. Personally, I don't really see the point in including
room_id
in the constraint, but mostly I'd rather we have a discussion on it than just merge a PR which takes one particular view, and justifies it using questionable arguments.(room_id, event_id)
because of hash collisions"(room_id, event_id)
is to avoid hash collisions.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was discussed in the backend chapter sync as well which also brought up #13771.
This PR captures the tribal knowledge you mentioned in,
@reivilibre's number investigation is a good enough to disprove the sheer chance that a client and server run into a collision. I'm less convinced there won't be a way to exploit things in the future (targeted attack) but we can update this part of the doc to not call it out as much.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
well, we probably need to discuss this further when I'm back from leave. #12892 moves in exactly the opposite direction to that suggested here.