Add `BloomTokenLog` #2136

gretchenfrage · 2025-01-26T19:56:56Z

Adds default optional dependency on fastbloom
Adds BloomTokenLog, a new TokenLog implementation:
- At first, it just stores elements in a hash set. In this phase, it experiences linear memory growth and neither false positives nor false negatives.
- When the hash set would consume more memory than a configurable limit, it is converted to a bloom filter. From then on, adding elements raises the false positive rate instead of the memory usage. Memory usage stays constant, and old and newly added elements are treated "fairly", rather than, say, discarding new elements or doing something more questionable.
  - The fastbloom crate is a popular bloom filter implementation with SIMD acceleration.
- It doesn't use a bloom filter from the beginning because it would then consume its maximum memory from the moment it is initialized. Overall, this two-phase mechanism means that its memory profile is linear with a ceiling, which avoids making users pay for what they're not using. It also has a false negative rate of zero in both phases, which is perhaps a safer default for users than one with false negatives.
- The overall token log actually maintains two filters at any given point in time, each of which can independently hold elements as described above. It divides time into periods equal to the token lifetime and stores a filter for each of the two periods in which currently non-expired tokens could expire. This means the filters get reset over time, which prevents the hash set from consuming memory forever and the bloom filter from saturating until its false positive rate reaches 100%.
- Note on false negatives: Although the bloom token log itself doesn't experience false negatives, a server using it could experience false negatives if it uses the same token keys across multiple token log instances. This could occur over time (token keys persisted, but token log not persisted, and the server restarts) and/or over space (multiple servers behind a load balancer sharing the same token key, but each with its own token log). These cases probably wouldn't be security vulnerabilities: RFC 9000 does permit false negatives so long as they are mitigated in some sense, and an attacker who can use the servers for amplification attacks only to the extent that there are multiple load-balanced servers, or that it can get the servers to reboot, is probably still acceptably mitigated. But because of this complication, I avoid mentioning the lack of false negatives in the documentation, so as not to give users a misleading impression.
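
To make the two-phase idea above concrete, here is a minimal sketch using only the standard library. It is not the code in this PR and does not use fastbloom's API: `ToyBloom`, `TwoPhaseFilter`, the `u128` token representation, the probe count of 4, and the payload-only memory estimate are all assumptions made for illustration.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};
use std::mem::size_of;

/// Toy bloom filter (illustrative only): a fixed bit array with `k` probes per element.
struct ToyBloom {
    bits: Vec<u64>,
    k: u32,
}

impl ToyBloom {
    fn new(bytes: usize, k: u32) -> Self {
        // Use at least one 64-bit word so the modulo below is never by zero.
        Self { bits: vec![0; (bytes / 8).max(1)], k }
    }

    /// Word index and bit mask of the `i`-th probe for `item`.
    fn probe(&self, item: u128, i: u32) -> (usize, u64) {
        let mut h = DefaultHasher::new();
        (item, i).hash(&mut h);
        let bit = h.finish() % (self.bits.len() as u64 * 64);
        ((bit / 64) as usize, 1u64 << (bit % 64))
    }

    /// Sets the element's bits; returns true if all of them were already set.
    fn insert(&mut self, item: u128) -> bool {
        let mut present = true;
        for i in 0..self.k {
            let (word, mask) = self.probe(item, i);
            present &= (self.bits[word] & mask) != 0;
            self.bits[word] |= mask;
        }
        present
    }
}

enum TwoPhase {
    /// Phase 1: exact, memory grows linearly with the number of elements.
    Set(HashSet<u128>),
    /// Phase 2: constant memory, false positive rate grows instead.
    Bloom(ToyBloom),
}

struct TwoPhaseFilter {
    state: TwoPhase,
    max_bytes: usize,
}

impl TwoPhaseFilter {
    fn new(max_bytes: usize) -> Self {
        Self { state: TwoPhase::Set(HashSet::new()), max_bytes }
    }

    /// Records `item`; returns true if it was, or appears to have been, seen before.
    fn check_and_insert(&mut self, item: u128) -> bool {
        let max_bytes = self.max_bytes;
        let set = match &mut self.state {
            TwoPhase::Bloom(bloom) => return bloom.insert(item),
            TwoPhase::Set(set) => set,
        };
        if set.contains(&item) {
            return true;
        }
        // Rough estimate that counts payload bytes only (an assumption of this sketch).
        if (set.len() + 1) * size_of::<u128>() <= max_bytes {
            set.insert(item);
            return false;
        }
        // Over budget: move every old element into a bloom filter of the budgeted size,
        // so nothing inserted earlier can become a false negative.
        let mut bloom = ToyBloom::new(max_bytes, 4);
        for &old in set.iter() {
            bloom.insert(old);
        }
        let seen = bloom.insert(item);
        self.state = TwoPhase::Bloom(bloom);
        seen
    }
}
```

With a 64-byte budget, the first four tokens are tracked exactly and the fifth triggers the conversion. After that, every previously inserted token is still reported as seen, while unseen tokens may occasionally be reported as seen too.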
BloomTokenLog is made the default token log. Also, the default number of tokens sent is raised from 0 to 2.
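
The period rotation described in the list above can be sketched in the same spirit. This is again an illustration under stated assumptions rather than the PR's implementation: `RotatingLog` and `Check` are hypothetical names, time is passed in as plain seconds, and a `HashSet` stands in for each two-phase filter.

```rust
use std::collections::HashSet;

/// Outcome of presenting a token (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
enum Check {
    Fresh,
    Reused,
    /// Expired, or outside the two periods currently tracked.
    Rejected,
}

struct RotatingLog {
    lifetime_secs: u64,
    /// Index of the period covered by `filters[0]`; `filters[1]` covers the next one.
    first_period: u64,
    filters: [HashSet<u128>; 2],
}

impl RotatingLog {
    fn new(lifetime_secs: u64) -> Self {
        assert!(lifetime_secs > 0, "period length must be non-zero");
        Self { lifetime_secs, first_period: 0, filters: [HashSet::new(), HashSet::new()] }
    }

    fn check_and_insert(&mut self, token: u128, issued_secs: u64, now_secs: u64) -> Check {
        let expires = issued_secs + self.lifetime_secs;
        if expires <= now_secs {
            return Check::Rejected;
        }
        // Rotate: filters for periods that have fully elapsed are reset.
        let now_period = now_secs / self.lifetime_secs;
        if now_period > self.first_period {
            if now_period == self.first_period + 1 {
                // The old "next" filter becomes the current one.
                self.filters.swap(0, 1);
            } else {
                // More than one period passed: nothing tracked is still valid.
                self.filters[0].clear();
            }
            self.filters[1].clear();
            self.first_period = now_period;
        }
        // A non-expired token expires in the current period or the one after it.
        let expiry_period = expires / self.lifetime_secs;
        let slot = match expiry_period.checked_sub(self.first_period) {
            Some(i) if i < 2 => i as usize,
            _ => return Check::Rejected,
        };
        if self.filters[slot].insert(token) {
            Check::Fresh
        } else {
            Check::Reused
        }
    }
}
```

Because each filter only holds tokens that expire within its own period, a filter can be dropped wholesale once that period ends, which is what bounds both the memory use and the saturation.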

gretchenfrage · 2025-01-27T00:35:35Z

quinn-proto/src/tests/util.rs

-        .validation_token
-        .sent(2)
-        .log(Arc::new(SimpleTokenLog::default()));
+    if cfg!(feature = "fastbloom") {


My bad, I need to invert this.
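
A minimal sketch of what the inverted condition might look like. The helper name is hypothetical, and the returned strings only stand in for the real configuration calls, which are not reproduced here.

```rust
/// Hypothetical helper: names the token log the test config would end up with.
fn test_token_log_name() -> &'static str {
    if !cfg!(feature = "fastbloom") {
        // Without the feature, the default log cannot accept tokens,
        // so the tests would have to opt in to SimpleTokenLog explicitly.
        "SimpleTokenLog"
    } else {
        // With the feature, the tests keep the default.
        "BloomTokenLog"
    }
}
```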

gretchenfrage · 2025-01-30T23:36:57Z

@djc and @Ralith, some meta-discussion I'd like to explicitly invite, and this applies to #2137 as well: It may be worth evaluating whether we want to go this route in the first place (of adding these defaults). It's possible we decide we want to go a different route.

Some alternatives could include:

- We could simply leave Quinn's default TokenLog / TokenStore to be implementations that can't utilize tokens. I could create my own crate with BloomTokenLog / TokenMemoryCache in it.
  - We could experiment with me creating them in a crate, and consider merging them into Quinn later.
- We could try to come up with an alternative implementation of these traits which is able to use tokens in a general-purpose context but is simpler than these ones.
- We could keep these implementations but try to shrink them a little bit by removing some of my "clever"ness, e.g. removing the "identity hasher."

Points in favor of merging these PRs:

- It has the potential to make some Quinn traffic faster "for free"
- I believe that these designs should generally avoid degrading performance by a notable amount, and that the worst-case performance scenario is basically that they fail to benefit the user
- Users can easily disable these

Points against merging these PRs:

- It does give the server another worst-case 20 MiB memory buffer that could consume RAM
- Increased complexity in Quinn for us to maintain
- Additional dependency on a third-party crate whose creation and maintenance we aren't involved in

I think it's worth putting these in here, but I'd like to hear what you think.

gretchenfrage force-pushed the bloom-token-log branch from f96e7c5 to 9d19830 Compare January 26, 2025 20:04

gretchenfrage mentioned this pull request Jan 26, 2025

Roadmap for NEW_TOKEN utilization #2096

Open

13 tasks

Add BloomTokenLog

78ede4b

- Adds default feature dependency on `fastbloom` crate.
- It enables a new BloomTokenLog implementation of TokenLog.

gretchenfrage force-pushed the bloom-token-log branch from 9d19830 to 0b4e414 Compare January 26, 2025 20:07

gretchenfrage added 3 commits January 26, 2025 14:44

Make BloomTokenLog the default TokenLog

8ac9a83

Unless the fastbloom feature is disabled, in which case it still falls back to NoneTokenLog.

Set default tokens sent to 2 if fastbloom enabled

7e0a49e

It makes sense to do this now that we have a default token log that's actually able to accept tokens.

test(proto): Use default BloomTokenLog

d855d1f

Tests no longer reconfigure config to use SimpleTokenLog if the fastbloom feature is enabled, instead preferring to use the default BloomTokenLog.

gretchenfrage force-pushed the bloom-token-log branch from 0b4e414 to d855d1f Compare January 26, 2025 20:44

gretchenfrage marked this pull request as ready for review January 26, 2025 20:52

gretchenfrage requested review from djc and Ralith as code owners January 26, 2025 20:52
