Add BloomTokenLog #2136

Open
wants to merge 4 commits into
base: main
from

gretchenfrage
Collaborator

@gretchenfrage gretchenfrage commented Jan 26, 2025

  • Adds default optional dependency on fastbloom
  • Adds BloomTokenLog, a new TokenLog implementation:
    • At first, it just stores elements in a hash set. In this phase, it experiences linear memory growth and neither false positives nor false negatives.
    • When the hash set would consume more memory than a configurable limit, it is converted to a bloom filter. From then on, adding elements raises the false positive rate instead of the memory usage. Memory usage therefore stays constant, and old and newly added elements are treated "fairly", rather than, say, discarding new elements or doing something more questionable.
      • The fastbloom crate is a popular bloom filter implementation with SIMD acceleration.
    • It doesn't use a bloom filter from the beginning because it would then consume its maximum memory from the moment it is initialized. With the two-phase mechanism, its memory profile is linear with a ceiling, which avoids making users pay for what they aren't using. It also has a false negative rate of zero in both phases, which is perhaps a safer default for users than one with a nonzero false negative rate.
    • The overall token log actually maintains two filters at any given point in time, each of which can independently hold elements as described above. It divides time into periods equal to the token lifetime and stores a filter for each of the two periods in which currently non-expired tokens could expire. This means the filters get reset over time, which prevents the hash set from consuming memory forever and the bloom filter from saturating until its false positive rate reaches 100%.
    • Note on false negatives: Although the bloom token log itself doesn't experience false negatives, a server using it could experience false negatives if it uses the same token keys across multiple token log instances. This could occur over time (token keys persisted, but token log not persisted, and server restarts) and/or over space (multiple servers behind a load balancer sharing the same token key but each with their own token log). These cases probably wouldn't be security vulnerabilities--RFC 9000 does permit false negatives to occur so long as they are mitigated in some sense, and an attacker that can use the servers for amplification attacks only to the extent that there are multiple load-balanced servers, or that it can get the servers to reboot, is probably still acceptably mitigated. But due to this complication I avoid mentioning the lack of false negatives in the documentation, so as not to give users a misleading impression.
  • BloomTokenLog is made the default token log. Also, the default tokens sent is raised from 0 to 2.
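
The two-phase mechanism described above can be sketched roughly as follows. This is an illustrative sketch, not the code in this PR: `ToyBloom`, `Filter`, and `TwoPhaseFilter` are hypothetical names, the toy bloom filter uses only the standard library rather than fastbloom, an entry count (`max_entries`) stands in for the configurable memory limit, and the period rotation between two filters is left out.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

/// Toy Bloom filter built on std only; a stand-in for the fastbloom crate.
struct ToyBloom {
    bits: Vec<u64>,
    num_hashes: u64,
}

impl ToyBloom {
    fn new(num_bits: usize, num_hashes: u64) -> Self {
        assert!(num_bits > 0, "a Bloom filter needs at least one bit");
        Self { bits: vec![0; (num_bits + 63) / 64], num_hashes }
    }

    /// Sets the item's bits; returns true if every one of them was already set.
    fn insert(&mut self, item: u128) -> bool {
        let total_bits = self.bits.len() as u64 * 64;
        let mut present = true;
        for seed in 0..self.num_hashes {
            // Derive one bit position per seed by hashing (seed, item).
            let mut hasher = DefaultHasher::new();
            (seed, item).hash(&mut hasher);
            let i = (hasher.finish() % total_bits) as usize;
            let mask = 1u64 << (i % 64);
            present &= (self.bits[i / 64] & mask) != 0;
            self.bits[i / 64] |= mask;
        }
        present
    }
}

/// Phase 1 is an exact hash set; phase 2 is a fixed-size Bloom filter.
enum Filter {
    Exact(HashSet<u128>),
    Approx(ToyBloom),
}

/// Hypothetical two-phase filter: linear memory growth up to a ceiling.
struct TwoPhaseFilter {
    filter: Filter,
    max_entries: usize, // assumption: stands in for the configurable memory limit
    bloom_bits: usize,
}

impl TwoPhaseFilter {
    fn new(max_entries: usize, bloom_bits: usize) -> Self {
        Self { filter: Filter::Exact(HashSet::new()), max_entries, bloom_bits }
    }

    /// Records `item`; returns true if it was seen before (or is a false positive).
    fn check_and_insert(&mut self, item: u128) -> bool {
        if let Filter::Exact(set) = &mut self.filter {
            if set.contains(&item) {
                return true;
            }
            if set.len() < self.max_entries {
                set.insert(item);
                return false;
            }
            // The set is at its ceiling: migrate every element so none is forgotten.
            let mut bloom = ToyBloom::new(self.bloom_bits, 4);
            for &old in set.iter() {
                bloom.insert(old);
            }
            self.filter = Filter::Approx(bloom);
        }
        match &mut self.filter {
            Filter::Approx(bloom) => bloom.insert(item),
            Filter::Exact(_) => unreachable!("converted to a Bloom filter above"),
        }
    }
}
```

Because every element of the hash set is re-inserted into the bloom filter at conversion time, anything recorded in the first phase is still reported as seen in the second, which is what keeps the false negative rate at zero in this sketch.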

- Adds default feature dependency on `fastbloom` crate.
- It enables a new BloomTokenLog implementation of TokenLog.
Unless the fastbloom feature is disabled, in which case it still falls
back to NoneTokenLog.
It makes sense to do this now that we have a default token log that's
actually able to accept tokens.
Tests no longer reconfigure config to use SimpleTokenLog if the
fastbloom feature is enabled, instead preferring to use the default
BloomTokenLog.
@gretchenfrage gretchenfrage marked this pull request as ready for review January 26, 2025 20:52
.validation_token
.sent(2)
.log(Arc::new(SimpleTokenLog::default()));
if cfg!(feature = "fastbloom") {
Collaborator Author

My bad, I need to invert this.

@gretchenfrage
Collaborator Author

@djc and @Ralith, some meta-discussion I'd like to explicitly invite, and this applies to #2137 as well: It may be worth evaluating whether we want to go this route in the first place (of adding these defaults). It's possible we decide we want to go a different route.

Some alternatives could include:

  • We could simply leave Quinn's default TokenLog / TokenStore to be implementations that can't utilize tokens. I could create my own crate with BloomTokenLog / TokenMemoryCache in it.
    • We could experiment with me creating them in a crate, and consider merging them into Quinn later.
  • We could try to come up with an alternative implementation of these traits which is able to use tokens in a general-purpose context but is simpler than these ones.
  • We could keep these implementations but try to shrink them a little bit through removing some of my "clever"ness, e.g. removing the "identity hasher."

Points in favor of merging these PRs:

  • It has the potential to make some Quinn traffic faster "for free"
  • I believe that these designs should generally avoid degrading performance a notable amount, and that the worst-case performance scenario is basically that they fail to benefit the user
  • Users can easily disable these

Points against merging these PRs:

  • It does give the server another worst-case 20 MiB memory buffer that could consume RAM
  • Increased complexity on Quinn for us to maintain
  • Additional dependency on a third-party crate we aren't involved in the creation or maintenance of
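
To put the memory ceiling in perspective, the textbook approximation for a bloom filter's false positive rate, (1 - e^(-k*n/m))^k for m bits, k hash functions, and n inserted elements, can be evaluated directly. The sketch below is only that general formula; `false_positive_rate` is a hypothetical helper, and any concrete inputs (for example 10 MiB per filter, assuming the 20 MiB worst case is split evenly across the two filters, or k = 4) are assumptions rather than values taken from this PR.

```rust
/// Textbook approximation of a Bloom filter's false positive rate:
/// (1 - e^(-k * n / m))^k, for m bits, k hash functions, and n inserted items.
fn false_positive_rate(bits: f64, hashes: f64, items: f64) -> f64 {
    (1.0 - (-hashes * items / bits).exp()).powf(hashes)
}
```

For a fixed filter size the rate starts at zero and rises toward 1 as elements are added, which is why resetting the filters periodically matters.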

I think it's worth putting these in here, but I'd like to hear what you think.
