implement simple moka cache for all dereference() calls #55
Conversation
This looks like it would essentially be a cache for database reads, as network calls are made at most once every 24h due to the existing refetch limit. So I think it's better to set the right cache-control headers on Lemmy HTTP endpoints, and allow admins to configure caching in nginx.
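For illustration, a minimal sketch of what such a header could look like on an actix-web endpoint (the route, handler, and max-age here are made up, not Lemmy's actual code):

```rust
use actix_web::{get, http::header::{CacheControl, CacheDirective}, HttpResponse};

// Hypothetical ActivityPub object endpoint: the Cache-Control header lets a
// reverse proxy (e.g. nginx with proxy_cache) serve repeated fetches itself.
#[get("/apub/object/{id}")]
async fn get_object() -> HttpResponse {
  HttpResponse::Ok()
    .insert_header(CacheControl(vec![
      CacheDirective::Public,
      CacheDirective::MaxAge(3600), // let proxies cache for one hour
    ]))
    .body(r#"{"type": "Note"}"#)
}
```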
It also fixes the race condition when two lookups happen at the same time. That happened a lot for me when perf testing; not sure how much it happens in prod. I'm working on some benchmarks that I'll post here. I think it could definitely make an impact, considering that when I test by receiving 100k incoming federation events, lemmy does 11,170,454 database queries (>100 per event)!
That sounds really extreme. Can you tell what it's reading from the db so much?
Yes. Check the "Top Queries by call count" here: https://github.com/phiresky/lemmy-perf-data/blob/main/comparisons/2023-07-03-pg-synchronous-commit.md. It's a different experiment, but it reflects the state of the lemmy main branch. The total is less than the number I quoted above because previously I uploaded 100k posts/comments, while now it's 70k votes and 30k posts/comments (more realistic). I'll do the experiment to compare the impact of this caching PR later.
Okay, it looks like the votes in particular do a lot of database reads. Also, the inbox needs to read actor keys from the db. I was thinking that it might be possible to optimize these things in the code without relying on caching, but that looks very tricky. The current cache time of 10s is very short though and won't make any noticeable difference. I think one hour would be totally fine. Cache capacity could also be higher, but it depends on memory usage. I'm all for simplifying the trait bounds. Edit: I also added some caching in Lemmy for blocklist and dead-instance list reads: LemmyNet/lemmy#3427 (comment)
```rust
#[serde(untagged)]
pub enum SearchableObjects {
  Person(Person),
  Note(Note)
}
```
```rust
#[derive(Debug, Clone)]
pub struct Error(pub(crate) Arc<anyhow::Error>);
```
Leave a comment explaining why this is necessary.
```rust
Kind: Object + Send + Sync + Clone + 'static,
<Kind as Object>::Error: Clone + Send + Sync,
{
  static CACHES: OnceLock<RwLock<type_map::concurrent::TypeMap>> = OnceLock::new();
```
Does this really need two different types of lock? Smells like premature optimization.
The reason is that:
- OnceLock::get_or_init() returns an immutable reference to the inner value. It only ensures the initializer is called exactly once.
- TypeMap is basically just a HashMap, which cannot be written to concurrently (the "concurrent" in the module name just refers to the trait bounds on the typemap's value types).
- RwLock allows writing to the TypeMap through that shared reference.

Honestly, I just spent a while trying to figure out the best way to do this in general. I'd like it to not need a typemap at all (somehow abusing monomorphization to create one moka cache per type), but I couldn't figure out a better solution.
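For reference, a self-contained sketch of the pattern described above (key type and capacity simplified, not the exact PR code):

```rust
use std::sync::{OnceLock, RwLock};
use moka::future::Cache;
use type_map::concurrent::TypeMap;

// One global TypeMap holding a separate moka Cache per object type. The
// static is shared across all monomorphizations of this function, which is
// exactly why the typemap indirection is needed.
fn cache_for<Kind>() -> Cache<String, Kind>
where
  Kind: Clone + Send + Sync + 'static,
{
  static CACHES: OnceLock<RwLock<TypeMap>> = OnceLock::new();
  let caches = CACHES.get_or_init(|| RwLock::new(TypeMap::new()));

  // Fast path: a read lock is enough if a cache for this Kind already exists.
  if let Some(cache) = caches.read().unwrap().get::<Cache<String, Kind>>() {
    return cache.clone();
  }
  // Slow path: take the write lock and insert a fresh cache for this Kind.
  let mut map = caches.write().unwrap();
  if let Some(cache) = map.get::<Cache<String, Kind>>() {
    return cache.clone(); // another thread inserted it in the meantime
  }
  let cache: Cache<String, Kind> = Cache::new(1000);
  map.insert(cache.clone());
  cache
}
```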
```rust
} else {
  // The Cache::get_with method ensures that the dereference_inner method is only called once, even if the dereference method is called twice simultaneously.
  // From the moka docs: "This method guarantees that concurrent calls on the same not-existing key are coalesced into one evaluation of the init future. Only one of the calls evaluates its future, and other calls wait for that future to resolve."
  // Consideration: should an error result be stored in the cache as well? Right now: yes. Otherwise try_get_with should be used.
```
I don't think it makes sense to cache error values. I would expect errors to be temporary and to be retried the next time.
Also a nitpick: your comment lines are too long.
> I don't think it makes sense to cache error values. I would expect errors to be temporary and to be retried the next time.
The main reason I cached them is that failed lookups are very common and "intentional" in lemmy. For example, when inserting a comment it first checks whether the corresponding parent exists, and if it doesn't, it tries looking that up, which might permanently fail. That failing lookup is a fairly hot path, and spamming outgoing HTTP requests is not great.
Tbh, I can't think of a good example right now where these errors would happen "expectedly" (maybe if some instance is blacklisted, so its content is not inserted), but I think such cases do exist.
In general, I think lemmy needs to start distinguishing between ephemeral errors and permanent ones. Right now the code swallows lots of errors that should be passed up or ignored (e.g. a parsing error vs. a 404 vs. a DNS lookup error).
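To make the two options concrete, a small sketch (fetch is a hypothetical stand-in for dereference_inner):

```rust
use std::sync::Arc;
use moka::future::Cache;

// Hypothetical stand-in for the actual fetch logic.
async fn fetch(id: String) -> Result<String, anyhow::Error> {
  Ok(format!("object {id}"))
}

// What the PR currently does: the cached value is the whole Result, so
// failed lookups are served from memory too. This is what forces the
// error type to be Clone (hence the Arc).
async fn cached_including_errors(
  cache: &Cache<String, Result<String, Arc<anyhow::Error>>>,
  id: String,
) -> Result<String, Arc<anyhow::Error>> {
  cache
    .get_with(id.clone(), async move { fetch(id).await.map_err(Arc::new) })
    .await
}

// The alternative suggested above: try_get_with still coalesces concurrent
// lookups of the same key, but an Err result is not stored, so the next
// call retries the fetch.
async fn cached_ok_only(
  cache: &Cache<String, String>,
  id: String,
) -> Result<String, Arc<anyhow::Error>> {
  cache.try_get_with(id.clone(), fetch(id)).await
}
```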
```rust
/// Defines how long objects of this type should live in the in-memory cache
fn cache_time_to_live() -> Duration {
  Duration::from_secs(600)
}
```
Makes more sense to move this to the FederationConfig. Should also be disabled in debug mode imo.
I actually did this intentionally in the trait so the configuration can depend on the object type. For example, it makes sense to cache lots of communities and posts for a long time, while caching Vote objects is probably not very useful.
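As a self-contained illustration of that design (CachedObject, Community, and Comment are stand-ins, not the crate's real types):

```rust
use std::time::Duration;

// The TTL is a trait method with a default, so each type can override it.
trait CachedObject {
  fn cache_time_to_live() -> Duration {
    Duration::from_secs(600) // default from the diff above
  }
}

struct Community;
struct Comment;

impl CachedObject for Community {
  fn cache_time_to_live() -> Duration {
    Duration::from_secs(3600) // communities change rarely: cache much longer
  }
}

// Comments keep the 600s default.
impl CachedObject for Comment {}

fn main() {
  assert_eq!(Community::cache_time_to_live(), Duration::from_secs(3600));
  assert_eq!(Comment::cache_time_to_live(), Duration::from_secs(600));
}
```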
Votes don't implement the Object trait, so they won't get cached either way.
Ah, ok. But still, does what I wrote make sense? It probably makes sense to have pretty much all communities in the cache but only the most recent 1000 posts or so, while caching comments is potentially not too useful (since those are mostly queried in batches or together with a post).
I agree that it should probably also be configurable by the user somehow, or depend on system memory.
I think you might be right after all that it makes more sense to do this closer to the database. When benchmarking this PR I'm seeing around 30% fewer DB queries, but a lot of the lemmy code does the exact same lookups without going through the .dereference() function. So it might make more sense to build this around the Crud trait or similar; that way all the code can benefit. It also means the cache can catch all db updates immediately and stay current. I'll look later and maybe open a PR against lemmy itself.
Yes, a general cache for database reads might be good. I wonder how that could be done.
I'll see if I can make a "simple" draft PR against lemmy for that. It would probably be very similar to this one.
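Roughly, the idea could look like this (PostId, Post, and db_read_post are hypothetical; Lemmy's actual Crud trait is more involved):

```rust
use moka::future::Cache;

type PostId = i32;

#[derive(Clone)]
struct Post {
  id: PostId,
}

// Stand-in for the real database query behind a Crud-style read.
async fn db_read_post(id: PostId) -> Post {
  Post { id }
}

// Wrapping the read path in a moka cache: get_with coalesces concurrent
// reads of the same id and serves repeats from memory, so every caller
// benefits, not just .dereference(). Writes would have to invalidate or
// update the entry.
async fn read_post_cached(cache: &Cache<PostId, Post>, id: PostId) -> Post {
  cache.get_with(id, db_read_post(id)).await
}
```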
This PR caches the dereference() function in memory (so it caches both DB fetches and HTTP fetches). The cache is separate per object type and is limited both by a time-to-live (default 10s) and by an object count (default 1000).
Since dereference() is a core mechanic of lemmy, this should improve the performance of both normal API calls and incoming federation handling.
The main pain is that both the Object type and the Object::Error type now need to be Send + Sync + Clone (since they are stored in the cache and can thus be accessed from multiple places simultaneously). Send + Sync isn't causing any issues, but since anyhow::Error is not Clone, this requires some really annoying changes in the examples and making LemmyError Clone (by storing the inner anyhow::Error in an Arc).
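For clarity, this is the Arc trick in isolation (a simplified stand-in for the real LemmyError):

```rust
use std::sync::Arc;

// anyhow::Error is not Clone, but Arc<anyhow::Error> is, which makes the
// wrapper cheap to clone out of the cache.
#[derive(Debug, Clone)]
struct LemmyError(Arc<anyhow::Error>);

impl From<anyhow::Error> for LemmyError {
  fn from(e: anyhow::Error) -> Self {
    LemmyError(Arc::new(e))
  }
}
```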
The cache also prevents multiple concurrent DB fetches and multiple concurrent HTTP requests to the same object, as well as race conditions in other code resulting from them.
The following I'm still planning to do:
Open questions:
- … the receive_activity function.