
implement simple moka cache for all dereference() calls #55

Closed
wants to merge 4 commits into from

Conversation

phiresky
Contributor

@phiresky phiresky commented Jul 1, 2023

This PR caches the dereference() function in memory (so it caches both DB fetches and HTTP fetches). The cache is separate per object type and is limited both by a time to live (default 10s) and by an object count (default 1000).

Since dereference() is a core mechanic of lemmy, this should improve the performance of both normal API calls and incoming federation calls.

The main pain point is that both the Object and the Object::Error types now need to be Send + Sync + Clone (since they are stored in the cache and can thus be accessed from multiple places simultaneously). Send + Sync isn't causing any issues, but since anyhow::Error is not Clone, this requires some really annoying changes in the examples and making LemmyError Clone (by storing the inner anyhow::Error in an Arc).

The cache also prevents multiple concurrent DB fetches and multiple concurrent HTTP requests to the same object, as well as race conditions in other code resulting from them.
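The PR itself uses the moka crate for this; as a rough std-only illustration of the two bounds described above (TTL plus object count), here is a hypothetical sketch — the type and its names are made up for this example, not the PR's actual code:

```rust
use std::collections::HashMap;
use std::hash::Hash;
use std::time::{Duration, Instant};

// Hypothetical std-only sketch of a cache bounded by both a TTL and a
// maximum entry count (the PR uses moka, which does this properly).
struct BoundedTtlCache<K, V> {
    entries: HashMap<K, (V, Instant)>,
    ttl: Duration,   // e.g. the 10s default mentioned above
    capacity: usize, // e.g. the 1000-object default mentioned above
}

impl<K: Hash + Eq, V: Clone> BoundedTtlCache<K, V> {
    fn new(ttl: Duration, capacity: usize) -> Self {
        Self { entries: HashMap::new(), ttl, capacity }
    }

    // Expired entries are treated as absent.
    fn get(&self, key: &K) -> Option<V> {
        self.entries
            .get(key)
            .filter(|(_, inserted)| inserted.elapsed() < self.ttl)
            .map(|(v, _)| v.clone())
    }

    fn insert(&mut self, key: K, value: V) {
        // Crude capacity bound for illustration; moka evicts per-entry instead.
        if self.entries.len() >= self.capacity {
            self.entries.clear();
        }
        self.entries.insert(key, (value, Instant::now()));
    }
}
```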

What I'm still planning to do:

  • some testing
  • performance benchmarking (I was planning to do a simple comparison here)

Open questions:

  • Any approval of / concerns with this approach?
  • Should there be defaults for the cache count and duration? Maybe not.
  • What should the cache count and TTL duration values be for lemmy? Are there places where this caching causes issues?
  • There's now a ton of trait bounds that are passed around in 10 different places here and another 10 places in lemmy. Should we just move all of them to the Object trait? They are only required for the dereference function (as before), but I don't know if there's actually any point to not having them on the main trait. Just look at the signature of the receive_activity function.

@Nutomic
Member

Nutomic commented Jul 3, 2023

This looks like it would essentially be a cache for database reads, as network calls are made at most once every 24h due to the last_refreshed_at() check. So I'm skeptical that this will have much of a performance impact, as these db queries are very simple (selecting a single row by an indexed string column, with no joins, filters or anything).

So I think it's better to set the right cache-control headers on lemmy's HTTP endpoints, and allow admins to configure caching in nginx.

@phiresky
Contributor Author

phiresky commented Jul 3, 2023

as network calls are made at most once every 24h due to last_refreshed_at()

It also fixes the race condition when two lookups happen at the same time. That happened a lot for me when perf testing; I'm not sure how much it happens in prod.

I'm working on some benchmarks that I'll post here. I think it could definitely make an impact, considering that when I test by receiving 100k incoming federation events, lemmy does 11,170,454 database queries (>100 per event)!

@Nutomic
Member

Nutomic commented Jul 3, 2023

That sounds really extreme. Can you tell what it's reading from the db so much?

@phiresky
Contributor Author

phiresky commented Jul 3, 2023

Yes. Check the "Top Queries by call count" here: https://github.com/phiresky/lemmy-perf-data/blob/main/comparisons/2023-07-03-pg-synchronous-commit.md. It's a different experiment, but it reflects the state of the lemmy main branch. Both the select * from person where actor_id=X and the select * from post where ap_id=X queries seem to be called around three times for every incoming fed event (vote, comment, post).

It's less than the number I quoted above because previously I uploaded 100k posts/comments; now it's 70k votes and 30k posts/comments (more realistic).

I'll do the experiment to compare the impact of this caching PR later.

@Nutomic
Member

Nutomic commented Jul 4, 2023

Okay, it looks like the votes in particular do a lot of database reads. Also, the inbox needs to read actor keys from the db. I was thinking that it might be possible to optimize these things in the code without relying on caching, but that looks very tricky.

The current cache time of 10s is very short though and won't make any noticeable difference. I think one hour would be totally fine. Cache capacity could also be higher, but it depends on memory usage.

I'm all for simplifying the trait bounds.

Edit: I also added some caching in Lemmy for blocklist and dead instance list reads: LemmyNet/lemmy#3427 (comment)

#[serde(untagged)]
pub enum SearchableObjects {
    Person(Person),
    Note(Note),
}

#[derive(Debug, Clone)]
pub struct Error(pub(crate) Arc<anyhow::Error>);
Member

Leave a comment why this is necessary.

Kind: Object + Send + Sync + Clone + 'static,
<Kind as Object>::Error: Clone + Send + Sync,
{
static CACHES: OnceLock<RwLock<type_map::concurrent::TypeMap>> = OnceLock::new();
Member

@Nutomic Nutomic Jul 4, 2023

Does this really need two different types of lock? Smells like premature optimization.

Contributor Author

The reason is that:

  1. OnceLock::get_or_init() returns an immutable reference to the inner value. It only cares about making sure the initializer is called only once.
  2. TypeMap is basically just a HashMap, which cannot be written to concurrently. (The "concurrent" in the package name just refers to the trait bounds on the value type of the typemap.)
  3. RwLock allows writing to the TypeMap through a shared reference.

Honestly, I just spent a while trying to figure out the best way to do this in general. I'd like it to not need a typemap at all (somehow abuse monomorphization to create one moka cache per type) but couldn't figure out a better solution.
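For reference, the pattern being discussed can be sketched with std types only, with a plain HashMap<TypeId, Box<dyn Any>> standing in for type_map (the function name and the use of Default are hypothetical, for illustration):

```rust
use std::any::{Any, TypeId};
use std::collections::HashMap;
use std::sync::{OnceLock, RwLock};

// One value per type behind a global OnceLock + RwLock, with a
// HashMap<TypeId, _> standing in for type_map (names hypothetical).
fn per_type_value<T: Any + Send + Sync + Clone + Default>() -> T {
    static MAP: OnceLock<RwLock<HashMap<TypeId, Box<dyn Any + Send + Sync>>>> =
        OnceLock::new();
    // OnceLock only guarantees one-time init and hands out a shared reference...
    let map = MAP.get_or_init(|| RwLock::new(HashMap::new()));
    // ...so reads go through the RwLock read guard...
    if let Some(v) = map.read().unwrap().get(&TypeId::of::<T>()) {
        return v.downcast_ref::<T>().unwrap().clone();
    }
    // ...and inserting the per-type value needs the write guard.
    map.write()
        .unwrap()
        .entry(TypeId::of::<T>())
        .or_insert_with(|| Box::new(T::default()))
        .downcast_ref::<T>()
        .unwrap()
        .clone()
}
```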

} else {
// The Cache.get_with method ensures that the dereference_inner method is only called once even if the dereference method is called twice simultaneously.
// From the moka docs: "This method guarantees that concurrent calls on the same not-existing key are coalesced into one evaluation of the init future. Only one of the calls evaluates its future, and other calls wait for that future to resolve."
// Considerations: should an error result be stored in the cache as well? Right now: yes. Otherwise try_get_with should be used.
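The coalescing guarantee quoted from the moka docs can be sketched in std-only form (this type is illustrative, not moka's implementation): concurrent callers for the same missing key share a single cell, so only one init closure runs and the others wait for its result.

```rust
use std::collections::HashMap;
use std::hash::Hash;
use std::sync::{Arc, Mutex, OnceLock};

// Illustrative stand-in for moka's get_with: one OnceLock per key means the
// init closure runs at most once, and concurrent callers block on its result.
struct CoalescingCache<K, V> {
    cells: Mutex<HashMap<K, Arc<OnceLock<V>>>>,
}

impl<K: Hash + Eq, V: Clone> CoalescingCache<K, V> {
    fn new() -> Self {
        Self { cells: Mutex::new(HashMap::new()) }
    }

    fn get_with(&self, key: K, init: impl FnOnce() -> V) -> V {
        // Hold the map lock only long enough to find or create the cell.
        let cell = Arc::clone(
            self.cells
                .lock()
                .unwrap()
                .entry(key)
                .or_insert_with(|| Arc::new(OnceLock::new())),
        );
        // get_or_init runs init only for the first caller; later and
        // concurrent callers just read the stored value.
        cell.get_or_init(init).clone()
    }
}
```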
Member

I don't think it makes sense to cache error values. I would expect errors to be temporary and retried the next time.

Member

Also, nitpick: your comment lines are too long.

Contributor Author

I don't think it makes sense to cache error values. I would expect errors to be temporary and retried the next time.

The main reason I cached them is that failed lookups are very common and "intentional" in lemmy. For example, when inserting a comment it first looks up whether the corresponding parent exists, and if it doesn't, it tries looking that up, which might permanently fail. That failing lookup is a fairly hot path, and spamming outgoing HTTP requests is not great.

Tbh, I can't think of a good example right now where these errors would happen "expectedly" (maybe if some instance is blacklisted so its content is not inserted), but I think such cases do exist.

In general, I think lemmy needs to start distinguishing between ephemeral errors and permanent ones. Right now the code swallows lots of errors that should be passed up or ignored (e.g. parsing error vs. 404 error vs. DNS lookup error).
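The policy difference under discussion can be sketched in a hypothetical single-threaded std-only form (names made up here; moka's get_with vs try_get_with make the analogous distinction): caching the whole Result memoizes failures too, while caching only Ok values retries the lookup on every call.

```rust
use std::cell::RefCell;
use std::collections::HashMap;

// Hypothetical single-threaded sketch contrasting the two error policies.
struct LookupCache {
    // Caching the full Result (as the PR currently does) memoizes errors too.
    results: RefCell<HashMap<String, Result<u64, String>>>,
    // Caching only Ok values means a failed lookup is retried next call.
    oks: RefCell<HashMap<String, u64>>,
}

impl LookupCache {
    fn new() -> Self {
        Self { results: RefCell::new(HashMap::new()), oks: RefCell::new(HashMap::new()) }
    }

    // Analogue of get_with: whatever init returned, Ok or Err, is cached.
    fn get_with(&self, key: &str, init: impl FnOnce() -> Result<u64, String>) -> Result<u64, String> {
        self.results
            .borrow_mut()
            .entry(key.to_string())
            .or_insert_with(init)
            .clone()
    }

    // Analogue of try_get_with: only Ok results are cached, errors rerun init.
    fn try_get_with(&self, key: &str, init: impl FnOnce() -> Result<u64, String>) -> Result<u64, String> {
        if let Some(v) = self.oks.borrow().get(key) {
            return Ok(*v);
        }
        let res = init();
        if let Ok(v) = &res {
            self.oks.borrow_mut().insert(key.to_string(), *v);
        }
        res
    }
}
```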

/// Defines how long objects of this type should live in the in-memory cache
fn cache_time_to_live() -> Duration {
    Duration::from_secs(600)
}
Member

It makes more sense to move this to the FederationConfig. It should also be disabled in debug mode imo.

Contributor Author

I actually did this intentionally in the trait so the configuration can depend on the object type. For example, it makes sense to cache a lot of communities and posts, and to cache them for a long time, while caching Vote objects is probably not very useful.
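A sketch of that per-type configuration, with hypothetical types standing in for real lemmy objects (only the 600s default mirrors the diff above; everything else is made up for illustration):

```rust
use std::time::Duration;

// Hypothetical trait and types illustrating per-object cache configuration.
trait CacheTtl {
    fn cache_time_to_live() -> Duration {
        Duration::from_secs(600) // trait-level default, as in the diff above
    }
}

struct Community;
struct Post;

impl CacheTtl for Community {
    // Few communities, rarely changing: cache them for a long time.
    fn cache_time_to_live() -> Duration {
        Duration::from_secs(3600)
    }
}

// Posts just keep the 600s default from the trait.
impl CacheTtl for Post {}
```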

Member

Votes don't implement the Object trait, so they won't get cached either way.

Contributor Author

Ah, ok. But still, does what I wrote make sense? It probably makes sense to have pretty much all communities in the cache but only the most recent 1000 posts or so, while caching comments is potentially not too useful (since those are mostly queried in batches or together with a post).

I agree that it should probably also somehow be configurable by the user, or somehow dependent on system memory.

@phiresky
Contributor Author

phiresky commented Jul 4, 2023

This looks like it would essentially be a cache for database reads

I think you might be right after all that it makes more sense to do this closer to the database. When benchmarking this I'm seeing around a 30% reduction in DB queries, but a lot of the lemmy code does the exact same lookups without going through the .dereference() function, so those don't benefit.

So it might make more sense to build this whole thing around the Crud trait or similar; that way all the code can benefit. It also means the cache can catch all db updates immediately and update itself. I'll look later and maybe open a PR against lemmy itself.

@Nutomic
Member

Nutomic commented Jul 4, 2023

Yes, a general cache for database reads might be good. I wonder how that could be done.

@phiresky
Contributor Author

phiresky commented Jul 4, 2023

I'll see if I can make a "simple" draft PR against lemmy for that. Probably very similar to this one.

@Nutomic Nutomic closed this Jul 19, 2023