
Understanding caching #1020

Open
pedrokost opened this issue Jul 26, 2015 · 4 comments

@pedrokost

Issue: caching actually slows down requests when the response contains many objects.

Without caching, I can retrieve a set of records in about 10ms:

10:22:26 log.1  | Started GET "/klubs?category=karate" for 127.0.0.1 at 2015-07-26 10:22:26 +0000
10:22:26 log.1  | Processing by Api::V1::KlubsController#index as */*
10:22:26 log.1  |   Parameters: {"category"=>"karate", "subdomain"=>"api"}
10:22:26 log.1  |   Klub Load (0.7ms)  SELECT  "klubs".* FROM "klubs" WHERE "klubs"."complete" = 't' AND ('karate' = ANY (categories)) LIMIT 30
10:22:26 log.1  | Completed 200 OK in 9ms (Views: 7.3ms | ActiveRecord: 0.7ms)

When I activate caching, the first request is somewhat slower as it needs to warm up the cache:

10:22:53 web.1  | 127.0.0.1 - - [26/Jul/2015:10:22:53 +0000] "GET /klubs?category=karate HTTP/1.1" 200 - 0.0609
10:22:54 log.1  | 
10:22:54 log.1  | 
10:22:54 log.1  | Started GET "/klubs?category=karate" for 127.0.0.1 at 2015-07-26 10:22:53 +0000
10:22:54 log.1  | Processing by Api::V1::KlubsController#index as */*
10:22:54 log.1  |   Parameters: {"category"=>"karate", "subdomain"=>"api"}
10:22:54 log.1  |   Klub Load (0.5ms)  SELECT  "klubs".* FROM "klubs" WHERE "klubs"."complete" = 't' AND ('karate' = ANY (categories)) LIMIT 30
10:22:54 log.1  | Cache read: klubs/274-20150110223731768598000/be1391c76bdddc3ce74e4e205a0357ff
10:22:54 log.1  | Cache generate: klubs/274-20150110223731768598000/be1391c76bdddc3ce74e4e205a0357ff
10:22:54 log.1  | Cache write: klubs/274-20150110223731768598000/be1391c76bdddc3ce74e4e205a0357ff
...
10:22:54 log.1  | Completed 200 OK in 39ms (Views: 33.4ms | ActiveRecord: 1.7ms)

Subsequent requests are faster, but still slower than without caching:

10:23:50 log.1  | Started GET "/klubs?category=karate" for 127.0.0.1 at 2015-07-26 10:23:49 +0000
10:23:50 log.1  | Processing by Api::V1::KlubsController#index as */*
10:23:50 log.1  |   Parameters: {"category"=>"karate", "subdomain"=>"api"}
10:23:50 log.1  |   Klub Load (0.5ms)  SELECT  "klubs".* FROM "klubs" WHERE "klubs"."complete" = 't' AND ('karate' = ANY (categories)) LIMIT 30
10:23:50 log.1  | Cache read: klubs/274-20150110223731768598000/be1391c76bdddc3ce74e4e205a0357ff
10:23:50 log.1  | Cache fetch_hit: klubs/274-20150110223731768598000/be1391c76bdddc3ce74e4e205a0357ff
...
10:23:50 log.1  | Completed 200 OK in 28ms (Views: 23.8ms | ActiveRecord: 2.0ms)

My understanding is that the slowdown comes from the fact that querying the cache store (dalli) is actually slower than generating the simple JSON for each object.

The more objects I have, the more sequential round trips to the cache store are made, which slows things down even further compared to no caching.

I was reading about Russian Doll Caching, and I think I could get a huge reduction in response time by caching the generated JSON of the whole collection as well as of the individual objects. I was quite surprised that AMS does not seem to do this.
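To make the two-layer idea concrete, here is a minimal plain-Ruby sketch (not AMS internals): a Hash stands in for the cache store, and `Klub`, the key formats, and the serializer methods are hypothetical stand-ins. A stale record changes the collection's `updated_at` maximum, which busts the outer key, while untouched inner entries are reused.

```ruby
# Toy in-memory cache standing in for Rails.cache / Dalli.
CACHE = {}

def cache_fetch(key)
  CACHE.key?(key) ? CACHE[key] : (CACHE[key] = yield)
end

# Hypothetical record with just the fields needed for cache keys.
Klub = Struct.new(:id, :name, :updated_at)

# Inner layer: one cache entry per object.
def serialize_klub(klub)
  cache_fetch("klubs/#{klub.id}-#{klub.updated_at.to_i}") do
    { id: klub.id, name: klub.name }
  end
end

# Outer layer: one entry for the whole collection, keyed on its size
# and the newest updated_at, so one changed record invalidates it.
def serialize_klubs(klubs)
  outer_key = "klubs/all-#{klubs.size}-#{klubs.map(&:updated_at).max.to_i}"
  cache_fetch(outer_key) { klubs.map { |k| serialize_klub(k) } }
end
```

On a warm cache, the outer fetch short-circuits before any per-object work happens, which is what makes the nested layout cheaper than per-object caching alone.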

My action currently holds this:

def index
  klubs = Klub.where("? = ANY (categories)", category_param).limit(30)
  render json: klubs, root: 'klubs'
end

and my serializer is:

class KlubSerializer < ActiveModel::Serializer
  cache
  attributes :id, :name, :address, :email, :latitude, :longitude, :phone, :town, :website, :slug, :facebook_url, :categories

end

I am able to greatly speed up the responses with low level caching:

data = Rails.cache.fetch("klubs/#{category_param}-#{klubs.count}-#{klubs.map(&:updated_at).max.to_i}") do
  klubs.to_json
end

render json: data

But I feel that this could be a feature of AMS.

@joaomdmoura
Member

Hi @pedrokost, indeed, we are aware of this. There are two changes we plan to make to reduce the response time:

1 - Russian Doll Caching (as you mentioned)
2 - Implement fetch_multi, which lets us fetch multiple entries with a single query

Our plan is to have both in 0.10.x
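The point of `fetch_multi` is to collapse N sequential cache round trips into one batched call. A plain-Ruby sketch of its shape (the class and counter are illustrative, not the ActiveSupport implementation): look up all keys at once, yield only for misses, and return a key => value hash, mirroring `ActiveSupport::Cache::Store#fetch_multi`.

```ruby
# Toy cache store illustrating fetch_multi semantics.
class ToyCache
  attr_reader :reads

  def initialize
    @store = {}
    @reads = 0
  end

  def fetch_multi(*keys)
    @reads += 1 # one round trip, however many keys are requested
    keys.each_with_object({}) do |key, result|
      @store[key] = yield(key) unless @store.key?(key)
      result[key] = @store[key]
    end
  end
end
```

With a real memcached store the win is exactly this: 30 serialized klubs cost one network round trip instead of 30.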

@joaomdmoura joaomdmoura self-assigned this Jul 31, 2015
@joaomdmoura joaomdmoura added this to the 0.10 milestone Jul 31, 2015
@kulte

kulte commented Aug 7, 2015

@joaomdmoura Hey man, this is a huge bottleneck for our new API. How can we help!?

@joaomdmoura
Member

Hey @kulte, we are planning to start working on this in a couple of weeks, but if you feel you could handle it, you could start by opening an issue explaining in more detail the implementation we want, with some examples or anything else worth sharing before work begins for real.
That way we can speed up the process by gathering other people's thoughts on it. Then, depending on how it goes, and if you feel comfortable with it, you could start the implementation yourself, or wait for us or someone else to take it, maybe pair, I don't know. But the starting point would be creating the issues.
Meanwhile, let us know if there is anything we can do to help you and your team.
It will be in the 0.10 version.

@bf4
Member

bf4 commented Mar 13, 2016

@pedrokost I just made a pretty exhaustive issue around this in #1586. Unfortunately, I either didn't read this issue or didn't read it carefully. I've now labeled it as a bug. read_multi was added in #1372.
