Reducing MongoDB traffic by 78% with Redis
Posted by Jeff Seibert on Nov 12, 2012 in Engineering, Featured, Insight, Technology
TL;DR: 31 lines of Rack middleware leverage Redis for highly performant and flexible response caching.
As Crashlytics has scaled, we’ve always been on the lookout for ways to drastically reduce the load on our systems. We recently brought production Redis servers online for some basic analytics tracking and we’ve been extremely pleased with their performance and stability. This weekend, it was time to give them something a bit more load-intensive to chew on.
The vast majority – roughly 90% – of inbound traffic to our servers is destined for the same place. Our client-side SDK, embedded in apps on hundreds of millions of devices worldwide, periodically loads configuration settings that power many of our advanced features. These settings vary by app and app version, but are otherwise identical across devices – a prime candidate for caching.
There are countless built-in and third-party techniques for Rails caching, but we sought something simple that could leverage the infrastructure we already had. Wouldn’t it be great if we could specify a cache duration in any Rails action and it would “just work”?
    cache_response_for 10.minutes
Rack Middleware to the Rescue
One of the most powerful features of Rack-based Rails is middleware – functionality you can inject into the request processing logic to adjust how a request is handled. Middleware lets us check Redis for a cached response first and fall through to the standard Rails action only when nothing is cached.
    class RackRedisCache
      def initialize(rails)
        @rails = rails
      end

      def call(env)
        cache_key = "rack::redis-cache::#{env['ORIGINAL_FULLPATH']}"

        data = REDIS.hgetall(cache_key)
        if data['status'] && data['body']
          Rails.logger.info "Completed #{data['status'].to_i} from Redis cache"
          [data['status'].to_i, JSON.parse(data['headers']), [data['body']]]
        else
          @rails.call(env).tap do |response|
            response_status, response_headers, response_body = *response
            response_cache_duration = response_headers.delete('Rack-Cache-Response-For').to_i

            if response_cache_duration > 0
              REDIS.hmset(cache_key,
                'status', response_status,
                'headers', response_headers.to_json,
                'body', response_body.body
              )

              REDIS.expire(cache_key, response_cache_duration)
              Rails.logger.info "Cached response to Redis for #{response_cache_duration} seconds."
            end
          end
        end
      end
    end
A response in Rails consists of 3 components – the HTTP status, HTTP headers, and of course, the response body. For clarity, we store these under separate keys within a Hash in Redis, JSON-encoding the headers to convert them into a string.
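To make the storage layout concrete, here is a minimal sketch of the round trip between a Rack-style response and the three hash fields. It uses a plain Ruby Hash in place of Redis, and the helper names `serialize_response` and `deserialize_response` are hypothetical, not part of the middleware above. Because Redis hash values are strings, the status comes back as "200" and has to be converted with `to_i`.

```ruby
require 'json'

# Flatten a Rack-style response into the three string fields kept in the hash.
# Redis hash values are strings, so the status is stringified and the headers
# are JSON-encoded.
def serialize_response(status, headers, body)
  {
    'status'  => status.to_s,
    'headers' => JSON.generate(headers),
    'body'    => body
  }
end

# Rebuild the [status, headers, body] triple that Rack expects.
def deserialize_response(data)
  [data['status'].to_i, JSON.parse(data['headers']), [data['body']]]
end

stored = serialize_response(200, { 'Content-Type' => 'application/json' }, '{"enabled":true}')
restored = deserialize_response(stored)
```

Keeping the fields separate means a cache hit only needs one `hgetall` and a single JSON parse of the headers; the body is returned exactly as it was stored.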
If the cache key is not present, the middleware falls through to the action and then checks an internal header value to determine whether the action wants its response cached. The final critical line leverages Redis’ key expiration functionality to ensure the cached response is only valid for the requested amount of time. It couldn’t get much simpler.
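The miss, hit, and expiry paths described above can be traced without a Redis server. The sketch below is an illustration under stated assumptions, not the production code: `FakeRedis` is a hypothetical in-memory stand-in that implements only `hgetall`, `hmset`, and `expire` with an injectable clock, `SketchCache` is a simplified copy of the middleware that takes its store as an argument, and the response body is a plain array of strings rather than a Rails response object.

```ruby
require 'json'

# Hypothetical in-memory stand-in for the three Redis calls the middleware uses.
# The clock is injected so expiry can be demonstrated without sleeping.
class FakeRedis
  def initialize(clock)
    @clock = clock
    @hashes = {}
    @deadlines = {}
  end

  def hgetall(key)
    purge(key)
    @hashes.fetch(key, {}).dup
  end

  def hmset(key, *pairs)
    purge(key)
    hash = (@hashes[key] ||= {})
    pairs.each_slice(2) { |field, value| hash[field.to_s] = value.to_s }
  end

  def expire(key, seconds)
    @deadlines[key] = @clock.call + seconds if @hashes.key?(key)
  end

  private

  # Drop the key once its deadline has passed, as key expiration would.
  def purge(key)
    deadline = @deadlines[key]
    return unless deadline && @clock.call >= deadline
    @hashes.delete(key)
    @deadlines.delete(key)
  end
end

# Simplified version of the middleware with the store passed in explicitly.
class SketchCache
  def initialize(app, store)
    @app = app
    @store = store
  end

  def call(env)
    key = "rack::redis-cache::#{env['ORIGINAL_FULLPATH']}"
    data = @store.hgetall(key)
    if data['status'] && data['body']
      return [data['status'].to_i, JSON.parse(data['headers']), [data['body']]]
    end

    status, headers, body = @app.call(env)
    ttl = headers.delete('Rack-Cache-Response-For').to_i
    if ttl > 0
      @store.hmset(key, 'status', status, 'headers', headers.to_json, 'body', body.join)
      @store.expire(key, ttl)
    end
    [status, headers, body]
  end
end

now = 0
calls = 0
app = lambda do |_env|
  calls += 1
  [200, { 'Content-Type' => 'text/plain', 'Rack-Cache-Response-For' => 600 }, ["settings v#{calls}"]]
end
cache = SketchCache.new(app, FakeRedis.new(-> { now }))
env = { 'ORIGINAL_FULLPATH' => '/settings?app=example' } # hypothetical path

first  = cache.call(env) # miss: runs the app and stores the response
second = cache.call(env) # hit: served from the store, app not called
now = 601                # move the clock past the 600-second lifetime
third  = cache.call(env) # expired: falls through to the app again
```

Note that the internal header is deleted before the headers are stored or returned, so it is not part of the response that leaves the middleware.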
Implementing our DSL
To tie it all together, the ApplicationController needs a simple implementation of cache_response_for that sets the header appropriately:
    def cache_response_for(duration)
      headers['Rack-Cache-Response-For'] = duration
    end
Boom. It was really that easy.
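One detail worth spelling out is why actions that never call cache_response_for are left alone: the middleware reads the header with `delete(...).to_i`, and `nil.to_i` is 0, which fails the `> 0` check. The sketch below shows this in plain Ruby. `FakeController` and `cache_duration` are hypothetical names for illustration only, and the duration is given as 600 because `10.minutes` (600 seconds) requires ActiveSupport.

```ruby
# Hypothetical stand-in for a controller: it only exposes a `headers` hash.
class FakeController
  attr_reader :headers

  def initialize
    @headers = {}
  end

  # Same body as the helper above.
  def cache_response_for(duration)
    headers['Rack-Cache-Response-For'] = duration
  end
end

# Mirrors the middleware's header handling: remove the internal header and
# read it as a number of seconds.
def cache_duration(headers)
  headers.delete('Rack-Cache-Response-For').to_i
end

cached = FakeController.new
cached.cache_response_for(600) # stands in for 10.minutes
uncached = FakeController.new  # never calls cache_response_for

cached_seconds = cache_duration(cached.headers)     # 600, so the response is cached
uncached_seconds = cache_duration(uncached.headers) # nil.to_i is 0, so it is not
```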
Impact?
This implementation took us only about an hour to develop and deploy, and the effects were immediate. Only 4% of these requests now fall through to Rails, CPU usage on our API servers has plummeted, and total queries to our MongoDB cluster are down 78%. An hour well spent. Our Redis cluster also doesn’t sweat its increased responsibility: its CPU usage is up only marginally!
Join Us!
Interested in working on these and other high-scale challenges? We’re hiring! Give us a shout at jobs@crashlytics.com. You can stay up to date with all our progress on Twitter and Facebook.
