When you have peaks of 600 requests/second, memcached dropping an item because its TTL has expired has some pretty negative effects. At almost the same time, 200 threads/processes find the cache empty and fire off a DB request to fill it up again.

What is the best practice to deal with these situations?

P.S. What is the term for this situation? (Knowing it would give me a chance to get better Google results on the topic.)

+3  A: 

If you have memcached objects which will be needed on a large number of requests (which you imply is the case), then I would look into having a separate process or cron job that regularly recalculates and refreshes these objects. That way they should never hit their TTL. It's a common trade-off: you add a little unnecessary load during low-traffic times to help reduce the load during peaks (the time you probably care the most about).
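A minimal sketch of the refresher idea, assuming a single hot key. The `cache` dict stands in for a real memcached client, and `load_report_from_db`, `refresh_forever` and `start_refresher` are hypothetical names made up for illustration:

```python
import threading
import time

# Stand-in for a memcached client: a plain dict (assumption for this sketch).
cache = {}

def load_report_from_db():
    # Hypothetical placeholder for the expensive DB query.
    return {"generated_at": time.time()}

def refresh_forever(key, compute, interval, stop_event):
    """Recompute `key` every `interval` seconds until `stop_event` is set."""
    while not stop_event.is_set():
        cache[key] = compute()
        # Event.wait() doubles as an interruptible sleep.
        stop_event.wait(interval)

def start_refresher(key, compute, interval):
    """Start the refresher in a daemon thread; return the thread and its stop event."""
    stop_event = threading.Event()
    worker = threading.Thread(
        target=refresh_forever,
        args=(key, compute, interval, stop_event),
        daemon=True,
    )
    worker.start()
    return worker, stop_event
```

You would call something like `start_refresher("report", load_report_from_db, 30)` once at startup, or run the loop body from cron instead. With a real memcached you would store the item with no expiry, or with a TTL comfortably longer than the refresh interval, so request handlers never see a miss for that key.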

I found out this is referred to as "stampeding herd" by the memcached folks, and they discuss it here: http://code.google.com/p/memcached/wiki/NewProgrammingTricks#Avoiding_stampeding_herd

My next suggestion was actually going to be using soft cache limits as discussed in the link above.
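Roughly, a soft limit could look like the sketch below. This is my own reading of the idea, not code from the linked page: the value is stored together with a "soft" expiry that is shorter than the real TTL, and once the soft expiry has passed, only the caller that wins an `add`-based lock recomputes while everyone else keeps serving the stale value. `FakeCache` is an in-memory stand-in (not a real client library) that mimics memcached's `get`/`set`/`add`/`delete` semantics, and `get_with_soft_ttl` is a hypothetical helper name:

```python
import time

class FakeCache:
    """In-memory stand-in for a memcached client (assumption, not a real library)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.time() >= expires_at:
            del self._data[key]  # hard TTL reached: behave like a miss
            return None
        return value

    def set(self, key, value, ttl):
        self._data[key] = (value, time.time() + ttl)

    def add(self, key, value, ttl):
        # Like memcached's add: store only if the key is not already present.
        if self.get(key) is not None:
            return False
        self.set(key, value, ttl)
        return True

    def delete(self, key):
        self._data.pop(key, None)

def get_with_soft_ttl(cache, key, compute, soft_ttl, hard_ttl, lock_ttl=10):
    entry = cache.get(key)
    if entry is not None:
        value, soft_expires_at = entry
        if time.time() < soft_expires_at:
            return value  # still fresh
        # Soft limit passed: only the caller that wins the lock refreshes.
        if not cache.add(key + ":lock", 1, lock_ttl):
            return value  # someone else is refreshing; serve the stale value
    # Either a true miss, or we hold the refresh lock.
    value = compute()
    cache.set(key, (value, time.time() + soft_ttl), hard_ttl)
    cache.delete(key + ":lock")
    return value
```

Note that this only protects against expiry of a warm key. On a completely cold key every caller still recomputes, so you would combine it with a warm-up step or take the same lock on a miss as well.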

integer
The situation is not always as bleak as I sketched it. Some less heavily used cached items trigger fewer DB requests when they expire, but the same effect (although on a smaller scale) still occurs. It would be a nice option if memcached could, for instance, give one request a head start of a few seconds before the other requests get the same cache miss. (Although that would probably cause some other issues.)
Toad
In response to your comment, I was going to suggest soft cache limits, but while searching for the name for that (it had slipped my mind) I found the memcached people talking about your problem and outlining several suggestions, so I updated the answer with that.
integer
Good to read there's an actual term for it. +1
Toad
@integer: Thank you for your informative answer. I learned a lot from that. +1.
Carl Smotricz
A: 

If your object is expiring because you've set an expiry time and that time has passed, there is nothing you can do but increase the expiry time.

If you are worried about stale data, a few techniques exist you could consider:

  • Consider making the cache the authoritative source for whatever data you are looking at, and make a thread whose job is to keep it fresh. This will make the other threads block on refilling the cache, so it may only make sense if you can

  • Rather than setting a TTL on the data, change whatever process updates the data to update the cache as well. One technique I use for frequently changing data is to do this probabilistically -- on 10% of writes, the cache entry is updated too. You can tune this to whatever is sensible, depending on how expensive the DB query is and how severe the impact of stale data is.
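As a rough sketch of the probabilistic variant in the second bullet: here both the database and the cache are plain dicts, and `write_record`, `refresh_probability` and `rng` are hypothetical names chosen for this example. In real code the two assignments would be your DB write and a memcached set.

```python
import random

def write_record(db, cache, key, value, refresh_probability=0.1, rng=random.random):
    """Write to the DB every time; refresh the cache on a fraction of writes."""
    db[key] = value  # the authoritative write always happens
    if rng() < refresh_probability:
        cache[key] = value  # cache is refreshed on roughly 10% of writes by default
        return True
    return False  # cache left as it was, so it may now be stale
```

The `rng` parameter is only there so the random draw can be replaced in tests. A higher `refresh_probability` means fresher cached data at the cost of more cache writes.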

owenmarshall