Hi,

Memcached provides a cache expiration time option, which specifies how long objects are retained in the cache. Assuming all writes go through the cache, I fail to understand why one would ever want to remove an object from it. In other words, if all write operations update the cache before the DB, then the cache can never contain a stale object, so why remove anything?

One possible argument is that the cache will grow indefinitely if objects are never removed, but memcached allows you to specify a maximum size. Once this size is reached, memcached uses a least-recently-used (LRU) algorithm to determine which items to remove. To summarise, if a sensible maximum size has been configured, and all writes are through the cache, why would you want to expire objects after a certain length of time?
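To make the setup in the question concrete, here is a minimal sketch of a write-through cache with an LRU size cap. This is a toy in-process stand-in (a plain dict plays the DB), not the real memcached API:

```python
from collections import OrderedDict

class WriteThroughLRU:
    """Toy write-through cache with an LRU size cap -- an illustrative
    stand-in for the memcached setup described above, not a real client."""

    def __init__(self, db, max_items):
        self.db = db                  # backing store standing in for the DB
        self.max_items = max_items
        self.cache = OrderedDict()    # insertion order tracks recency

    def _cache_put(self, key, value):
        self.cache[key] = value
        self.cache.move_to_end(key)
        while len(self.cache) > self.max_items:
            self.cache.popitem(last=False)   # evict least recently used

    def set(self, key, value):
        # Write-through: update the cache and the DB together,
        # so the cache can never hold a stale value.
        self._cache_put(key, value)
        self.db[key] = value

    def get(self, key):
        if key in self.cache:
            self.cache.move_to_end(key)      # hit: mark as recently used
            return self.cache[key]
        value = self.db[key]                 # miss: fall back to the DB
        self._cache_put(key, value)
        return value
```

With this in place, the cache is never stale and never exceeds its cap, which is exactly why the question asks what a TTL adds on top.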

Thanks, Don

A: 

I'd say it is about the distinction between 'Least Recently Used' and 'Not Gonna Be Used Anymore': if you can indicate explicitly which objects can be taken out of the cache, that leaves more room for objects that may still be used later.

jerryjvl
I've read claims that sometimes unexpired keys may actually be removed from the cache before expired keys -- this being a sacrifice made to keep the LRU cache purge algorithm efficient. In other words, there *isn't* a distinction between 'Least Recently Used' and 'Not Gonna Be Used Anymore'. Expired keys aren't actually removed when they expire -- but they *will* be purged the next time a get request comes in for them. So, in short, setting expirations doesn't actually necessarily help "leave more room" for unexpired keys.
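The lazy-purge behaviour described above can be sketched in a few lines. This is an illustrative toy (a dict of `(value, expires_at)` tuples), not how memcached is actually implemented internally:

```python
import time

class LazyExpiryCache:
    """Illustrative sketch of lazy expiration: expired entries are not
    reaped eagerly; they keep occupying memory until a get() touches them
    (roughly the behaviour the answer above describes for memcached)."""

    def __init__(self):
        self.store = {}  # key -> (value, expires_at)

    def set(self, key, value, ttl):
        self.store[key] = (value, time.monotonic() + ttl)

    def get(self, key):
        entry = self.store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self.store[key]   # purged only now, on access
            return None
        return value
```

Note that an expired key that is never read again only disappears when the eviction machinery gets to it, which is the efficiency trade-off mentioned above.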
Frank Farmer
+4  A: 

Expiration times are useful when you don't need precise information, you just want it to be accurate to within a certain time. So you cache your data for (say) five minutes. When the data is needed, check the cache. If it's there, use it. If not (because it expired), then go and compute the value anew.
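The check-cache-then-recompute pattern this answer describes can be sketched as follows. The helper name and the dict-based cache are illustrative assumptions, standing in for real memcached calls:

```python
import time

_cache = {}  # key -> (value, expires_at); toy stand-in for memcached

def cached(key, ttl, compute):
    """Cache-aside with a TTL: return the cached value while it is still
    fresh, otherwise recompute and re-cache it. A sketch of the pattern
    described above, not a real memcached client call."""
    entry = _cache.get(key)
    if entry is not None and time.monotonic() < entry[1]:
        return entry[0]                        # fresh enough: use it
    value = compute()                          # expired or missing: compute anew
    _cache[key] = (value, time.monotonic() + ttl)
    return value
```

For example, `cached('report', 300, build_report)` would serve a summary that is at most five minutes stale without recomputing it on every request.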

Some cached values are based on a large set of data, and invalidating the cache or writing new values to it is impractical. This is often true of summary data, or data computed from a large set of original data.

Ned Batchelder
+1  A: 

One case would be where a value is only valid for a certain period of time.

objects
+1  A: 

Some data in the cache is expensive to create but small (so it should last a long time), and some is large but relatively cheap (so it can last a shorter time).

Also, for most applications it is hard to make memcached work as a write-through cache. It is difficult to properly invalidate all cached entries, especially those of rendered pages; most implementations will miss a few.

Tom Leys
+1  A: 

I was curious about this myself, when I first started working with memcached. We asked friends who worked at hi5 and facebook (both heavy users of memcached).

They both said that they generally use something like a three-hour default expiry time as a sort of "just in case".

  1. For most objects, it's not that expensive to rebuild them every 3 hours
  2. On the off chance you've got some bug that causes things to stay cached that shouldn't otherwise, this can keep you from getting into too much trouble

So I guess the answer to the question "Why?" is really, "Why not?". It doesn't cost you much to have an expiration in there, and it will probably only help ensure you're not keeping stale data in the cache.

Frank Farmer
A: 

Hi!

Can you make memcached "reload" the expiration time automatically when object is loaded from the cache? For example, as long as the information is being actively retrieved from the cache, memcache would automatically reset the expiration time so that the object would stay cached as long as the retrieval interval is under 10mins.
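Memcached won't slide the expiry on its own, but its protocol does include a `touch` command that resets a key's TTL, so a client can approximate this. Below is a self-contained sketch of the idea, using a plain dict rather than a real memcached connection (the class and its behaviour are illustrative assumptions):

```python
import time

class SlidingExpiryCache:
    """Sketch of client-side sliding expiration: every successful get()
    resets the entry's TTL, so actively read keys stay cached.
    (Against real memcached you'd approximate this by issuing a `touch`
    after each hit; this toy version just stores tuples in a dict.)"""

    def __init__(self, ttl):
        self.ttl = ttl
        self.store = {}  # key -> (value, expires_at)

    def set(self, key, value):
        self.store[key] = (value, time.monotonic() + self.ttl)

    def get(self, key):
        entry = self.store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self.store[key]
            return None
        # Hit: slide the expiration forward, as touch would.
        self.store[key] = (value, time.monotonic() + self.ttl)
        return value
```

With a 10-minute TTL, a key read at least once every 10 minutes would never expire, matching the behaviour asked about.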

Tomi
A: 

If your design calls for a write-through cache, you still have the issue of coming up against the memory limit allocated to memcached, which is where LRU comes into play.

LRU has two rules when determining what to kick out, and does so in the following order:

  1. Expired slabs
  2. Oldest unused slab

Providing different expiration times for different groups of objects can help keep less frequently accessed data that is expensive to rebuild in memory, while letting more frequently used but easy-to-recreate slabs expire sooner.
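The two-rule eviction order above can be sketched as a victim-selection function. This is a simplified illustration of the policy as described in this answer, not memcached's actual slab allocator:

```python
import time

def pick_eviction_victim(store):
    """Sketch of the two-rule eviction order described above: prefer an
    already-expired entry; otherwise fall back to the least recently used.
    `store` maps key -> (expires_at, last_used), both monotonic timestamps."""
    now = time.monotonic()
    expired = [k for k, (expires_at, _) in store.items() if expires_at <= now]
    if expired:
        return expired[0]                         # rule 1: expired entries first
    return min(store, key=lambda k: store[k][1])  # rule 2: oldest unused entry
```

This is why tuning per-group TTLs matters: an item you mark as expendable via a short TTL becomes a preferred victim before any unexpired, expensive-to-rebuild item is considered.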

It is also the case that many cache keys wind up being aggregates of other objects. Unless you maintain a lookup hash for those objects, it is much easier to let the aggregates expire after a few hours than to proactively update every associated key, and letting them expire also preserves the hit/miss ratio you are aiming for by using memcached in the first place.

David O.