Some questions about memcached

I would like to implement memcached on my social network site. Being a social network, most data changes very frequently. For example, if I were to store a user's 10,000 friends in the cache, the cache would need to update any time he adds a friend. That is easy enough, but it would also need to update any time someone else adds him as a friend. That's a lot of updating for the friend list alone.

There are also user blogs and bulletins, which are posted non-stop, and you can only see the ones created by users in your friend list, so I think this would be very hard to cache.

So, a question now: I could see caching some profile info that only changes when a user updates their profile, but this would create a cache record for every user. If there are 100,000+ users, that's a lot of caching. Is this feasible, or is it not a good idea?

Also, are there any tutorials on memcached you recommend?

+2  A: 

Memcached has a really special architecture. To expire cached data, the best way is to store it using an "indexed" key.
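One way to read the "indexed" key idea is as a version number embedded in the key: bumping the number makes every old entry unreachable without deleting anything. This is an interpretation, not something spelled out above. The sketch below assumes a hypothetical in-memory `ArrayCache` stand-in (so it runs without a server) and made-up key names such as `friends_version_{id}`.

```php
<?php
// Hypothetical in-memory stand-in for a memcached client. Only the get/set
// method names mirror the PHP Memcached extension.
class ArrayCache {
    private $data = array();
    public function get($key) {
        return array_key_exists($key, $this->data) ? $this->data[$key] : false;
    }
    public function set($key, $value) {
        $this->data[$key] = $value;
        return true;
    }
}

// Build a key that embeds the user's current version number (the "index").
function friends_key($cache, $user_id) {
    $version = $cache->get("friends_version_{$user_id}");
    if ($version === false) {
        $version = 1;
        $cache->set("friends_version_{$user_id}", $version);
    }
    return "friends_{$user_id}_v{$version}";
}

// "Expire" the cached list by bumping the version. Old entries are simply
// never read again. With a real client, Memcached::increment would make
// this bump atomic.
function expire_friends($cache, $user_id) {
    $version = $cache->get("friends_version_{$user_id}");
    $cache->set("friends_version_{$user_id}", ($version === false ? 1 : $version) + 1);
}

$cache = new ArrayCache();
$cache->set(friends_key($cache, 42), array(7, 9));
expire_friends($cache, 42);
var_dump($cache->get(friends_key($cache, 42))); // bool(false): old list is unreachable
```

The cost is one extra lookup per read to fetch the version number.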

I recommend watching Gregg Pollack's Memcached screencast. It focuses on using Memcached with a Rails project, but it also provides a comprehensive overview of Memcached best practices.

Simone Carletti
+1  A: 

I would say that it is a good idea to cache where possible. Most of the time you will be able to pull items from memcached faster than from a traditional RDBMS (especially if you have complex joins and such). I currently employ such a strategy with great success, and here is what I have learned from the experience:

  1. If possible, cache indefinitely, and write a new value when a change is made. Try not to do an explicit delete, as that can cause a race condition when multiple concurrent requests miss the cache and all try to update it. Also implement locking for the case where an item does not exist in the cache, to prevent the same issue (using memcached "add" plus a short sleep in a loop).

  2. Refresh the cache in the background if possible, using a queue. My implementation currently uses multi-threaded Perl processes running in the background plus beanstalkd, which prevents lag time on the frontend. Most of the time, changes can tolerate a short lag.

  3. Use memcached getMulti if possible; many separate memcached calls really add up.

  4. Tier your cache: when checking for an item, check a local array first, then memcached, then the db. Cache the result in the local array after first access to prevent hitting memcached multiple times in a script execution for the same item. EDIT: to clarify, if using a scripting language such as PHP, the local array would live only as long as the current script execution :) An example:

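    class Itemcache {
        private $cached_items = array();   // tier 1: lives only for this script execution
        private $memcachedobj;             // tier 2: memcached client

        public function __construct($memcachedobj){
            $this->memcachedobj = $memcachedobj;
        }

        public function getitem($memcache_key){
            if(isset($this->cached_items[$memcache_key])){
                // already fetched during this script execution
                return $this->cached_items[$memcache_key];
            }elseif(($result = $this->memcachedobj->get($memcache_key)) !== false){
                // found in memcached (strict check, so cached 0 or '' still counts as a hit)
                $this->cached_items[$memcache_key] = $result;
                return $result;
            }else{
                // db query here as $dbresult
                $this->memcachedobj->set($memcache_key,$dbresult,0);
                $this->cached_items[$memcache_key] = $dbresult;
                return $dbresult;
            }
        }
    }

  5. Write a wrapper function that implements caching strategy #4 above.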

  6. Use a consistent key structure in memcached, e.g. 'userinfo_{user.pk}', where user.pk is the primary key of the user in the RDBMS.

  7. If your data requires post-processing, do this processing BEFORE placing it in the cache where possible; this will save a few cycles on every hit of that data.
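Tips 1 and 6 can be sketched together roughly as follows. This is a minimal illustration, not the implementation described above: `ArrayCache` is a hypothetical in-memory stand-in so the code runs without a server (its `add` ignores the expiration argument), and `get_or_rebuild`, `userinfo_key`, the `lock_` prefix, and the 10 ms sleep are all assumptions.

```php
<?php
// Hypothetical in-memory stand-in; method names mirror the PHP Memcached extension.
class ArrayCache {
    private $data = array();
    public function get($key) {
        return array_key_exists($key, $this->data) ? $this->data[$key] : false;
    }
    public function set($key, $value) {
        $this->data[$key] = $value;
        return true;
    }
    // Like memcached "add": succeeds only when the key does not exist yet.
    public function add($key, $value, $expiration = 0) {
        if (array_key_exists($key, $this->data)) {
            return false;
        }
        $this->data[$key] = $value;
        return true;
    }
    public function delete($key) {
        unset($this->data[$key]);
        return true;
    }
}

// Tip 6: build keys in one place so the structure stays consistent.
function userinfo_key($user_pk) {
    return "userinfo_{$user_pk}";
}

// Tip 1: on a miss, only the caller that wins the "add" rebuilds the value;
// everyone else sleeps briefly and checks the cache again.
function get_or_rebuild($cache, $key, $rebuild, $max_tries = 50) {
    for ($i = 0; $i < $max_tries; $i++) {
        $value = $cache->get($key);
        if ($value !== false) {
            return $value;
        }
        // A short expiration keeps a crashed worker from holding the lock forever.
        if ($cache->add("lock_{$key}", 1, 5)) {
            $value = $rebuild();           // e.g. the expensive db query
            $cache->set($key, $value);
            $cache->delete("lock_{$key}");
            return $value;
        }
        usleep(10000); // lock is held by someone else: wait 10 ms and retry
    }
    return $rebuild(); // gave up waiting: fall back to the source
}

$cache = new ArrayCache();
$calls = 0;
$load = function () use (&$calls) { $calls++; return array('name' => 'alice'); };
$first  = get_or_rebuild($cache, userinfo_key(42), $load);
$second = get_or_rebuild($cache, userinfo_key(42), $load);
var_dump($calls); // int(1): the second call was served from the cache
```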

Jason
"cache result in the local array after first access to prevent hitting memcached multiple times in a script execution for the same item." What do you mean by this? Do you mean saving an array in something like a session? Also, I am wondering how I could cache a friend list for users. The problem I see is that a user adds friends often, and it also works the other way around, where other users add that person as a friend, so the cache would need to be updated every time a friend is added, which is very often. Also, if there are 50,000 users, is it OK to cache a friend list for all 50,000 users? (continued in next comment)
jasondavis
Some lists could have 20,000 friend IDs?
jasondavis
By array, I mean locally. PHP example: you have a class var called $array_cache. After grabbing the result from memcached the first time, save your result in the array with $this->array_cache[$memcache_key] = $memcache_value; then on the next method call, if(isset($this->array_cache[$memcache_key])) return $this->array_cache[$memcache_key]; so that during the CURRENT script execution it will not need to call memcached again for the same result (if you end up calling for the same item twice or more). The next call of the script would of course call memcached again. (continued in next comment)
Jason
As far as the 50k users are concerned, you may want to benchmark db queries vs. storing it all in memcached, as the resulting array could be rather huge (and could be a substantial drag on the network altogether). I would see network transfer time being more of an issue than looking up an item in the array, as you could set your array keys to the user IDs and use array_key_exists/isset if using PHP. (continued in next comment)
Jason
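The lookup part of that comment might look roughly like this. It is a small sketch with made-up function names (`index_friend_ids`, `is_friend`): the friend list is stored as an array keyed by user ID, so a membership check is a single isset() rather than a scan of the whole list.

```php
<?php
// Turn a plain list of friend IDs into an array keyed by ID.
function index_friend_ids(array $friend_ids) {
    return array_fill_keys($friend_ids, true);
}

// Membership check by key instead of scanning the list with in_array().
function is_friend(array $friend_index, $user_id) {
    return isset($friend_index[$user_id]);
}

$index = index_friend_ids(array(7, 9, 20000));
var_dump(is_friend($index, 9)); // bool(true)
var_dump(is_friend($index, 8)); // bool(false)
```

This only addresses lookup cost; the network transfer time for a very large list is unchanged, which is why benchmarking still matters.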
The method I use for friending users is to kick off a refresh of both my and my friend's friends lists in the background at the same time, each doing a db query for friends and storing the result in the cache. Friends lists on our network are a bit smaller than 50k users, and it works well for our needs. Like I said above, benchmark. BTW, I really hate how comments are formatted here :(
Jason
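A rough sketch of that two-sided refresh, under loud assumptions: `ArrayCache` is a hypothetical in-memory stand-in for memcached, `$friends_db` is a plain array standing in for the database, and the refreshes run inline here, whereas the setup described above would hand them to a background queue.

```php
<?php
// Hypothetical in-memory stand-in; get/set mirror the PHP Memcached extension.
class ArrayCache {
    private $data = array();
    public function get($key) {
        return array_key_exists($key, $this->data) ? $this->data[$key] : false;
    }
    public function set($key, $value) {
        $this->data[$key] = $value;
        return true;
    }
}

// Re-read one user's friends from the "db" and overwrite the cached list
// (a new value is written; nothing is deleted).
function refresh_friends($cache, array $db, $user_id) {
    $cache->set("friends_{$user_id}", $db[$user_id]);
}

// Store the friendship, then refresh BOTH users' cached lists.
function add_friendship($cache, array &$db, $a, $b) {
    $db[$a][] = $b;
    $db[$b][] = $a;
    foreach (array($a, $b) as $user_id) {
        // In production this call would be queued as a background job.
        refresh_friends($cache, $db, $user_id);
    }
}

// Stand-in for the database: user id => list of friend ids.
$friends_db = array(1 => array(2), 2 => array(1), 3 => array());
$cache = new ArrayCache();
add_friendship($cache, $friends_db, 1, 3);
var_dump($cache->get('friends_1') === array(2, 3)); // bool(true)
var_dump($cache->get('friends_3') === array(1));    // bool(true)
```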