I'm building an application that involves multiple servers: four servers, each with a database and a web server (one master database and three slaves), plus a load balancer.

There are several approaches to caching. Right now it's fairly simple and not efficient at all: all the caching is done on an NFS partition shared between all the servers, and NFS is the bottleneck in the architecture.

  1. Caching can be done at the server level (on the local file system), but the problem is invalidating a cache file on every server when the content has been updated. It can be done with a short cache lifetime, but that is inefficient, because most of the time the cache will be refreshed sooner than it needs to be.
  2. It can also be done with a messaging system (XMPP, for example) where the servers communicate with each other: the server responsible for the invalidation sends a request to all the others to let them know the cache has been invalidated. Latency is probably higher (it takes more time for everybody to learn that the cache has been invalidated), but my application doesn't require atomic cache invalidation.
  3. A third approach is to use a cloud/document store for the cache (like CouchDB), but I have no idea about its performance. Is it faster than using a SQL database?
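To make the second idea concrete, here is a minimal sketch of local caching with message-based invalidation. The transport is modeled as a simple in-process peer list standing in for XMPP, and all class and method names are hypothetical, not part of any framework:

```python
# Sketch of approach 2: each server keeps a local cache and broadcasts
# invalidation messages to its peers. An in-process peer list stands in
# for the XMPP transport.

class CachingServer:
    def __init__(self):
        self.cache = {}   # local file-system cache, modeled as a dict
        self.peers = []   # other servers to notify on invalidation

    def get(self, key, loader):
        # Serve from the local cache, falling back to the loader (e.g. the DB).
        if key not in self.cache:
            self.cache[key] = loader(key)
        return self.cache[key]

    def invalidate(self, key):
        # Drop the entry locally, then tell every peer to do the same.
        self.cache.pop(key, None)
        for peer in self.peers:
            peer.on_invalidate(key)   # over XMPP in the real setup

    def on_invalidate(self, key):
        # Handle an invalidation message from another server.
        self.cache.pop(key, None)
```

Since the broadcast is fire-and-forget, peers may briefly serve stale data, which matches the "non-atomic invalidation is fine" requirement above.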

I plan to use Zend Framework, but I don't think that's really relevant (except that packages probably exist in other frameworks to deal with XMPP, CouchDB, etc.).

Requirements: a persistent cache (if a server restarts, the cache shouldn't be lost, to avoid bringing down the database while the cache is rebuilt).

+4  A: 

http://www.danga.com/memcached/

Memcached covers most of the requirements you lay out - message-based read, commit and invalidation. High availability and high speed, but very little atomic reliability (sacrificed for performance).

(Also, memcached powers things like YouTube, Wikipedia, Facebook, so I think it can be fairly well-established that organizations with the time, money and talent to seriously evaluate many distributed caching options settle with memcached!)

Edit (in response to comment) The idea of a cache is for it to be relatively transitory compared to your backing store. If you need to persist the cache data long-term, I recommend looking at either (a) denormalizing your data tier to get more performance, or (b) adding a middle-tier database server that stores high-volume data in straight key-value-pair tables, or something closely approximating that.
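A minimal sketch of option (b), the straight key-value-pair table, using an in-memory SQLite database as a stand-in for a middle-tier database server (the table and column names are made up for illustration):

```python
# Sketch of a key-value-pair cache table. SQLite in memory stands in for
# a middle-tier database server; any SQL store with a primary key works
# the same way.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE kv_cache (k TEXT PRIMARY KEY, v TEXT)")

def kv_put(key, value):
    # INSERT OR REPLACE keeps the table a flat key-value store.
    conn.execute("INSERT OR REPLACE INTO kv_cache (k, v) VALUES (?, ?)",
                 (key, value))

def kv_get(key):
    row = conn.execute("SELECT v FROM kv_cache WHERE k = ?",
                       (key,)).fetchone()
    return row[0] if row else None
```

Because the data lives in a real database rather than in RAM, it survives a restart, which is the persistence requirement from the question.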

Rex M
In fact our cache will be pretty big (several gigs). We are already using memcached, but only for frequently used values (username from user_id, ...). I'm looking for a caching system that will still be there after a restart; otherwise it will kill the database on restart.
stunti
Memcached can of course handle many GB of cache data, but you should add to the original question that you need the cache to be persisted. That's a pretty important requirement! See edits.
Rex M
A: 

I think I found a relatively good solution.

I use Zend_Cache to store each cache file locally. I've created a small daemon, based on nanoserver, which also manages the cache files locally. When one server creates/modifies/deletes a cache file locally, it sends the same action to all the other servers through the daemon, which then performs the same action.

That means I have local cache files and remote actions at the same time. Probably not perfect, but it should work for now. CouchDB was too slow and NFS is not reliable enough.
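Roughly, the daemon's replication logic looks like this sketch, where directories stand in for each server's local cache store and the network hop through the daemon is elided (all names here are hypothetical):

```python
# Sketch of the daemon's replication: an action performed on one node's
# cache directory is re-applied on every other node. Directories model
# each server's local cache; the network transport is elided.
import os
import tempfile

class CacheNode:
    def __init__(self, root):
        self.root = root   # this node's local cache directory
        self.peers = []    # the other nodes reachable via the daemon

    def _path(self, name):
        return os.path.join(self.root, name)

    def apply(self, action, name, data=None):
        # Perform the action on the local file system only.
        if action in ("create", "modify"):
            with open(self._path(name), "w") as f:
                f.write(data)
        elif action == "delete":
            try:
                os.remove(self._path(name))
            except FileNotFoundError:
                pass

    def perform(self, action, name, data=None):
        # Perform locally, then ship the identical action to every peer,
        # which is what the daemon does over the network.
        self.apply(action, name, data)
        for peer in self.peers:
            peer.apply(action, name, data)
```

Reads always hit the local disk; only writes and deletes fan out, so the NFS bottleneck disappears at the cost of a short replication window.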

stunti
+1  A: 

In defence of memcached as a cache store: if you want high performance with a low impact from a server reboot, why not just have 4 memcached servers? Or 8? Each 'reboot' would have a correspondingly smaller effect on the database server.
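A sketch of why adding nodes softens a reboot: keys are hashed across the pool, so restarting one of N servers empties only roughly 1/N of the cache. Simple modulo hashing is shown for clarity; real memcached clients typically use consistent hashing so that fewer keys remap when the pool changes:

```python
# Sketch: distribute cache keys across a pool of memcached servers by
# hashing. Losing one server only loses the keys that hashed to it.
import hashlib

def pick_server(key, servers):
    # Deterministically map a key to one server in the pool.
    h = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return servers[h % len(servers)]
```

With 4 servers, a reboot of one of them sends only about a quarter of lookups back to the database while that node's cache refills.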