views:

583

answers:

3

I have a rather small (ca. 4.5k pageviews a day) website running on Django, with PostgreSQL 8.3 as the db.

I am using the database as both the cache and the session backend. I've heard a lot of good things about using Memcached for this purpose, and I would definitely like to give it a try. However, I would like to know exactly what the benefits of such a change would be: I imagine that my site may simply not be big enough for a better cache backend to make a difference. The point is that I wouldn't be the one installing and configuring memcached, and I don't want to waste somebody else's time for little or no gain.
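For reference, the switch itself would just be a settings change; here is a rough sketch of the current setup versus the candidate one (the cache table name and memcached address are made up):

    # settings.py: current setup vs. the candidate one (a sketch, not my actual file)

    # Current: database-backed cache and sessions (cache table created
    # with manage.py createcachetable cache_table).
    CACHE_BACKEND = 'db://cache_table'
    SESSION_ENGINE = 'django.contrib.sessions.backends.db'

    # Candidate: memcached for the cache, sessions kept in the cache
    # (depending on the Django version there is also a 'cached_db'
    # write-through variant that keeps the database as a backup).
    #CACHE_BACKEND = 'memcached://127.0.0.1:11211/'
    #SESSION_ENGINE = 'django.contrib.sessions.backends.cache'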

How can I measure the overhead introduced by using the db as the cache backend? I've looked at django-debug-toolbar, but if I understand correctly it isn't something you'd want to run on a production site (you have to set DEBUG=True for it to work). Unfortunately, I can't quite reproduce the production setting on my laptop (I have a different OS, a different CPU and a lot more RAM).

Has anyone benchmarked different Django cache/session backends? Does anybody know what the performance difference would be if I were doing, for example, one session write on every request?
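In case it makes the question more concrete, this is the kind of rough measurement I have in mind. It is only a sketch (the key count and payload size are arbitrary guesses), and it would have to be run on the production-like machine, e.g. pasted into manage.py shell, once with the db backend and once with memcached configured:

    import time
    from django.core.cache import cache

    N = 1000
    payload = 'x' * 2048  # roughly a pickled session's worth of data (a guess)

    # Time N cache writes against whatever backend is configured.
    start = time.time()
    for i in range(N):
        cache.set('bench:%d' % i, payload, 300)
    write_time = time.time() - start

    # Time N cache reads of the same keys.
    start = time.time()
    for i in range(N):
        cache.get('bench:%d' % i)
    read_time = time.time() - start

    print('avg write: %.3f ms, avg read: %.3f ms' % (
        write_time * 1000.0 / N, read_time * 1000.0 / N))

The difference between the two runs, multiplied by the number of cache/session operations per request, would be roughly the per-request overhead I am asking about.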

+1  A: 

Short answer: if you have enough RAM, memcached will always be faster. You can't really benchmark memcached vs. the database cache; just keep in mind that the big bottleneck with servers is disk access, especially write access.

Anyway, a disk-based cache is better if you have many objects to cache and long expiration times. But in that situation, if you want really high performance, it is better to generate your pages statically with a Python script and serve them with lighttpd or nginx.
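For example, something along these lines (just a sketch; the template names, contexts and output directory are made up, and it needs DJANGO_SETTINGS_MODULE set, e.g. from a cron job):

    import codecs
    import os
    from django.template.loader import render_to_string

    OUTPUT_DIR = '/var/www/static_pages'  # served directly by lighttpd/nginx

    PAGES = [
        # (output file, template, context)
        ('index.html', 'index.html', {}),
        ('about.html', 'about.html', {}),
    ]

    # Render each page once and write it to disk for the web server to pick up.
    for filename, template, context in PAGES:
        html = render_to_string(template, context)
        out = codecs.open(os.path.join(OUTPUT_DIR, filename), 'w', 'utf-8')
        try:
            out.write(html)
        finally:
            out.close()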

For memcached, you can adjust the amount of RAM dedicated to the server.

fredz
I *know* that using memcached is going to be faster than the db. The question is whether it is going to be *enough* faster to justify changing the configuration.
Ryszard Szopa
+3  A: 

At my previous job we tried to measure the impact of caching on a site we were developing. On the same machine we load-tested a set of 10 pages most commonly used as start pages (object listings), plus some object detail pages taken randomly from a pool of ~200,000. The difference was roughly 150 requests/second versus 30,000 requests/second, and the database queries dropped to 1-2 per page.

What was cached:

  • sessions
  • lists of objects retrieved for each individual page in object listing
  • secondary objects and common content (found on each page)
  • lists of object categories and other categorising properties
  • object counters (calculated offline by cron job)
  • individual objects

In general, we used only low-level, granular caching, not the high-level cache framework. It required very careful design: the cache had to be properly invalidated upon every database state change, such as adding or modifying any object.
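For illustration, that kind of pattern looks roughly like this (the model, keys and timeout are made up, not our actual code):

    from django.core.cache import cache
    from django.db.models.signals import post_save, post_delete

    from catalog.models import Item  # hypothetical model

    LISTING_KEY = 'items:frontpage'

    def get_frontpage_items():
        # Low-level cache lookup with a fallback to the database.
        items = cache.get(LISTING_KEY)
        if items is None:
            items = list(Item.objects.filter(published=True)[:20])
            cache.set(LISTING_KEY, items, 60 * 15)  # cache for 15 minutes
        return items

    def invalidate_item_cache(sender, instance, **kwargs):
        # Any add/change/delete of an Item makes the cached data stale.
        cache.delete(LISTING_KEY)
        cache.delete('item:%d' % instance.pk)

    post_save.connect(invalidate_item_cache, sender=Item)
    post_delete.connect(invalidate_item_cache, sender=Item)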

zgoda
A: 

Just try it out. Use Firebug or a similar tool, and run memcached with a small RAM allocation (e.g. 64 MB) on the test server.

Note the average load times you see in Firebug without memcached, then turn caching on and note the new results. It's as easy as that.
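If you need a quick way to turn caching on for the comparison, Django's per-site cache middleware only takes a few settings. This is just a sketch: the memcached address, timeout and key prefix are assumptions, and the daemon would be started with something like memcached -d -m 64:

    # settings.py (sketch)
    CACHE_BACKEND = 'memcached://127.0.0.1:11211/'
    CACHE_MIDDLEWARE_SECONDS = 300
    CACHE_MIDDLEWARE_KEY_PREFIX = 'mysite'

    MIDDLEWARE_CLASSES = (
        'django.middleware.cache.UpdateCacheMiddleware',     # must come first
        'django.middleware.common.CommonMiddleware',
        # ... your other middleware ...
        'django.middleware.cache.FetchFromCacheMiddleware',  # must come last
    )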

The results usually shock people, because the performance improvement is very noticeable.

DataGreed