views:

376

answers:

8

I have a web site using Apache httpd as the server and MySQL as the backend. It publishes a "thought for the day" that has gotten so popular that the server is crashing due to the number of requests. Since the same page is being requested every time (the thought only changes once a day), is it possible to put a caching server in front of my main server, so that when the same request is made by different clients, the caching server returns the page without having to go to the database?

+2  A: 

Absolutely. There are many products that will work well for this. Apache itself can be configured to function this way although, if you are on Linux or UNIX, Squid is the better option as it is specifically designed to do this very job.

On Windows, Microsoft has always offered cache/proxy products that do this. Currently that is ISA Server 2006, although it is dramatically overkill for this type of application.

Squid is my recommendation.

Scott Alan Miller
+2  A: 

Yes. You are talking about a reverse proxy (or "http accelerator" which is an imprecise term for the same thing). It can be very efficient, and very many high throughput sites use the technique.

The key element to get right is the caching-related HTTP headers. So I strongly recommend reading the HTTP RFC (it can actually be done). If you don't get headers right, you can have little effect, or maybe even security problems (if personalized pages are cached and presented to the wrong people).
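To make the header point concrete, here is a minimal sketch of choosing a Cache-Control value per response. The function name `cache_headers` and the `personalized` flag are made up for illustration; how your application decides that a page is personalized is up to you, and the one-hour lifetime is just an assumed default.

```python
def cache_headers(personalized: bool, max_age_seconds: int = 3600) -> dict[str, str]:
    """Return caching headers for one response (hypothetical helper)."""
    if personalized:
        # "private" tells shared caches (such as a reverse proxy) not to store
        # the response; "no-store" asks every cache not to keep it at all.
        return {"Cache-Control": "private, no-store"}
    # "public" allows shared caches to store the page; "s-maxage" sets the
    # lifetime for shared caches, "max-age" for browsers.
    return {
        "Cache-Control": f"public, max-age={max_age_seconds}, s-maxage={max_age_seconds}"
    }


if __name__ == "__main__":
    print(cache_headers(personalized=False))
    print(cache_headers(personalized=True))
```

The important part is the first branch: anything that differs per user should never be marked as storable by a shared cache.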

Also: You may have to split up your page into parts to get the best caching effect. Example: If you insist on having a clock in a corner of your pages showing the current server time down to the second, then the whole page becomes cacheable for only a second. So 1) drop the stupid clock, or 2) have it be generated by a client-side script - or 3) have the client-side script pull that particular part of the page from a special URL which then only outputs a small, ever-changing, non-cacheable HTML fragment.

I've once used Squid as a reverse proxy for a large web site. Nowadays, if I were to do it again, I'd try out Varnish.

Troels Arvin
A: 

You could also try memcached. That's what my company uses and I think LiveJournal uses it too. It caches DB requests and makes a serious dent in DB access.
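The pattern is usually called cache-aside: look in the cache first, and only hit the DB on a miss. Below is a rough sketch of the idea. Note that `TTLCache` is a plain in-process stand-in I wrote for illustration, not the memcached client API, and `load_from_db` is a hypothetical callable wrapping your query.

```python
import time
from typing import Callable, Optional


class TTLCache:
    """Tiny in-process stand-in for a cache server such as memcached."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._store: dict = {}  # key -> (expiry time, value)

    def get(self, key: str) -> Optional[str]:
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Entry is stale: drop it and report a miss.
            del self._store[key]
            return None
        return value

    def set(self, key: str, value: str, ttl_seconds: float) -> None:
        self._store[key] = (self._clock() + ttl_seconds, value)


def get_thought(cache: TTLCache, load_from_db: Callable[[], str],
                ttl_seconds: float = 300.0) -> str:
    """Cache-aside lookup: query the database only on a cache miss."""
    cached = cache.get("thought_of_the_day")
    if cached is not None:
        return cached
    value = load_from_db()
    cache.set("thought_of_the_day", value, ttl_seconds)
    return value
```

With a real memcached client the shape stays the same: a get, and on a miss a DB query followed by a set with an expiry time.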

Evil Andy
+12  A: 

For slow-changing pages, a cache will definitely reduce CPU usage; but in your extreme case, where the page changes once a day and the change is perfectly predictable, it would be far easier to use a simple and fast static file server (lighttpd, nginx, etc.) and a cron job to change your "thought of the day" every night.

In fact, a lot of non-interactive web pages can be done this way: periodically rebuild html files from database or any other source, and use simple, fast static web servers.
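As a sketch of what the nightly job could look like: render the page to a string, write it to a temporary file, then rename it over the live file so visitors never see a half-written page. The names `render_page` and `publish` and the HTML layout are assumptions for illustration; fetching the thought from the database is left out.

```python
import html
import os
import tempfile
from pathlib import Path


def render_page(thought: str) -> str:
    """Build the complete static page; the markup here is only an example."""
    return (
        "<!DOCTYPE html>\n"
        '<html><head><meta charset="utf-8">'
        "<title>Thought for the day</title></head>\n"
        f"<body><p>{html.escape(thought)}</p></body></html>\n"
    )


def publish(thought: str, target: Path) -> None:
    """Write the page next to `target`, then swap it into place."""
    # The temp file must be in the same directory so the rename stays on one
    # filesystem; os.replace then swaps the file in a single step.
    fd, tmp_name = tempfile.mkstemp(dir=target.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            f.write(render_page(thought))
        # mkstemp creates the file readable by the owner only; open it up so
        # the web server user can read it.
        os.chmod(tmp_name, 0o644)
        os.replace(tmp_name, target)
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
```

A cron entry would then run a small script that loads today's thought and calls `publish` with the path inside your document root.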

Javier
+2  A: 

I would definitely recommend Javier's solution, which is the simplest, most robust, and easiest to maintain. Just don't forget to send the proper Expires header 24 hours into the future and set ETags properly.
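For reference, this is roughly what those two headers amount to. Static file servers normally generate ETags for you, so treat this as an illustration of the mechanism rather than something you need to write; the function names and the hash-based ETag scheme are my own assumptions.

```python
import hashlib
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime
from typing import Optional


def static_page_headers(body: bytes, now: datetime) -> dict[str, str]:
    """Compute Expires (24 hours ahead) and a content-based ETag.

    `now` must be a timezone-aware UTC datetime.
    """
    expires = now + timedelta(hours=24)
    # Quoted string derived from the content: it changes only when the page does.
    etag = '"' + hashlib.sha256(body).hexdigest()[:16] + '"'
    return {
        "Expires": format_datetime(expires, usegmt=True),
        "ETag": etag,
    }


def is_not_modified(if_none_match: Optional[str], etag: str) -> bool:
    """True if the client's If-None-Match header matches, so a 304 can be sent."""
    if if_none_match is None:
        return False

    def strip_weak(tag: str) -> str:
        # If-None-Match uses weak comparison, so ignore a leading "W/".
        return tag[2:] if tag.startswith("W/") else tag

    candidates = [t.strip() for t in if_none_match.split(",")]
    return "*" in candidates or strip_weak(etag) in [strip_weak(c) for c in candidates]


if __name__ == "__main__":
    headers = static_page_headers(b"<p>Be kind.</p>", datetime.now(timezone.utc))
    print(headers)
```

When the client's ETag still matches, the server can answer with an empty 304 response instead of resending the page.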

Mihai Limbășan
+1  A: 

If your "thought for the day" page never changes except once a day, maybe the simple thing to do is to run something like this once a day:

wget http://your_site/your_page.php -O /var/www/your_site_directory/your_page.html

(and change links to this page from your_page.php to your_page.html)

Then you will reduce the load on your apache server AND your SQL server...

sebthebert
A: 

I couldn't agree more with Javier's suggestion (generate a static web page). I just want to add one remark to clarify it a little:

Store the static file as ".html", not ".php" or whatever language is used to pull the data from the DB. Serving static files is much faster than starting up a parser or executable. Static files (HTML, GIF, ...) are just passed through to the network, while scripts, CGIs and all other things are started, parsed, executed and whatever else... That requires far more server resources than real static files.

BlaM
A: 

A static file just has the I/O as overhead. Objects cached in memory are great but you still have the overhead of managing those objects, and with heavy usage this becomes tricky. Hence the ease and beauty of static files.

Another benefit is that you can have processes that are NOT part of the web server's threads perform updates and maintenance. If your update service locks up, it will not lock up your web server.

David Robbins