I'm involved in a project that will end up creating around 10 million new pages on an existing site. The site, and the new project, are built with CodeIgniter and connect to MySQL.

I've never dealt with a site of this size before, and I'm concerned about how we should handle caching. Has anyone dealt with caching on a PHP site of this size who could give me some pointers? I'm used to the CodeIgniter caching system and similar, but the number of cache files that would be created worries me.

Any suggestions would be appreciated.

+2  A: 

I haven't done anything on that scale, but I don't see a problem with file-based caching as long as the caching mechanism isn't completely dumb, and you're using a modern filesystem. Distributing cache files throughout a directory tree is smart enough.
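As a rough illustration of what "distributing cache files throughout a directory tree" can mean (a sketch of my own, not CodeIgniter's built-in behaviour; the helper name and the two-level layout are invented for the example): hash the cache key and use the first characters of the hash as subdirectories, so no single folder ever holds millions of files.

    <?php
    // Hypothetical helper: derive a nested path from a cache key so files are
    // spread across 256 x 256 subdirectories instead of one flat folder.
    function cache_path_for($key, $base = '/var/cache/myapp')
    {
        $hash = md5($key);
        $dir  = $base . '/' . substr($hash, 0, 2) . '/' . substr($hash, 2, 2);

        if (!is_dir($dir)) {
            mkdir($dir, 0755, true);   // build the two-level tree on demand
        }

        return $dir . '/' . $hash . '.cache';
    }

    // Usage: write and read one cached page.
    file_put_contents(cache_path_for('page:12345'), '<p>rendered page</p>');
    $html = file_get_contents(cache_path_for('page:12345'));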

If you're worried, that's good. Of course, I would suggest writing a wrapper around CI's built-in mechanism, so that you can easily swap it out for something else (like Zend_Cache, possibly with a beefy memcached server, or some smarter file-based system of your own design).
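A minimal sketch of such a wrapper, under my own naming (the Page_cache interface and the memcached backend below are assumptions, not CI code): application code talks only to the interface, so the file-based store can later be swapped for memcached or Zend_Cache without touching the callers.

    <?php
    // Hypothetical wrapper interface: callers depend on this only, so the
    // storage backend can be replaced without changing application code.
    interface Page_cache
    {
        public function get($key);                // cached value, or FALSE on a miss
        public function set($key, $value, $ttl);  // store value for $ttl seconds
    }

    // One possible backend, using the memcached PECL extension.
    class Memcached_page_cache implements Page_cache
    {
        private $mc;

        public function __construct($host = '127.0.0.1', $port = 11211)
        {
            $this->mc = new Memcached();
            $this->mc->addServer($host, $port);
        }

        public function get($key)
        {
            return $this->mc->get($key);
        }

        public function set($key, $value, $ttl)
        {
            return $this->mc->set($key, $value, $ttl);
        }
    }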

timdev
+1  A: 

There are several layers of caching available to PHP and CodeIgniter, and you shouldn't have to worry about the number of cache files on a standard Linux server (common file systems can handle hundreds of millions of files per mount point). But to pick your caching method, you need to measure carefully.

Options:

  • Opcode caching (Zend, eAccelerator, and more)
  • CodeIgniter view caching (configured per view; see the sketch after this list)
  • CodeIgniter read query caching (also shown in that sketch)
  • General web caching (more info)
  • Optimize your database (more info)

(and so on)
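To make the two CodeIgniter options above concrete, here is a minimal sketch (the controller, table, and cache lifetime are invented for the example; query caching additionally needs a writable cachedir set in the database config):

    <?php
    class Pages extends CI_Controller {

        public function show($id)
        {
            // View (full output) caching: CodeIgniter stores the rendered page
            // and serves it directly for the next 60 minutes.
            $this->output->cache(60);

            // Read query caching: the result of this SELECT is written to disk
            // and reused until the cache file is deleted.
            $this->db->cache_on();
            $page = $this->db->get_where('pages', array('id' => $id))->row();
            $this->db->cache_off();

            $this->load->view('page', array('page' => $page));
        }
    }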

Additionally, you can improve the file caches by using memory file systems and in-memory tables.
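For instance (assuming a Linux host where /dev/shm is a tmpfs mount), CodeIgniter's file cache can be made memory-backed simply by pointing its cache path at such a mount:

    <?php
    // application/config/config.php (excerpt)
    // /dev/shm is typically a tmpfs on Linux, so cache files written here
    // live in RAM; any dedicated tmpfs mount works the same way.
    $config['cache_path'] = '/dev/shm/ci_cache/';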

The real question is, how do you pick caching strategies? Capacity planning. You model your system (users, accounts, pages, files), simulate, measure, and add caches based on best theories. Measure again. Produce new theories and measurements until you have approaches that fit your desired scale.
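To make the "measure" step concrete, CodeIgniter's Benchmarking class can time a code path before and after a cache is put in front of it (a sketch; the marker names and the view are arbitrary):

    <?php
    // Inside any controller method: time the expensive section, then compare
    // the elapsed time with and without a cache in front of it.
    // ($data is assumed to have been prepared earlier in the method.)
    $this->benchmark->mark('render_start');

    $html = $this->load->view('page', $data, TRUE);  // code path being measured

    $this->benchmark->mark('render_end');
    log_message('info', 'render took ' .
        $this->benchmark->elapsed_time('render_start', 'render_end') . ' seconds');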

In my experience, view caching and web caching are a big gain for widely read sites (WPSuperCache, for example). Opcode caching (and other forms of minimisation) are useful for heavily dynamic sites, as is database performance tuning.

Bruce Alderson
A: 

FYI: if the system runs on a Windows server, Windows can (or at least could) hold at most roughly 65,000 files in a single folder, including cache folders. I'm not sure whether this upper limit has been lifted in newer versions.

Kristoffer Bohmann
A: 

All the big players use APC. The number of web pages is not relevant; the relevant number is the number of hits (pageviews). And if you're designing for speed, ditch the Windows machines.
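For what that looks like in practice (a sketch using APC's user-data functions; the key, the TTL, and the render_page() helper are invented for the example): APC caches compiled opcodes automatically once the extension is enabled, and it also exposes a small shared-memory store for your own data.

    <?php
    // Serve a rendered page from APC shared memory; rebuild only on a miss.
    $page_id = 12345;                       // example id
    $key     = 'page:' . $page_id;

    $html = apc_fetch($key, $hit);
    if (!$hit) {
        $html = render_page($page_id);      // hypothetical expensive render
        apc_store($key, $html, 300);        // keep it for 5 minutes
    }

    echo $html;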

Elzo Valugi