Hi,

I run a write-heavy community website with about 12,000 users and at most 100 concurrent users, hosted on a single VPS with 1 GB of RAM. The load rarely goes above 3 and response times are quite good.

Currently a simple file cache stores DB query results to ease the load on the DB, but in a load test the website still slows down above 220 concurrent users.

How can I find out what the bottleneck is?

I assume the DB is fine because the cache is working, but disk IO could be a problem. Each page load has about 10 includes and 10-20 queries served from the DB or from the file cache, plus lots of PHP processing.

I tried using memcache instead of the file cache, but to my surprise the site did better in the load test with the file cache.

I plan to use Alternative PHP Cache (APC), but I still don't really understand how that cache is invalidated. I have a single index.php that handles all requests. Will the cache store the result for each individual request? Will it clear the cache automatically if one of my includes (or a query result from the cache) changes?

Any other suggestions for finding bottlenecks (tried xdebug)?

Thanks, Hamlet

A: 

MySQL has its own query cache.

You can enable it by setting query_cache_size to more than 0.

The query results are taken from the cache if the query is repeated verbatim and does not contain certain things like non-deterministic functions, session variables and some other things described here:

The cached result for a query is invalidated by any DML operation against any of the underlying tables.

Quassnoi
+1  A: 

You mention you used XDebug - what weren't you able to do? Typically, to start tracking down a bottleneck you enable profiling of a request and then view the resulting "cachegrind" file in KCacheGrind or WinCacheGrind.

As for using a cache system, a dynamic script such as yours will generally do something like this:

  • construct a cache "key" from the unique inputs to the script
  • ask the caching system if it has data for that key. If it has, you're good to go!
  • otherwise, do all the hard work to generate the data, and ask the caching system to store it under the desired key for next time.
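
The three steps above can be sketched roughly as follows. This is a minimal illustration in Python rather than PHP, and the names `make_key`, `get_or_compute` and the in-process dictionary `_cache` are hypothetical stand-ins for whatever cache backend (memcached, APC user cache, files) is actually used.

```python
import hashlib
import json

# Hypothetical in-process store standing in for a real cache backend.
_cache = {}

def make_key(script, params):
    """Build a deterministic key from the script name and its unique inputs."""
    # sort_keys makes the key independent of parameter order.
    canonical = json.dumps(params, sort_keys=True)
    digest = hashlib.sha1(canonical.encode("utf-8")).hexdigest()
    return script + ":" + digest

def get_or_compute(key, compute):
    """Return the cached value for key; on a miss, compute and store it."""
    if key in _cache:
        return _cache[key]      # hit: skip the hard work
    value = compute()           # miss: do the hard work once
    _cache[key] = value         # store it for next time
    return value
```

A caller would build the key from the request inputs, for example `make_key("index.php", {"page": "home"})`, and pass the expensive page or query generation as `compute`. A real backend would also need an expiry or explicit invalidation policy, which this sketch leaves out.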

APC Cache can help to speed things up further by caching the parsed version of the PHP code.

Paul Dixon
+4  A: 

I plan to use Alternative PHP Cache, but I still don't really understand how that cache is invalidated. I have a single index.php that handles all requests. Will the cache store the result for each individual request? Will it clear the cache automatically if one of my includes (or a query result from the cache) changes?

APC doesn't cache output. It caches your compiled bytecode.

Essentially, a normal PHP request looks like this:

  1. PHP files are parsed and compiled to bytecode
  2. The PHP interpreter executes the bytecode

APC caches the result of the first step, so you aren't reparsing/recompiling the same code over and over again. By default, it still stat()s your PHP files on every request, to see if the file has been modified since its cached copy was compiled -- so any changes to your code will automatically invalidate the cached copy.
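
To make the stat-based invalidation concrete, here is a small sketch of the same idea in Python. It is not APC's actual implementation; `load_compiled` and `_compiled` are hypothetical names, and the sketch only illustrates "compile once, re-check the file's modification time on every request".

```python
import os

# Maps file path -> (mtime when compiled, compiled code object).
_compiled = {}

def load_compiled(path):
    """Return compiled code for path, recompiling only if the file changed."""
    mtime = os.stat(path).st_mtime_ns   # the per-request stat() check
    entry = _compiled.get(path)
    if entry is not None and entry[0] == mtime:
        return entry[1]                 # unchanged: reuse the cached bytecode
    with open(path, "r", encoding="utf-8") as f:
        code = compile(f.read(), path, "exec")   # parse + compile step
    _compiled[path] = (mtime, code)
    return code
```

Note that what is cached is the compiled code, not the output of running it, so the script still executes on every request.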

You can also use APC much like you'd use memcached, for storing arbitrary user data. Keep in mind, however:

  1. A memcached server can serve data to multiple servers; data cached in APC can only really be used locally. Better to serve a gig of data from one memcached box to four servers, than to have 4 copies of that gig of data in APC on each individual server.
  2. Memcached, in my experience, is better at handling large numbers of concurrent writes to a single cache key.
  3. APC doesn't seem to cope very well with its cache filling up. Fragmentation increases, and performance drops.

Also, beware: unless you've set up some sort of locking mechanism, your file-based cache is likely to become corrupt due to simultaneous writes. If you have implemented locking, that may become a bottleneck of its own. IMO, concurrency is tricky -- let memcached/APC/the database deal with it.
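
If you do keep a file cache, one common way to avoid half-written files without explicit locks is to write to a temporary file and rename it into place. The sketch below shows the idea in Python; `write_cache_atomic` is a hypothetical helper, and it assumes the temporary file and the target live on the same filesystem. PHP's `rename()` can be used for the same pattern.

```python
import os
import tempfile

def write_cache_atomic(path, data):
    """Write data (bytes) to path so readers never see a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file next to the target so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".cache-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
        os.replace(tmp_path, path)   # swap the complete file into place
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

With simultaneous writers the last rename wins, but each reader sees either the old complete file or the new complete file. It does not stop several requests from regenerating the same entry at once.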

Frank Farmer
Not only does APC cope badly when it fills up, it also handles large amounts of memory poorly. It crashes unexpectedly, and it becomes impossible to delete keys.
Sergi
I have no personal experience with it (I'm only storing a few megabytes' worth of keys in APC), but in their presentations, Facebook claims to use multiple gigabytes of APC cache on every server, in addition to their massive memcached install. They're using a custom build of APC though, so it's possible they've made some tweaks that aren't in the public version of APC (yet).
Frank Farmer
A: 

I turned on and configured APC on the test server and got a performance increase of about 400%.

300 concurrent users with a maximum response time of 1.4 seconds :) Good for a start.


Update:

Live server test results

Original:

No APC: 220 concurrent users, server load 20, response time 5000ms

No APC: 250 concurrent users, server load 20+, site is unavailable

New:

APC enabled: 250 concurrent users, server load 2, response time is 600ms

APC enabled: 350 concurrent users, server load 10, response time is 1500ms

APC enabled: 500 concurrent users, server load 20, response time is 5000ms; the site is a bit slow but fully operational and can be used normally

Thanks for the suggestions, this is a pretty great improvement.

The query cache is disabled because the site is write heavy, so cached results would constantly be invalidated for whole tables.

hamlet
A: 

I would say it's likely that your database is IO bound. I don't know exactly what a "VPS" is, but if it's some kind of VM, then IO performance is almost guaranteed to be very poor.

Get it onto real hardware ASAP, and get a sensible amount of RAM (1G is tiny; 16G sounds more reasonable).

Then you may be able to tune your DB so it can behave properly. How big is your data in total? If you can get all of it (or most of it) to fit in your database cache (not the dodgy query cache, the proper InnoDB buffer pool), then do so.

I'm assuming you're using the InnoDB engine; if so, then set up the buffer pool to be big enough for all your data. If you don't have enough RAM, buy more until you do (No, really!).

Then your db queries should be fast even if they're fairly bad (yes).

The tricky bit, if you have a single machine, is how to carve up RAM between MySQL and PHP. The web server (I assume Apache), particularly if you use prefork and a high MaxClients, can use up loads of RAM and deprive your database of it.

Get some decent monitoring on the job (with trending), and make changes carefully and record exactly when you made them.

MarkR
Ok... we need a decent server with loads of RAM. My next question would be, how to earn money with a website? :)
hamlet
To run a web site, you need application developers, who generally require a salary. Unless you're hiring very cheap ones, this salary is likely to be a lot more than the cost of one reasonably capable server. It is not a good use of limited funds to pay developers to try to work around a useless hosting platform.
MarkR