I have a PHP program that was written with a single server in mind, so there are inherent limitations to how much it can handle. For example, the developer says that his current web hosting service provides him with "50 MySQL connections", which he interprets to mean that only 50 people can be logged in simultaneously.

What do we need to do if we want to scale it up so it can handle a load of 500 or more? How can we adapt this program to a "load balancer" with minimal changes?

The application is written in PHP and uses MySQL.

A: 

A MySQL connection only counts as in use while it is open. For example, if I request a page from a web server, it will only take 50 ms or so (a guess) to build the page and send it to me. That means one MySQL connection can serve 20 pages per second. What the user does with the page after it's sent to their browser doesn't affect your server's performance.

So with 50 connections you'd have to be getting around 1000 hits per second to cause issues.
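The arithmetic behind that estimate can be sketched in a few lines of PHP. The function name is made up for illustration, and the 50 ms figure is the same guess as above, not a measurement; it also assumes every request holds exactly one connection for its whole duration.

```php
<?php
// Back-of-the-envelope estimate; all figures are assumptions, not measurements.

/**
 * Rough upper bound on requests per second for a given connection limit,
 * assuming each request holds exactly one connection for $holdMs milliseconds.
 */
function max_requests_per_second(int $connections, float $holdMs): float
{
    return $connections * (1000.0 / $holdMs);
}

echo max_requests_per_second(1, 50.0), "\n";   // one connection, 50 ms per page
echo max_requests_per_second(50, 50.0), "\n";  // the host's limit of 50 connections
```

With these inputs the two lines print 20 and 1000, matching the numbers above. Real pages vary, so treat the result as a ceiling rather than a prediction.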

If it does become a problem you can always close the MySQL connections earlier (rather than letting them terminate when the script does).
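A minimal sketch of closing early with `mysqli` might look like the following. The host, credentials, database name, and the `articles` table are placeholders, and the two helper functions are hypothetical; the point is only that the data is fetched first and the connection is closed before the slower rendering work.

```php
<?php
// Sketch only: credentials and the `articles` table are placeholders.

/** Fetch everything the page needs in one go. */
function fetch_titles(mysqli $db): array
{
    $result = $db->query('SELECT title FROM articles ORDER BY id DESC LIMIT 10');
    if ($result === false) {
        throw new RuntimeException($db->error);
    }
    $titles = [];
    while ($row = $result->fetch_assoc()) {
        $titles[] = $row['title'];
    }
    $result->free();
    return $titles;
}

/** Pure rendering step: needs no database connection at all. */
function render_titles(array $titles): string
{
    $items = array_map(
        fn(string $t): string => '<li>' . htmlspecialchars($t) . '</li>',
        $titles
    );
    return '<ul>' . implode('', $items) . '</ul>';
}

// Usage (needs a reachable MySQL server, so it is left commented out):
// $db = new mysqli('localhost', 'app_user', 'secret', 'app_db');
// $titles = fetch_titles($db);
// $db->close();                 // the connection is free for another request from here on
// echo render_titles($titles);  // rendering no longer holds a connection
```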

To be perfectly honest I don't think it is the problem you think it is - if you get enough traffic for it to truly become an issue then that is what we call 'a good problem to have' :)

DCD
+4  A: 

Regardless of any restrictions set up by a host, you should sensibly limit your queries whenever possible. This goes if you are on shared hosting, or own 10 racks full of servers that are entirely at your disposal.

The fewer queries needed to render a page:

  • The faster database connections are closed, allowing the RDBMS to release memory
  • The faster connection resources are closed, allowing your application to release memory
  • The faster HTTP server processes exit, allowing them to release memory
  • The faster the user gets the information they were looking for

A typical shared web host will (as you note) have a single-server mentality. Running an RDBMS on the same computer as a web server is almost never a good idea if you want to scale. Why? Both have to allocate way more memory than they actually need or use in order to be able to deal with requests and return the data that is asked for. This is especially true for any RDBMS that supports type affinity.

Also, take a look at how long the queries that you actually need are taking to return. The faster they finish, the faster resources are released (and pretty much everything else in the list above).
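One simple way to see how long a query takes from the application side is to wrap it in a timer. This is only a sketch: `timed()` is a hypothetical helper, and the closure below sleeps instead of talking to a database so the example runs anywhere. In a real page the closure body would be the actual query call.

```php
<?php
/**
 * Run $work and return [result, elapsed milliseconds].
 * hrtime() requires PHP 7.3 or later.
 */
function timed(callable $work): array
{
    $start = hrtime(true);                        // monotonic clock, nanoseconds
    $result = $work();
    $elapsedMs = (hrtime(true) - $start) / 1e6;   // nanoseconds to milliseconds
    return [$result, $elapsedMs];
}

// Stand-in for a real query; replace the body with e.g. $db->query(...).
[$rows, $ms] = timed(function () {
    usleep(2000); // simulate roughly 2 ms of database work
    return ['row1', 'row2'];
});

printf("query returned %d rows in %.2f ms\n", count($rows), $ms);
```

Logging these timings per query makes it easier to spot which ones hold a connection the longest.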

This means that the less time your app spends connected to the database, the less likely you are to hit a connection limit. 50 connections can serve 500 or more users, whether you get there by limiting queries, optimizing them, or both.

Take a good look at your app. Where can you implement caching for information that is not likely to change on every page load? How can you make better use of sessions? Is that groovy ajax interface making a query for EVERY event?
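As a rough illustration of caching data that rarely changes, here is a small file-based cache with a time-to-live. The `cached()` helper, the cache file location, and the category list are all made up for the example; on a real host you might prefer APCu or memcached if they are available.

```php
<?php
/**
 * Return a cached value if it is younger than $ttl seconds; otherwise
 * call $produce, store its result, and return it.
 * File-based for simplicity; key and location are illustrative.
 */
function cached(string $key, int $ttl, callable $produce)
{
    $file = sys_get_temp_dir() . '/cache_' . md5($key) . '.ser';

    if (is_file($file) && (time() - filemtime($file)) < $ttl) {
        return unserialize(file_get_contents($file));
    }

    $value = $produce();
    file_put_contents($file, serialize($value), LOCK_EX);
    return $value;
}

// Stand-in for an expensive query whose result rarely changes.
$calls = 0;
$loadCategories = function () use (&$calls) {
    $calls++;
    return ['news', 'sports', 'tech'];
};

$key = 'categories_' . uniqid();   // unique key so the demo starts with a cold cache
$first  = cached($key, 300, $loadCategories);
$second = cached($key, 300, $loadCategories);
// After both calls $calls is 1: the second call was served from the cache file.
```

Every cache hit is a page load that never opens a database connection for that piece of data.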

Most people already ensure this is not the case, so questions like this would fall into the micro-optimization category. It's really a fundamental design concept.

Design it to scale and work around such constraints, and you can usually avoid the constraints until time and money permits addressing them.

Also, a side note, a VPS where you control everything is almost as cheap as a typical re-seller hosting account. Why not build your own sandbox and play by your own rules?

As for load balancing, first decide on a replication scheme. You then decide how best to distribute the work. In some instances, you can read from a slave and write to the master; in other cases you need to employ some kind of reverse proxy, be it hardware or software. Your question is a little too generic to offer a more comprehensive answer in that regard.
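One common way to distribute work under replication is to send reads to a replica and writes to the primary server. The sketch below shows only the routing decision; `route_query()` and its rule are hypothetical, and a real application would also have to account for replication lag and transactions.

```php
<?php
/**
 * Decide which server a statement should go to.
 * Hypothetical rule: plain SELECTs go to a replica, everything else to the primary.
 */
function route_query(string $sql): string
{
    $isRead = preg_match('/^\s*SELECT\b/i', $sql) === 1
        && stripos($sql, 'FOR UPDATE') === false;   // locking reads need the primary
    return $isRead ? 'replica' : 'primary';
}

echo route_query('SELECT * FROM users WHERE id = 1'), "\n";
echo route_query("UPDATE users SET name = 'x' WHERE id = 1"), "\n";
```

Keeping the decision in one function means the rest of the application only needs a small change at the point where it obtains its connection.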

Tim Post
I can't agree with the "limit your queries whenever possible" axiom. The number must just be sensible, not lowered at any cost. That axiom usually produces odd questions like "how to get completely different info in one query". Most of the time the number of queries is a matter of proper architecture, not performance. And performance tuning **is not** lowering the number of queries. But *profiling*-based query/database optimisation is.
Col. Shrapnel
@Col. Shrapnel: By saying `whenever possible`, I assume `sensibility` goes into whatever process determines `possible`. I will edit for clarity, however.
Tim Post
Well, it still doesn't answer the load balancing question or the real performance optimisation question
Col. Shrapnel
@Col. Shrapnel: You, my overly blunt _pen pal_, are free to expand your answer as well :)
Tim Post
@Tim - thanks Tim for the explanatory answer. Do you have any insights into how I can "stress test" this simple application to see at how many concurrent users it breaks?
Dave
@Dave - You can use tools like Apache's `ab` to start. I don't know of a one-size-fits-all stress testing tool. What's most important is query profiling, especially under load. A query that takes 3 ms with 10 users might take longer with 100 users. I'd also not recommend doing that on a shared host; you'll just annoy them and get bad data.
Tim Post
@Tim - wonderful! Thanks for guiding!
Dave