views:

45

answers:

2

I have a web service running written in PHP-MYSQL. The script involves fetching data from other websites like wikipedia,google etc. The average execution time for a script is 5 secs(Currently running on 1 server). Now I have been asked to scale the system to handle 60requests/second. Which of the approach should I follow. -Split functionality between servers (I create 1 server to fetch data from wikipedia, another to fetch from google etc and a main server.) -Split load between servers (I create one main server which round robin the request entirely to its child servers with each child processing one complete request. What about MYSQL database sharing between child servers here?)

+1  A: 

I'm not sure what you would really gain by splitting the functionality between servers (option #1). You can use Apache's mod_proxy_balancer to accomplish your second option. It has a few different algorithms to determine which server would be most likely to be able to handle the request.

http://httpd.apache.org/docs/2.1/mod/mod_proxy_balancer.html

Apache/PHP should be able to handle multiple requests concurrently by itself. You just need to make sure you have enough memory and configure Apache correctly.

Jimmy Z
+1  A: 

Your script is not a server it's acting as a client when it makes requests to other sites. The rest of the time its merely a component of your server.

Yes, running multiple clients (instances of your script - you don't need more hardware) concurrently will be much faster than running the sequentially, however if you need to fetch the data synchronously with the incoming request to your script, then coordinating the results of the seperate instances will be difficult - instead you might take a look at the curl_multi* functions which allow you to batch up several requests and run them concurrently from a single PHP thread.

Alternately, if you know in advance what the incoming request to your webservice will be, then you should think about implementing scheduling and caching of the fetches so they are already available when the request arrives.

symcbean