I have to build a web application that's similar to a "branch/store locator". A user types in an address and the web application plots nearby stores on a map.

One of the requirements is:

"The web application must support 100 simultaneous users and up to 5GB/day transfer volume."

Much of the transferred data will be text and GUI images.

So my questions are:

  1. Is this considered a high traffic application?
  2. What web application/site can I look to for comparable traffic?
  3. Do I need to implement things like memcached, template caching, server load balancing, etc.?

I've worked on high traffic applications before, but I was never the architect. So although I'm aware of some (not all) strategies for managing high-traffic scenarios, I'm not familiar with their actual implementation.

Can someone offer me advice, feedback, or suggested research? Did I overlook anything?

Also, I'm building this on LAMP with Smarty.

+1  A: 

Before diving into the hardcore server side stuff (load balancing and memcached) make sure you understand and implement all (or most) rules from YSlow: http://developer.yahoo.com/yslow/help/

Then, if MySQL turns out to be a bottleneck, get a copy of High Performance MySQL or simply read up on how to tune your queries/design at www.mysqlperformanceblog.com.

5GB per day is not that much.

cherouvim
+1  A: 

On a 100Mbps link you can transfer up to

100 * 60 * 60 * 24 / 1024 / 8 = 1054 GB per day

5GB/day represents roughly 0.5% of that, so I don't think you have to care about traffic at this scale; many other things are likely to become your bottleneck before that (JavaScript, database access, etc.). Furthermore, once you know you have enough bandwidth available, you should not worry about such things before writing (and benchmarking) your app, as this may lead to premature optimization.
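For anyone who wants to re-check the arithmetic above, here is a minimal Python sketch of the same calculation. The variable names are mine and purely illustrative; the only inputs are the 100Mbps link and the 5GB/day figure from the question, and it assumes the link could be saturated all day, which is an upper bound rather than a realistic figure.

```python
# Daily capacity of a 100 Mbps link versus the 5 GB/day requirement.
LINK_MBPS = 100                   # link speed in megabits per second
SECONDS_PER_DAY = 60 * 60 * 24    # 86400
REQUIRED_GB_PER_DAY = 5           # from the question

# megabits per day -> megabytes per day (/8) -> gigabytes per day (/1024)
capacity_gb_per_day = LINK_MBPS * SECONDS_PER_DAY / 8 / 1024

# Fraction of the link's daily capacity that the requirement would use.
share = REQUIRED_GB_PER_DAY / capacity_gb_per_day

print(f"capacity: {capacity_gb_per_day:.1f} GB/day")
print(f"required share: {share:.2%}")
```

The result is a little under 0.5% of the link, which matches the rough estimate above.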

I found the scaling section in the Django book interesting as a general piece of knowledge in this field.

Luper Rouch
JavaScript as a server-side bottleneck?
Calvin
In terms of user experience there is no client/server-side distinction. Isn't user experience the metric that should drive optimizations in a web application? (Who cares about your blazingly fast load-balanced server if a piece of JavaScript doubles your pages' load time?) That is what I meant; maybe the term bottleneck was inappropriate.
Luper Rouch
How is that specific to high volume websites? This question is about design considerations for handling high volumes of traffic. JavaScript or any other client-side code is not affected by traffic volume.
Calvin
This question is about design considerations for handling 5GB/day.
Luper Rouch
Right, so how does JavaScript relate to traffic volume?
Calvin
It doesn't. The question doesn't relate to traffic volume either as everyone here seems to agree, unless the site is hosted on a 56K line :)
Luper Rouch
You never know... =p
Calvin
+1  A: 

That's only about 60 KB per second (assuming the traffic is spread evenly over 24 hours), but you'll likely hit bursts during peak hours, so you'll need to be able to handle those. 100 simultaneous users is nothing, even for an older Apache-based server.
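As a rough check of that average rate, here is a small Python sketch. The peak factor of 10 is purely an assumption for illustration (nothing in the question states how bursty the traffic will be), and the names are hypothetical.

```python
# Average transfer rate implied by 5 GB/day, plus a hypothetical peak.
BYTES_PER_DAY = 5 * 1024**3       # 5 GB/day, from the question
SECONDS_PER_DAY = 60 * 60 * 24
PEAK_FACTOR = 10                  # ASSUMPTION: peak traffic is 10x the average

avg_kb_per_s = BYTES_PER_DAY / SECONDS_PER_DAY / 1024   # kilobytes per second
avg_kbit_per_s = avg_kb_per_s * 8                       # kilobits per second
peak_kb_per_s = avg_kb_per_s * PEAK_FACTOR

print(f"average: {avg_kb_per_s:.1f} KB/s ({avg_kbit_per_s:.0f} kbit/s)")
print(f"assumed peak: {peak_kb_per_s:.0f} KB/s")
```

Even with that assumed tenfold burst the rate stays well under 1 MB/s, so bandwidth is unlikely to be the first limit you hit.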

I'm not sure memcached would help you all that much, but it's worth adding, as is APC for your PHP caching. I would at least architect the application so that it can be load balanced; check out ultramonkey for some good documentation on how to get that going transparently. You'll need to ensure that user session data isn't stored in a per-host way: consider what happens if a user hits load-balanced server A in one call and then hits server B in the next (i.e. store the user ID and session data in the DB, not in the filesystem).
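To illustrate the session point, here is a minimal sketch in Python rather than PHP, just to keep it short and runnable. Everything in it is an assumption for illustration: the `sessions` table, the function names, and the in-memory SQLite database standing in for the shared MySQL server. The idea is only that both "servers" read and write session data through the same central store instead of their own local disks.

```python
import json
import sqlite3

# ASSUMPTION: an in-memory SQLite database stands in for the central MySQL
# server that every web host behind the load balancer would connect to.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE sessions (id TEXT PRIMARY KEY, data TEXT NOT NULL)")

def save_session(session_id, data):
    """Store the session as JSON in the shared table (hypothetical helper)."""
    db.execute(
        "INSERT OR REPLACE INTO sessions (id, data) VALUES (?, ?)",
        (session_id, json.dumps(data)),
    )
    db.commit()

def load_session(session_id):
    """Return the stored session dict, or an empty dict if none exists."""
    row = db.execute(
        "SELECT data FROM sessions WHERE id = ?", (session_id,)
    ).fetchone()
    return json.loads(row[0]) if row else {}

def server_a_handle(session_id, address):
    """First request lands on server A, which records the user's address."""
    session = load_session(session_id)
    session["address"] = address
    save_session(session_id, session)

def server_b_handle(session_id):
    """Next request lands on server B, which still sees the same session."""
    return load_session(session_id).get("address")

server_a_handle("abc123", "1 Main Street")
result = server_b_handle("abc123")
print(result)
```

In a real LAMP setup the same idea would be implemented with a custom PHP session handler backed by MySQL (or memcached), but the principle is the same: no request should depend on which host served the previous one.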

gbjbaanb
