I know how to create small data-driven websites, but I want to get an idea of how to scale them to handle large volumes of data and traffic.

The questions assume a site that would work mostly like Stack Overflow or Craigslist, where people post items, others reply, and there is basic search based on tags.

  1. Are regular relational databases like SQL Server, Oracle, etc. strong enough to support a lot of data reads and writes?

  2. If I have a site hosted on a dedicated single server, how much traffic in general can I expect it to handle?

  3. Are there any general design rules or problems that need to be taken into account when creating mid- to large-scale applications?

+2  A: 
  1. Yes, but write your queries wisely and make use of caching.
  2. Depends on the hardware, OS and webserver.
  3. Check out 3-tier architecture.
Gert G
+3  A: 
  1. With a good caching strategy and well-written SQL statements, any RDBMS should be sufficient.

  2. The short answer is that it depends. There's a good discussion on this very topic here.

  3. I would suggest you start by reviewing this post. Just following basic coding practices will help make your code more scalable.
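
To make point 1 a bit more concrete, here is a minimal sketch of a read-through cache in front of a parameterized, indexed tag query. Everything in it is an assumption for illustration: the `posts` table, the `TTLCache` class, the `posts_by_tag` helper, and the 30-second expiry are made up, and SQLite stands in for whatever RDBMS you actually use. A real site would more likely use a shared cache such as memcached rather than an in-process dict.

```python
import sqlite3
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TTLCache:
    """Tiny in-process cache with per-entry expiry (illustrative only)."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl = ttl_seconds
        self.clock = clock
        self._store: Dict[Hashable, Tuple[float, Any]] = {}

    def get_or_load(self, key: Hashable, loader: Callable[[], Any]) -> Any:
        """Return the cached value, or call loader() and cache its result."""
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: no database round trip
        value = loader()
        self._store[key] = (now + self.ttl, value)
        return value

    def invalidate(self, key: Hashable) -> None:
        """Drop a key, e.g. after a new post is written for that tag."""
        self._store.pop(key, None)


# Hypothetical schema: an in-memory SQLite database standing in for the RDBMS.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, tag TEXT, title TEXT)")
conn.execute("CREATE INDEX idx_posts_tag ON posts(tag)")  # tag lookups use an index
conn.executemany(
    "INSERT INTO posts (tag, title) VALUES (?, ?)",
    [("sql", "Index tuning"), ("sql", "Join order"), ("ajax", "Partial updates")],
)

cache = TTLCache(ttl_seconds=30.0)  # assumed expiry; tune for your data
stats = {"db_queries": 0}           # counts real database hits


def posts_by_tag(tag: str):
    """Return (id, title) rows for a tag, hitting the database only on a miss."""
    def load():
        stats["db_queries"] += 1
        # Parameterized query: lets the database reuse the plan and avoids injection.
        return conn.execute(
            "SELECT id, title FROM posts WHERE tag = ? ORDER BY id", (tag,)
        ).fetchall()

    return cache.get_or_load(("posts_by_tag", tag), load)
```

Calling `posts_by_tag("sql")` twice in a row runs the SQL once; the second call is served from memory until the entry expires or is invalidated. The hard part in practice is deciding when to invalidate, which this sketch leaves to the caller.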

BenV
+1 for 'this post'. Thanks!
sarnold
A: 

Well, two other people already beat me to the caching and SQL query advice. Another thing I would recommend is using AJAX and client-side validation to reduce the number of full page loads and server postbacks.

antonlavey
+1  A: 

Re #2: Use Siege or any relevant web benchmarking tool (Apache ab, perfmon and shell scripts, whatever can hammer the heck out of the server and report on it). Siege acts a bit more like real users would, so I really recommend it. You'll be able to get some real metrics on what your server can handle before it's drowning in the real thing: requests per second, concurrent users, response times, bandwidth usage, etc.

Granted, that won't help a whole lot when you're only at the design phase. In that case, install a handful of OSS web apps with similar concepts and hammer them first. It'll only be a rough estimate since there are so many variables, but that's still better than pulling numbers out of the air.
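
If you want to see what those metrics look like before picking a tool, here is a rough sketch of a tiny load generator. It is not a replacement for Siege or ab, and every name in it (`run_load`, `timed_call`, `fetch_page`, the `http://localhost:8000/` URL) is hypothetical. It fires a fixed number of requests from a thread pool and reports requests per second plus mean, 95th percentile, and maximum response time.

```python
import math
import statistics
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict


def timed_call(request: Callable[[], None]) -> float:
    """Run one request and return how long it took, in seconds."""
    start = time.perf_counter()
    request()
    return time.perf_counter() - start


def run_load(request: Callable[[], None], total_requests: int,
             concurrency: int) -> Dict[str, float]:
    """Issue total_requests calls using `concurrency` worker threads.

    Any exception raised by a request propagates and aborts the run.
    """
    wall_start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = list(
            pool.map(lambda _: timed_call(request), range(total_requests))
        )
    elapsed = time.perf_counter() - wall_start

    latencies.sort()
    # Nearest-rank 95th percentile over the sorted samples.
    p95_index = max(0, math.ceil(0.95 * len(latencies)) - 1)
    return {
        "requests": len(latencies),
        "requests_per_second": len(latencies) / elapsed,
        "mean_ms": statistics.mean(latencies) * 1000,
        "p95_ms": latencies[p95_index] * 1000,
        "max_ms": latencies[-1] * 1000,
    }


def fetch_page(url: str = "http://localhost:8000/") -> None:
    """Fetch one page and discard the body (the URL is an assumed test server)."""
    with urllib.request.urlopen(url, timeout=10) as response:
        response.read()
```

For example, `run_load(fetch_page, total_requests=500, concurrency=25)` against a test server returns a dict of those numbers. Only point it at a machine you own, and remember that a single client box can become the bottleneck long before the server does.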

tadamson