views:

85

answers:

5

Hi all,

I am curious how everyone approaches questions of the following nature:

'We need to create Application A (e.g. an e-commerce site). It will use technology B (e.g. Java). It has to support C (e.g. 200) concurrent users. What kind of hardware do we need?'

For a basic answer, the hardware specification would cover the number of CPUs required and perhaps the amount of memory.

To simplify my example, I will stick with Java in my question, but I would really like technology-neutral advice.

I understand that such a question involves many additional factors, e.g. the choice of framework (Wicket vs Struts vs Spring vs a pure EJB J2EE architecture) and the number of distributed tiers (a one-box setup or a 3-tier setup).

But a person might have no prior experience with the given technology, or no opportunity to run a load test to find out what hardware is required. Such a question always comes up during the initial project discussion, and an answer is essential as a baseline to move forward. So how do you go about giving an answer?

I had thought about solving the memory requirement by estimating the amount of memory each user session might take, but there will definitely be framework/virtual machine overhead on top of that.

In general, though, I just cannot seem to reason out a good solution to this question, which always seems to pop up. A load test would definitely help answer it, but by that point the project is already built, and this is a question a client usually wants answered before committing to a project.

Do hope that the community can advise on good approaches to this.

Thanks.

+1  A: 

You really have to make some assumptions about your user experience to make an initial, non-load-tested estimate for a given number of concurrent users. Start with some assumptions about a user's session. Decent ballpark assumptions could be (these may not hold, depending on the complexity or simplicity of your site):

  • Each user hits a new page or resource every 5 seconds (assuming an AJAX-heavy site).
  • Each request takes an average of 200ms to process.
  • It's generally good to keep average usage at 25% of capacity to allow for spikes, with even more headroom for social network sites, where the spikes could be larger.

Then you would say:

200 users * (200ms / 5s) = 8 CPUs needed on average; 8 CPUs * 4 (to run at 25% load) = 32 CPUs.
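The same back-of-the-envelope arithmetic can be sketched in Python so the assumptions are easy to swap out. This is only an illustration: the function name `estimate_cpus` and its parameter names are made up for this example, and it assumes each request keeps exactly one CPU busy for its whole processing time, which a real application will not match exactly.

```python
import math


def estimate_cpus(concurrent_users, seconds_between_requests,
                  seconds_per_request, target_utilization):
    """Rough CPU estimate; hypothetical helper, not a real sizing tool.

    Assumes every request occupies one CPU for its full processing time.
    """
    # Fraction of the time a single user keeps a CPU busy (0.2s / 5s = 4%).
    busy_fraction = seconds_per_request / seconds_between_requests
    # CPUs needed to carry the average load with no headroom at all.
    average_cpus = concurrent_users * busy_fraction
    # Scale up so the average load only uses target_utilization of capacity.
    # Rounding first guards against floating-point noise before ceil().
    provisioned_cpus = math.ceil(round(average_cpus / target_utilization, 9))
    return average_cpus, provisioned_cpus


# The ballpark numbers from above: 200 users, one request every 5 seconds,
# 200ms per request, average load at 25% of capacity.
average, provisioned = estimate_cpus(200, 5.0, 0.2, 0.25)
print(f"average need: {average:.1f} CPUs, provisioned: {provisioned} CPUs")
```

Changing any one input (say, 500ms per request instead of 200ms) shows how sensitive the result is to the assumptions, which is why this is a starting point and not a substitute for measuring.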

I don't think this is particularly language-centric. Memory isn't expensive, so just make sure you have enough.

stevedbrown
+1  A: 

Personally, I think anyone who tells you there's a general solution to this problem is lying to you. That's especially true if they're a hardware vendor.

You can only make estimates like this based on similar loads to the load you expect. You will later learn that you were mistaken because the real load really isn't that similar to the "similar" load you used as an estimate.

Hopefully, you'll learn from that and do a better estimate next time.

John Saunders
+1  A: 

Honestly, I've seen no good way to get a reasonable estimate short of doing a sample load test. There are simply too many variables in an application of any appreciable size: hardware (memory architecture, number of CPUs, disk architecture), software (implementation details, operating system, virtual machine [if any], database system, etc.), environmental (network, cooling) and others.

This is basically a special application of performance testing. The wisest people on the subject have stated repeatedly and clearly that you need to get numbers. So the best advice I can give is to put working prototypes into the plan at the earliest possible stage so that you can get those numbers. Also plan for those numbers to change over time: get prototypes or working versions at every checkpoint so you can retest all along the project timeline.

Your early estimates will probably bear little resemblance to your final numbers, but at least you'll be able to correct as needed at critical junctures instead of getting to the end and realizing there's no time left in the schedule for fixes.

Lee
A: 

You are asking how to quantitatively size your hardware requirements.

I, however, would approach it from another perspective. Design your app so that it can scale out (at every tier). From there, deploy iteration #1 of your hardware. Run a public beta on that. Collect some real-life numbers. Then reconfigure the system to meet a higher performance target. Repeat.

Jacques René Mesrine
A: 

The hardware required is governed far more by the complexity and quality of the software than by the number of users or the transaction load. The only people who claim otherwise are the hardware vendors.

skaffman