I wish to set up a CPU-intensive time-important query service for users on the internet. A usage scenario is described below. Is cloud computing the right way to go for such an implementation? If so, what cloud vendor(s) cater to this type of application?

I ask specifically in terms of:

1) pricing
2) latency resulting from:
   - slow CPUs, instance creation, JIT compilation, etc.
   - internal management and communication of processes inside the cloud (e.g. a queuing process and a calculation process)
   - communication between the cloud and the end user
3) ease of deployment

A usage scenario I am expecting is:

- A typical user sends a query (XML of size around 1K) once every 30 seconds on average.
- Each query requires a numerical computation of average time 0.2 sec and max time 1 sec on a 1 GHz Pentium. The computation requires no data other than the query itself and is performed by the same piece of code each time.
- The delay a user experiences between sending a query and receiving a response should be on average no more than 2 seconds and in general no more than 5 seconds.
- A background save to a DB of the response should occur (not time critical).
- There can be up to 30000 simultaneous users, i.e., on average 1000 queries a second, each requiring an average 0.2 sec calculation, so that would necessitate around 200 CPUs.
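The sizing arithmetic in the scenario above can be checked with a short script. This is only a rough sketch: the function name `required_cpus` and the `target_utilization` parameter are made up for illustration, and it assumes queries arrive evenly and that one cloud CPU is about as fast as the 1 GHz Pentium used for the timings.

```python
import math


def required_cpus(users, seconds_between_queries, mean_compute_seconds,
                  target_utilization=1.0):
    """Estimate query rate and CPU count for a steady, evenly spread load.

    target_utilization is a hypothetical headroom knob: 1.0 means every CPU
    is busy all the time, 0.7 means CPUs are sized to run at 70% load.
    """
    queries_per_second = users / seconds_between_queries
    # CPU-seconds of work arriving per wall-clock second.
    cpu_seconds_per_second = queries_per_second * mean_compute_seconds
    # Round before taking the ceiling to avoid floating-point noise.
    cpus = math.ceil(round(cpu_seconds_per_second / target_utilization, 9))
    return queries_per_second, cpus


if __name__ == "__main__":
    qps, cpus = required_cpus(users=30000, seconds_between_queries=30,
                              mean_compute_seconds=0.2)
    print(f"{qps:.0f} queries/sec, about {cpus} CPUs at full utilization")
```

With the numbers from the scenario this gives 1000 queries a second and 200 CPUs at 100% utilization. Bursts and the 1 sec worst-case computation would likely call for some headroom on top of that, which is what a `target_utilization` below 1.0 is meant to model.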

Currently I'm looking at GAE Java (for quicker deployment and less IT hassle) and EC2 (for speed and price optimization) as options. Where can I learn more about the right way to set up such a system? Pointers to past threads, blogs, books, etc. are welcome. BTW, if my terminology is wrong or confusing, please let me know.

I'd greatly appreciate any help.

A: 

I attended a talk on the Amazon EC2 platform, and one of the things the chap talked about was instance triggers. Say you have a single instance of your server running; you might want to add another instance if the server starts to deliver slow response times. EC2 allows you to create new instances automatically based on triggers, which means your capacity scales with demand, and that in turn should reduce overall costs.

These triggers can be based on a number of metrics, such as CPU load. For example, you could set a CPU load limit of 70% and spawn another instance whenever the load exceeds it.
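To make the idea concrete, here is a minimal sketch of the decision such a trigger makes. It is plain Python, not the EC2 API: the function `next_instance_count` and all of its parameters are hypothetical, the 70% scale-out limit is the figure mentioned above, and the 30% scale-in limit and the instance bounds are assumptions chosen only for illustration.

```python
def next_instance_count(current, avg_cpu_percent,
                        scale_out_above=70.0, scale_in_below=30.0,
                        min_instances=1, max_instances=200):
    """Return how many instances to run after one evaluation of the trigger.

    All thresholds and bounds here are illustrative assumptions, not
    defaults of any real cloud service.
    """
    if avg_cpu_percent > scale_out_above:
        # Load is too high: add one instance, up to the ceiling.
        return min(current + 1, max_instances)
    if avg_cpu_percent < scale_in_below:
        # Load is low: remove one instance, but keep the minimum running.
        return max(current - 1, min_instances)
    # Load is inside the acceptable band: leave capacity unchanged.
    return current


if __name__ == "__main__":
    count = 1
    for load in [85.0, 90.0, 60.0, 20.0, 10.0, 10.0]:
        count = next_instance_count(count, load)
        print(f"avg CPU {load:.0f}% -> {count} instance(s)")
```

Keeping a gap between the two thresholds is a common way to stop the instance count from flapping when the load hovers near a single limit.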

It is definitely worth a look; it is a platform trusted by many.

I've been trying to find the slides, with little luck. The chap who gave the talk was Simone Bruzzoni, an AWS evangelist. He's on Twitter too and usually replies to people's tweets.

ILMV
Thanks. The trigger mechanism is interesting; I suppose there's also a mechanism for killing instances with low load. However, I was asking in general whether this type of application would benefit from a cloud architecture and, if so, what the considerations are in using e.g. an IaaS vendor like Amazon vs. a PaaS vendor like GAE with respect to the time-important, CPU-intensive nature of the task.
Eric
A: 

I'm not sure that your application is a good fit for GAE, because it places a very low limit on the amount of time that each request is allowed to take. If a request goes above that threshold, it is terminated automatically. I think the intent is to keep GAE running optimally for all users, but it's something you need to consider when designing your application.

For your particular case, I'd probably suggest AWS or Rackspace's cloud offerings.

Petey
The limit is 30 seconds, so you should be OK. I'm more worried about the daily CPU quotas.
Ranieri