Greetings to all the smart people around here!

I'd like to ask whether it is feasible, or a good idea at all, to deploy a Java enterprise web application to a cloud such as Amazon EC2. More precisely, I'm looking for infrastructure options for an application that will handle a few hundred users with long sessions that are neither CPU nor memory intensive. I'm considering dedicated servers, virtual private servers (VPSs) and EC2. I've noticed that there is a project called JBoss Cloud, so people are working on enabling such a deployment. On the other hand, it doesn't seem to be mature yet, and I'm not sure that the cloud is ready for this kind of application, which differs from typical cloud-based applications like Twitter. Would you recommend deploying it to the cloud? What are the pros and cons?

The application is a Java EE 5 web application whose main function is to let users compose their own customized Product by combining the available Parts. It uses stateless and stateful session beans, uses JPA to persist entities to an RDBMS, and fetches information about Parts from the company's inventory system via a web service. Aside from external users, it's also used by a few internal ones, who are authenticated against the company's LDAP. The application should handle around 300-400 concurrent users building their products and should be reasonably scalable and available, though these qualities are only of medium importance at this stage.

I've proposed an architecture consisting of a firewall (FW) and a load balancer supporting sticky sessions and HTTPS (in the cloud this would be replaced with EC2's Elastic Load Balancing service plus a FW on the app servers; in a physical architecture the load balancer would be a hardware appliance), then two clustered application servers combined with web servers (so that if one fails, a user doesn't lose the product he/she has spent a long time building), and finally a database server. The DB server would need a slave backup instance that can replace the master instance if it fails. This should provide reasonable availability and fault tolerance, and good scalability as long as a single RDBMS can keep up with the load. That should be OK for quite a while, because most of the operations are done in memory using a stateful bean, data is only occasionally stored to or retrieved from the DB, and the amount of data is low too. A problematic part could be the dependency on the remote inventory system's web service, but with good caching of its outputs in the application it should be OK too.
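To make the caching idea concrete, here is a minimal sketch of a time-to-live cache that could sit in front of the inventory web service. Everything in it is an assumption for illustration: the class name `PartInfoCache`, the `loader` function standing in for the real web service client, and the choice to fall back to a stale value when the service call fails. It uses Java 8 lambdas, so it would need adapting for a Java EE 5 era runtime.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Hypothetical TTL cache in front of a slow or unreliable remote lookup. */
class PartInfoCache<K, V> {

    /** A cached value together with the time at which it stops being fresh. */
    private static final class Entry<V> {
        final V value;
        final long expiresAt;

        Entry(V value, long expiresAt) {
            this.value = value;
            this.expiresAt = expiresAt;
        }
    }

    private final ConcurrentHashMap<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final Function<K, V> loader;   // assumption: wraps the inventory web service call
    private final long ttlMillis;          // how long a loaded value counts as fresh
    private final LongSupplier clock;      // injectable so tests can control time

    PartInfoCache(Function<K, V> loader, long ttlMillis, LongSupplier clock) {
        this.loader = loader;
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    /** Returns a fresh cached value, reloading it from the loader once it has expired. */
    V get(K key) {
        long now = clock.getAsLong();
        Entry<V> entry = entries.compute(key, (k, old) -> {
            if (old != null && old.expiresAt > now) {
                return old; // still fresh, no remote call needed
            }
            try {
                return new Entry<>(loader.apply(k), now + ttlMillis);
            } catch (RuntimeException ex) {
                if (old != null) {
                    return old; // remote service failed: keep serving the stale value
                }
                throw ex; // nothing cached yet, so the failure has to surface
            }
        });
        return entry.value;
    }
}
```

It could be wired up as `new PartInfoCache<String, String>(id -> client.fetchPart(id), 60_000L, System::currentTimeMillis)`, where `client.fetchPart` is a placeholder for the real call. Note that the remote call runs inside `compute`, which blocks other updates that hit the same part of the map while it is in flight; that is acceptable for a sketch but worth revisiting under real load.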

Unfortunately I have only a vague idea of the system resources (memory size, number and speed of CPUs/cores) that such an "average Java EE application" for a few hundred users needs. My rough and mostly unfounded estimate, based on Amazon's current offerings, is that 1.7GB of RAM and a single 2-core "modern CPU" with a speed of around 2.5GHz (the High-CPU Medium Instance) should be sufficient for each of the two application servers (since we can handle higher load by provisioning more of them). Alternatively I would consider using the Large instance (64-bit, 7.5GB RAM, 2 cores at 1GHz).

So my question is whether such a deployment to the cloud is technically and financially feasible or whether dedicated/VPS servers would be a better option and whether there are some real-world experiences with something similar.

Thank you very much! /Jakub Holy

PS: I've found the JBoss EAP in a Cloud Case Study, which shows that it is possible to deploy a real-world Java EE application to the EC2 cloud, but unfortunately there are no details regarding topology, instance types, or anything else :-(

+2  A: 

I'm serving a "few hundred users" from a single EC2 High-CPU Medium instance. No load balancing, no dedicated DB servers, nothing fancy at all. Simply a single box. Additionally I'm using some services:

  • Elastic Block Store for MySQL data, MySQL binlogs and Lucene indexes
  • S3 for resource and backup storage, obviously with different buckets for each
  • SimpleDB for resource metadata
  • CloudFront for resources - mainly because we can :)
  • Simple Queue Service for messaging (used to queue some background tasks)

As I said, nothing fancy, at least by the standards of Amazon's cloud environment, and all of it costs less than $200/month. Be careful with pricing, though: Amazon did a good job of obfuscating the main costs. For example, looking at CloudFront pricing, you might notice the $0.15 per GB but ignore the $0.01 per 10,000 requests. It's a ridiculously small price for a lot of requests, isn't it? Big surprise: 2/3 of our CloudFront cost is for requests (at about 3 KB per request). I/O requests for EBS are a similar story.
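The 2/3 figure can be checked with a bit of arithmetic. The sketch below uses only the two prices and the 3 KB request size quoted above; the monthly request count is a made-up number (the ratio does not depend on it), and treating 1 GB as 10^9 bytes is an assumption.

```java
import java.util.Locale;

/** Rough split of CloudFront cost between data transfer and per-request fees. */
class CloudFrontCostSketch {
    static final double PRICE_PER_GB = 0.15;            // USD per GB transferred (quoted above)
    static final double PRICE_PER_10K_REQUESTS = 0.01;  // USD per 10,000 requests (quoted above)
    static final double BYTES_PER_GB = 1_000_000_000.0; // assumption: decimal gigabytes

    /** Cost of the bytes transferred for the given number of requests. */
    static double transferCost(long requests, double kbPerRequest) {
        double gigabytes = requests * kbPerRequest * 1000.0 / BYTES_PER_GB;
        return gigabytes * PRICE_PER_GB;
    }

    /** Cost of the per-request fee alone. */
    static double requestCost(long requests) {
        return requests / 10_000.0 * PRICE_PER_10K_REQUESTS;
    }

    /** Fraction of the total cost that is caused by the per-request fee. */
    static double requestShare(long requests, double kbPerRequest) {
        double perRequestFees = requestCost(requests);
        return perRequestFees / (perRequestFees + transferCost(requests, kbPerRequest));
    }

    public static void main(String[] args) {
        long requests = 10_000_000L; // hypothetical monthly request count
        double kbPerRequest = 3.0;   // average response size quoted above
        System.out.printf(Locale.ROOT,
                "transfer: $%.2f, requests: $%.2f, request share: %.0f%%%n",
                transferCost(requests, kbPerRequest),
                requestCost(requests),
                100.0 * requestShare(requests, kbPerRequest));
        // prints: transfer: $4.50, requests: $10.00, request share: 69%
    }
}
```

At 3 KB per request, 10,000 requests move only about 0.03 GB, which costs $0.0045 in transfer against $0.01 in request fees, so small responses make the per-request fee the dominant term.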

Since it would be extremely easy to scale (use a bigger instance, move the DB to Relational Database Service), I'd suggest you start with the same setup. As you said, throwing more boxes in is pretty simple (assuming your setup supports adding/removing nodes on the fly). This makes choosing the appropriate setup by trial and error easily feasible - some thorough load testing should do the job. Choose something that works for your expected load (plus some extra power) and grow/shrink as soon as you have production data.

As a conclusion: yes, it's certainly possible to host JEE apps on EC2 :)

Edit: as a side note: comparing pricing of EC2 with traditional hosting is comparing apples and oranges - at least as long as you don't get an SLA for your network, nearly unlimited scalability, no hardware issues, nearly unlimited and redundant storage, different availability zones and a bunch of extra services with it. If somebody tells you that traditional hosting is cheaper, he might be a sysadmin anxious about his job ;) Don't get me wrong, it is cheaper - but you get much less for a little less money.

And by the way, I'm in no way affiliated with Amazon ... but I feel that I should be rewarded for being a good spokesman, shouldn't I? :D

sfussenegger
Thank you very much for your valuable insights and the side note on pricing as well! You helped me a lot.
Jakub Holý
@Jakub I'm glad that I could help. Another thing though: As soon as you start using Amazon's cloud services, it's extremely tempting to throw in EC2, S3 and more in many situations - just because you can. So that's another risk that's very likely to start driving costs ;)
sfussenegger
And one more thing: If you have services that don't require 100% uptime, you might benefit from spot instances. You typically pay hourly fees similar to those for reserved instances but without an initial payment - you should be able to handle the (mainly theoretical) risk of downtime though. They're very well suited for background processing (e.g. number crunching) where tasks are submitted to an SQS queue. We publish results to S3 and notify subscribers using XMPP PubSub (we're thinking about moving to SNS though). Really cool stuff :)
sfussenegger
@sfussenegger Regarding the temptation to use more services, I'd expect that - this is exactly what Amazon wants :-) And as you pointed out, they're good at hiding costs so one must be careful.
Jakub Holý