views:

272

answers:

9

If you needed to build a highly scalable web application using Java, what framework would you use and why?

I'm just reading Thinking in Java, Head First Servlets and Manning's Spring Framework book, but really I want to focus on highly scalable architectures etc.

Would you use Tomcat, Hibernate, Ehcache?

(Just assume you have to design for scale; I'm not looking for the 'worry about it when you get traffic' type of responses.)

+3  A: 

I would check out Apache Mina. From the home page:

Apache MINA is a network application framework which helps users develop high performance and high scalability network applications easily. It provides an abstract · event-driven · asynchronous API over various transports such as TCP/IP and UDP/IP via Java NIO.

It has an HTTP engine AsyncWeb built on top of it.

A less radical suggestion (!) is Jetty - a servlet container geared towards performance and a small footprint.

Brian Agnew
+1  A: 

All popular modern frameworks (and "stacks") are well-written and don't pose any threat to performance and scaling, if used correctly. So focus on what stack will be best for your requirements, rather than starting with the scalability upfront.

If you have a particular requirement, then you can ask a question about it and get recommendations about what's best for handling it.

Bozho
Sorry but that's vague "everyone is special" type feel-good nonsense and not helpful to anyone.
cletus
But I am feeling warm and fuzzy, and that doesn't scale well lol
mrblah
cletus, I agree, but I don't think you can say anything more concrete to a question like "I want to make something scalable". And that "nonsense" might be a good orientation.
Bozho
+4  A: 

The answer depends on what we mean by "scalable". A lot depends on your application, not on the framework you choose to implement it with.

No matter what framework you choose, the fact is that the hardware you deploy it on will have an upper limit on the number of simultaneous requests it'll be able to handle. If you want to handle more traffic, you'll have to throw more hardware at it and include load balancing, etc.
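As a rough illustration of what "include load balancing" means, here is a minimal round-robin sketch. The class name `RoundRobinBalancer` and the backend names are made up for the example; in practice this job is usually done by a dedicated load balancer in front of the app servers, not by application code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: hands out backend names in rotation.
class RoundRobinBalancer {
    private final List<String> backends;
    private final AtomicInteger next = new AtomicInteger();

    RoundRobinBalancer(List<String> backends) {
        // Defensive copy so later changes to the caller's list have no effect.
        this.backends = new ArrayList<>(backends);
    }

    // Safe to call from many request threads at once.
    String pick() {
        // floorMod keeps the index non-negative even after the counter overflows.
        int i = Math.floorMod(next.getAndIncrement(), backends.size());
        return backends.get(i);
    }
}
```

Adding capacity then amounts to adding another entry to the backend list, which only works smoothly if the backends do not depend on state held in one particular node.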

The part that's pertinent in that case has to do with shared state. If you have a lot of shared state, you'll have to make sure that it's thread safe, "sticky" when it needs to be, replicated throughout a cluster, etc. All that has to do with the app server you deploy it to and the way you design your app, not the framework.
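To make the "thread safe" part concrete, here is a small sketch of shared state that many request threads can update at once. The `HitCounter` class and the page names are invented for the example, and it only covers safety inside a single JVM; replication across a cluster is a separate problem that the app server or an external store has to solve.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical example: a per-page hit counter shared by all request threads.
class HitCounter {
    private final ConcurrentHashMap<String, LongAdder> hits = new ConcurrentHashMap<>();

    // Atomically creates the counter on first use, then increments it.
    void record(String page) {
        hits.computeIfAbsent(page, k -> new LongAdder()).increment();
    }

    // Returns 0 for pages that were never recorded.
    long count(String page) {
        LongAdder adder = hits.get(page);
        return adder == null ? 0 : adder.sum();
    }
}
```

The less of this kind of in-memory shared state you have, the less you need sticky sessions or replication when you add nodes.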

Tomcat's not a "framework", it's a servlet/JSP engine. It's got clustering capabilities, but so do most other Java EE app servers. You can use Tomcat if you've already chosen Spring, because it implies that you don't have EJBs. Jetty, Resin, WebLogic, JBoss, GlassFish - any of them will do.

Spring is a good choice if you already know it well. I think following the Spring idiom will make it more likely that your app is layered and architecturally sound, but that's not the deciding factor when it comes to scalability.

Hibernate will make your development life easier, but the scalability of your database depends a great deal on the schema, indexes, etc. Hibernate isn't a guarantee.

"Scalable" is one of those catch-all terms (like "lightweight") that is easy to toss off but encompasses many considerations. I'm not sure that a simple choice of framework will solve the issue once and for all.

duffymo
+1  A: 

Frameworks are more geared towards speeding up development, not performance. There will be some overhead with any framework because of use cases it handles that you don't need. Granted, the overhead may be low, and most frameworks will point you towards patterns that have been proven to scale, but those patterns can be used without the framework as well.

So I would design your architecture assuming 'bare metal', i.e. pure servlets (yes, you could go even lower level, but I'm assuming you don't want to write your own HTTP socket layer), straight JDBC, etc. Then go back and figure out which frameworks best fit your architecture, speed up your development, and don't add too much overhead. Tomcat versus other containers, Hibernate versus other ORMs, Struts versus other web frameworks - none of that matters if you make the wrong decisions about the key performance bottlenecks.

However, a better approach might be to choose a framework that optimizes for development time and then find the bottlenecks and address those as they occur. Otherwise, you could spin your wheels optimizing prematurely for cases that never occur. But that probably falls in the category of 'worry about it when you get traffic'.

Brian Deterling
+2  A: 

The two keywords I would mainly focus on are Asynchronous and Stateless. Or at least "as stateless as possible". Of course you need state, but maybe, instead of going for a full-fledged RDBMS, have a look at document-centered datastores.

Have a look at Akka concerning async and CouchDB or MongoDB as datastores...
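A small sketch of the two ideas together, using the JDK's `CompletableFuture` rather than Akka. The class `AsyncGreeter` and the fake lookup are assumptions made up for the example; a real handler would call a datastore or remote service there.

```java
import java.util.concurrent.CompletableFuture;

class AsyncGreeter {
    // Stateless: the result depends only on the arguments, so any node
    // in a cluster can serve any request.
    static String render(String user, int unread) {
        return "Hello " + user + ", you have " + unread + " unread messages";
    }

    // Hypothetical slow lookup; the name length stands in for a remote call.
    static CompletableFuture<Integer> fetchUnreadCount(String user) {
        return CompletableFuture.supplyAsync(() -> user.length());
    }

    // Asynchronous: the calling thread is not blocked while the lookup runs.
    static CompletableFuture<String> handle(String user) {
        return fetchUnreadCount(user).thenApply(unread -> render(user, unread));
    }
}
```

Calling `AsyncGreeter.handle("ann")` returns immediately with a future; the caller only blocks if it chooses to call `join()` on it.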

raoulsson
Great answer. Asynch and stateless are absolutely the key. To look at it from the other side, managing state is a total PITA and managing synchronous connections is too
Fortyrunner
+1  A: 

If you are able to work with a commercial system, then I'd suggest taking a look at Jazz Foundation at http://jazz.net. It's the base for IBM Rational's new products. The project is led by the guys that developed Eclipse within IBM before it was open-sourced. It has a pluggable DB layer and supports multiple app servers. It's designed to handle clustering and multi-site deployments. It has nice capabilities like OAuth support and license management.

James Branigan
Anything but a bloated product from IBM. Rational has been a failure since before IBM bought it.
duffymo
@duffymo Have you evaluated any of the Jazz based products, or are you making your statements based on the previous generation of products? If you haven't looked at the Jazz based stuff, you really should. It might just change your opinion.
James Branigan
+1  A: 

There is no framework that is magically going to make your web service scalable.

The key to scalability is replicating the functionality that is (or would otherwise be) a bottleneck. If you are serious about making your service scalable, you need to start with a good understanding of the characteristics of your application, and hence an idea of where the bottlenecks are likely to be:

  • Is it a read-only service or do user requests cause primary data to change?
  • Do you have / need sessions, or is the system RESTful?
  • Are the requests normal HTTP requests with HTML responses, or are you doing AJAX or callbacks or something?
  • Are user requests computation intensive, I/O intensive, rendering intensive?
  • How big/complicated is your backend database?
  • What are the availability requirements?

Then you need to decide how scalable you want it to be. Do you need to support hundreds, thousands, millions of simultaneous users? (Different degrees of scalability require different architectures, and different implementation approaches.)

Once you have figured these things out, then you decide whether there is an existing framework that can cope with the level of traffic that you need to support. If not, you need to design your own system architecture to be scalable in the problem areas.

Stephen C
A: 

As others have already replied, scalability isn't about what framework you use. Sure, it is nice to squeeze as much performance as possible out of each node, but what you ideally want is that by adding another node you scale your app in a linear fashion.

The application should be architected in distinct layers so that it is possible to add more power to individual layers without a rewrite, and also to add caching at different layers. Caching is key to achieving speed.
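As a very small illustration of caching inside one layer, here is a least-recently-used cache built on the JDK's `LinkedHashMap`. The class name `LruCache` is made up for the example, and this is not how Ehcache or a caching reverse proxy works internally; it only shows the basic idea of keeping hot entries and evicting cold ones.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of a bounded in-memory cache with LRU eviction.
// Not thread-safe on its own; wrap it with Collections.synchronizedMap
// if several threads share it.
class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    LruCache(int capacity) {
        // accessOrder = true: iteration order runs from least to most recently used.
        super(16, 0.75f, true);
        this.capacity = capacity;
    }

    // Called by LinkedHashMap after each insertion; returning true evicts the eldest entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;
    }
}
```

An in-process cache like this lives in each node separately, so it fits data that is cheap to recompute and tolerates being slightly stale.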

One example of layers for a big webapp:

  • Loadbalancers (TCP level)
  • Caching reverse proxies
  • CDN for static content
  • Front end webservers
  • Appservers (business logic of the app)
  • Persistent storage (RDBMS, key/value, document)
NA
A: 

In addition to the above:

Take a good look at JMS (Java Message Service). This is a much underrated technology. There are vendor solutions such as TIBCO EMS, Oracle etc. But there are also free stacks such as ActiveMQ.

JMS will allow you to build synch and asynch solutions using queues. You can choose to have persistent or non-persistent queues.
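To show the shape of the queue pattern without pulling in a broker, here is an in-process stand-in using the JDK's `BlockingQueue`. This is not the JMS API; the class `OrderQueueDemo` and its method names are made up, and a real JMS queue adds things this sketch lacks, such as delivery across processes and optional persistence.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical in-memory queue; messages are lost if the JVM stops,
// which loosely corresponds to a non-persistent queue.
class OrderQueueDemo {
    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();

    // Asynchronous style: the producer enqueues and returns immediately.
    void send(String message) {
        queue.offer(message);
    }

    // Synchronous style: the consumer waits up to the timeout for a message.
    // Returns null if nothing arrived in time or the thread was interrupted.
    String receive(long timeoutMillis) {
        try {
            return queue.poll(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
    }
}
```

The point of the pattern is that producer and consumer are decoupled: the web tier can accept requests at its own pace while workers drain the queue at theirs.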

Fortyrunner