views:

761

answers:

5

I have heard from various sources that J2EE is highly scalable, but it seems to me that you could never scale a J2EE application to the level of the Google search engine or any other large website.

I would like to hear the technical reasons why it is so scalable.

+11  A: 

J2EE is considered scalable because the EJB architecture, when run on an appropriate application server, includes facilities to cluster transparently and to use multiple instances of an EJB to serve requests.

If you managed things manually in plain old Java, you would have to work all of this out yourself, for example by opening ports, synchronizing state, etc.
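To give a feel for the kind of plumbing meant here, below is a minimal sketch in plain Java of just one piece: rotating requests across several instances of a stateless handler. `RoundRobinDispatcher` and every other name in it are made up for illustration and are not part of any J2EE API. It also leaves out the genuinely hard parts (networking, failover, state synchronization) that an application server would normally take care of.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical hand-rolled dispatcher; all names here are illustrative only.
final class RoundRobinDispatcher<I, O> {
    private final List<Function<I, O>> instances;
    private final AtomicInteger next = new AtomicInteger();

    RoundRobinDispatcher(List<Function<I, O>> instances) {
        if (instances.isEmpty()) {
            throw new IllegalArgumentException("need at least one instance");
        }
        this.instances = List.copyOf(instances);
    }

    // Sends the request to the next instance in rotation.
    // The atomic counter makes the rotation safe to use from many threads.
    O dispatch(I request) {
        int index = Math.floorMod(next.getAndIncrement(), instances.size());
        return instances.get(index).apply(request);
    }
}

final class RoundRobinDemo {
    public static void main(String[] args) {
        // Two in-process stand-ins for what would be separate server instances.
        Function<String, String> first = request -> "instance-0 handled " + request;
        Function<String, String> second = request -> "instance-1 handled " + request;
        RoundRobinDispatcher<String, String> dispatcher =
                new RoundRobinDispatcher<>(List.of(first, second));

        for (int i = 0; i < 4; i++) {
            System.out.println(dispatcher.dispatch("req-" + i));
        }
    }
}
```

Even this toy version only works because the handlers keep no state between calls; once they do, you are back to synchronizing state by hand.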

I am not sure you could define Google as a "large website". That would be like comparing the internet to your office LAN. J2EE was not meant to scale to the global level, which is why sites like Amazon and Google use their own technologies (e.g., MapReduce).

There are many papers discussing the efficiency of J2EE scalability. For example this

Uri
+2  A: 

One could look at a scalable architecture from the point of view of what the base framework (like J2EE) provides. But that's just the beginning.

Designing for a scalable infrastructure is an architectural art. It's like the art of projection ... how will it behave when it's blown up real big. The base questions are:

  • Where do I keep commonly accessed stuff so that, when many people ask for it, I don't have to fetch it so many times (cache)?
  • Where do I keep each individual's stuff so that, when many individuals need stuff kept, I won't have trouble managing it all?
  • How do I remember what a person did the last time they came here, since they may not come back to the same node they visited last time?
  • How long will I have to wait for (block on) a long-running procedure if many people are requesting it?

... that sort of thing is beyond what a framework can wrap. In other words, the framework can be scalable while the product built on it is wired too tightly to scale.
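As one concrete illustration of the first question (caching), here is a minimal sketch of a small least-recently-used cache sitting in front of a slow lookup. It uses only the standard `LinkedHashMap`; the `LruCache` class, its `loader` function and the capacity value are assumptions made up for this example, and a real system would also have to decide on expiry and on sharing the cache between nodes.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical in-memory cache; names and sizing are illustrative only.
final class LruCache<K, V> {
    private final Map<K, V> entries;
    private final Function<K, V> loader;
    private int loads = 0;

    LruCache(int capacity, Function<K, V> loader) {
        this.loader = loader;
        // accessOrder=true moves an entry to the end whenever it is read,
        // so the first entry is always the least recently used one.
        this.entries = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > capacity;
            }
        };
    }

    // Returns the cached value, calling the slow loader only on a miss.
    // A loader that returns null is simply called again on the next request.
    synchronized V get(K key) {
        V value = entries.get(key);
        if (value == null) {
            value = loader.apply(key);
            loads++;
            entries.put(key, value);
        }
        return value;
    }

    // Number of times the slow loader was actually invoked.
    synchronized int loadCount() {
        return loads;
    }
}

final class LruCacheDemo {
    public static void main(String[] args) {
        LruCache<String, String> cache = new LruCache<>(2, key -> key.toUpperCase());
        cache.get("a");
        cache.get("a"); // served from the cache, no second load
        System.out.println("loads so far: " + cache.loadCount());
    }
}
```

The point is not the dozen lines of cache code but the decisions around it (what to cache, where, and for how long), which no framework can make for you.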

J2EE, as a framework, is quite scalable, like most modern microprocessor-targeting enterprise frameworks. But I have seen amazing (not in a good way) stuff built out of even the best of them.

For a plethora of references, please search Google for "Designing for Scalability"

Pita.O
A: 

"Scalability" is about the question: what will you do when your application doesn't fit on a single computer anymore?

Scalable applications can grow across more than one computer.

Note that large servers can run VERY large applications with lots of memory and lots of CPUs (see http://www.sun.com/servers/highend/m9000/ or http://www-03.ibm.com/systems/i/hardware/595/index.html), but that is usually more expensive than having lots of small servers with the application spread over them.

Thorbjørn Ravn Andersen
Actually mainframes are cheaper than PC server clusters in most respects when doing a straight processing-over-cost calculation.
JUST MY correct OPINION
Depends. What scenario do you have in mind?
Thorbjørn Ravn Andersen
A: 

I have 12 servers in my bedroom running a web app. The servers are low-cost junk I bought for anywhere from $200 to $400 each. I use PHP (sessionless by nature), pgpool and Slony for DB replication, and Apache servers spread across multiple VMs. I can only afford software load balancing, but I do have a few Cisco routers that let me connect my website to DSL and cable. The whole setup cost me around $5,000 and it's fast as hell. I did performance testing with LoadRunner and was able to simulate ~18,000 concurrent users all doing stuff. When (if) I need to expand, I run to the local used-computer shop, add PCs, configure them, and I'm done. Oh, and the computers have no cases; they are all wired up as motherboards and hard disks in used mainframe cabinets. So you see, you really don't need deep pockets to build a scalable, fast, reliable website. Hardware is cheap. I'll be damned if I ever invest all my time learning the new JEE crap when I can just use PHP and a pile of cheap used hardware to compete with the big boys.

Johhny
Yeah, I'm sure the big boys are just quaking in their boots when they hear about your twelve-server scalability.
JUST MY correct OPINION
Fast, perhaps, but what is your uptime if power goes away?
Thorbjørn Ravn Andersen
+1  A: 

What makes Java EE scalable is what makes anything scalable: separation of concerns. As your processing or IO needs increase, you can add new hardware and redistribute the load semi-transparently (mostly transparent to the app, obviously less so to the configuration monkeys) because the separated, isolated concerns don't know or care if they're on the same physical hardware or on different processors in a cluster.

You can make scalable applications in any language or execution platform. (Yes, even COBOL on ancient System 370 mainframes.) What application frameworks like Java EE (and others, naturally -- Java EE is hardly unique in this regard!) give you is the ability to easily (relatively speaking) do this by doing much of the heavy lifting for you.

When my web app uses, say, an EJB to perform some business logic, that EJB may be on the same CPU core, on a different core in the same CPU, on a different CPU entirely or, in extreme cases, perhaps even across the planet. I don't know and, for the most part, provided the performance is there, I don't care. Similarly when I send a message out on the message bus to get handled, I don't know nor do I care where that message goes, which component does the processing and where that processing takes place, again as long as the performance falls within my needs. That's all for the configuration monkeys to work out. The technology permits this and the tools are in place to assess what pieces have to go where to get acceptable performance as the system scales up in size.
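A rough sketch of that "I don't know and I don't care" idea in plain Java: the caller depends only on an interface, and whether the implementation runs in place or somewhere else is decided by whoever wires things together. Everything here is hypothetical (`TaxCalculator`, the flat 20% rate, the class names), and the "elsewhere" version just hops to another thread as a stand-in for a remote call. It is not how an EJB container actually implements remoting.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical business interface; the caller only ever sees this.
interface TaxCalculator {
    long taxInCents(long amountInCents);
}

// Runs in the caller's own thread.
final class LocalTaxCalculator implements TaxCalculator {
    @Override
    public long taxInCents(long amountInCents) {
        return amountInCents * 20 / 100; // assumed flat 20% rate, for illustration
    }
}

// Stand-in for a remote stub: forwards the call to another thread and waits.
final class ElsewhereTaxCalculator implements TaxCalculator {
    private final TaxCalculator target;
    private final ExecutorService otherPlace;

    ElsewhereTaxCalculator(TaxCalculator target, ExecutorService otherPlace) {
        this.target = target;
        this.otherPlace = otherPlace;
    }

    @Override
    public long taxInCents(long amountInCents) {
        return CompletableFuture
                .supplyAsync(() -> target.taxInCents(amountInCents), otherPlace)
                .join();
    }
}

// Caller code: identical no matter where the calculator actually runs.
final class Checkout {
    private final TaxCalculator calculator;

    Checkout(TaxCalculator calculator) {
        this.calculator = calculator;
    }

    long totalInCents(long amountInCents) {
        return amountInCents + calculator.taxInCents(amountInCents);
    }
}

final class CheckoutDemo {
    public static void main(String[] args) {
        ExecutorService otherPlace = Executors.newSingleThreadExecutor();
        try {
            Checkout local = new Checkout(new LocalTaxCalculator());
            Checkout elsewhere = new Checkout(
                    new ElsewhereTaxCalculator(new LocalTaxCalculator(), otherPlace));
            System.out.println(local.totalInCents(1000));
            System.out.println(elsewhere.totalInCents(1000));
        } finally {
            otherPlace.shutdown(); // let the JVM exit
        }
    }
}
```

`Checkout` never changes between the two wirings; only the configuration does, which is the part I was leaving to the configuration monkeys.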

Now when I try to hand-roll all of this, I run into problems right away. If I don't think about all the proxying and scheduling and distribution and such in advance, then when my app expands beyond what a single machine can handle I face major rewrites as I shift some of the application to another box. And then each time my capacity needs grow I have to do this again and again.

If I do think about all of this in advance, I'm writing a whole lot of boilerplate code for each application that does minor variations of all the same things. I can code things in a scalable way, but do I want to do this every. damned. time. I write an app?

So what Java EE (and other frameworks) bring to the table is pre-written boilerplate for the common requirements of making scalable applications. Writing my apps to these doesn't guarantee they'd be scalable, of course, but the frameworks make writing said scalable apps a whole lot easier.

JUST MY correct OPINION