Hi

I am writing a Facebook application that will use a Postgres DB along with the Facebook APIs and run on Amazon EC2 (and I am hoping for heavy loads).

With Java, I know that the DB would be my primary bottleneck and the concurrency limitations of Tomcat would be the secondary bottleneck. I could alleviate the DB issues with caching and the concurrency issues with horizontal scaling (but this would add to my EC2 costs).

How would Erlang or Haskell help in this situation (assuming I am able to master the learning curve)?

A: 

There is no way that choosing a different language is going to significantly speed up an application over a compiled language such as Java when the bottlenecks are already defined as not being code related. Most functional languages have more overhead than Java and so there has to be a compelling reason to switch to one if you are already familiar with Java.

Payton Byrd
Java is also run in a VM (so while it's compiled, it's not native), so I don't see how you can argue that functional languages inherently have more overhead. Also, Erlang is specifically designed to handle multiple 'threads' (it began as a language to develop software for switches), and so it is entirely possible that the inherent multi-threading capabilities of Erlang could increase performance, depending on the specific application. While profiling would certainly be a first step towards optimizing, saying that Erlang flat-out can't help seems premature.
kyoryu
Zak
@kyoryu Modern versions of the JVM do a damn fine job of doing a JIT compile to native code for the machine it is running on. Benchmarks show that newer JVMs are much, much faster than their predecessors and compare favorably to C++ for many use cases. The same holds true for the .NET Framework. Those of you marking down my answer should probably do a bit of research.
Payton Byrd
@Payton Byrd: I never said Java was slow. I said it was not compiled to native code. In fact, I regularly defend various VM languages (Java, C#) on a perf basis against native code for those various reasons. But claiming that functional languages are slower due to 'higher overhead' while ignoring the advantages that they may have (concurrency for Erlang, lazy eval for Haskell) is making the *exact same error* as those arguing that Java and C# must always be slower than C++. Erlang, in fact, abandoned attempts at compiling to native code because it was *no faster* than their VM...
kyoryu
... The markdown was not due to any erroneous belief that Java must be slow, but rather to your assertion that Erlang/Haskell must inherently be slower, apparently without doing research. While it may be true that there'd be little gain in this case, a flat assertion that Java will outperform any functional language seems rather incorrect. In the same way, a flat-out assertion that Erlang or Haskell would be faster would be *equally incorrect*. Use the right tool for the right job, and there's little evidence given that one tool is better than another.
kyoryu
Jim Ferrans
+9  A: 

Two semi-answers:

Do you have users yet? No? Then use whatever will help you get the project off the ground quicker. You can always rewrite things later if you have to. "Too many users" is a problem that most people would like to have, but don't. If you have real reason to expect a large user base quickly (e.g., you run a popular blog and expect many of your readers to join immediately) it's justified to worry about this, otherwise you're borrowing trouble.

Are you sure you know where the bottlenecks will be? Scaling out like that raises concerns very different from performance in a smaller application. Make sure you really know what's broken before you start preemptively fixing things. The architecture of your application will probably be more important than what you build it with, anyway.

That said, either Erlang or Haskell would work if you want to do it that way, but probably won't make a huge difference for what you're asking. There's plenty of other reasons to prefer them to Java, though...

camccann
Stackoverflow's failure to recognize that this is the superior answer raises doubts about our robot overlords.
Will
+4  A: 

I'd take a look at http://www.highscalability.com and read the case studies of how others scaled their applications to larger and larger loads. In particular, search there for Brad Fitzpatrick's description of how he scaled LiveJournal and Danga Interactive (e.g., this 2005 presentation).

Your intuition about the database being the first bottleneck and then the web server is probably correct, but of course you need to measure.
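As a rough illustration of what "measure" could look like inside a Java servlet, here is a minimal sketch that accumulates wall-clock time per stage of a request. The class name `StageTimer` and the stage labels are made up for this example; a real profiler or your container's metrics would give a fuller picture.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.Supplier;

// Hypothetical helper (not from any library): sums elapsed time per named stage
// so you can compare, say, "db" against "render" before deciding what to fix.
final class StageTimer {
    private final Map<String, LongAdder> nanosByStage = new ConcurrentHashMap<>();

    // Runs the work, records how long it took under the given stage name,
    // and returns the work's result unchanged.
    <T> T time(String stage, Supplier<T> work) {
        long start = System.nanoTime();
        try {
            return work.get();
        } finally {
            nanosByStage.computeIfAbsent(stage, s -> new LongAdder())
                        .add(System.nanoTime() - start);
        }
    }

    // Total nanoseconds recorded for a stage, or 0 if it was never timed.
    long totalNanos(String stage) {
        LongAdder total = nanosByStage.get(stage);
        return total == null ? 0L : total.sum();
    }
}
```

Wrapping the database call and the response rendering in separate `time(...)` calls, then comparing the totals under realistic load, is one cheap way to check the intuition before spending money on more instances.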

The major ways to scale your site will involve clustering and caching and database sharding and so on. The choice of programming language is secondary, and generally affects the raw performance on each box. See Henderson's Building Scalable Web Sites and Schlossnagle's Scalable Internet Architectures for other ideas and background in this area.
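To make the caching and sharding ideas a little more concrete, here is a minimal Java sketch. Everything in it is an assumption for illustration: the class names `ShardRouter` and `ReadThroughCache`, the idea of sharding by user ID, and the simple modulo scheme. Production setups usually add cache eviction and expiry, and a sharding scheme that tolerates adding shards later.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch: pick one of N database shards from a user ID.
final class ShardRouter {
    private final List<String> shardUrls;

    ShardRouter(List<String> shardUrls) {
        if (shardUrls.isEmpty()) {
            throw new IllegalArgumentException("need at least one shard");
        }
        this.shardUrls = List.copyOf(shardUrls);
    }

    // floorMod keeps the index non-negative even when the hash is negative.
    int shardIndexFor(long userId) {
        return Math.floorMod(Long.hashCode(userId), shardUrls.size());
    }

    String shardUrlFor(long userId) {
        return shardUrls.get(shardIndexFor(userId));
    }
}

// Hypothetical sketch: an in-memory read-through cache in front of a slow lookup.
// It is unbounded and never expires entries, which a real cache would need to handle.
final class ReadThroughCache<K, V> {
    private final Map<K, V> entries = new ConcurrentHashMap<>();
    private final Function<K, V> loader;

    ReadThroughCache(Function<K, V> loader) {
        this.loader = loader;
    }

    // Calls the loader only on a miss; later reads of the same key hit memory.
    V get(K key) {
        return entries.computeIfAbsent(key, loader);
    }

    // Call after a write so the next read reloads fresh data.
    void invalidate(K key) {
        entries.remove(key);
    }
}
```

In this sketch the loader passed to `ReadThroughCache` would be the function that queries the shard chosen by `ShardRouter`, so repeated reads for the same user skip the database entirely until the entry is invalidated.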

Having said that, a functional language may help to improve your overall scalability. Twitter used Scala to improve back end performance. Scala is a JVM language that combines object-oriented and functional styles, supports the Actors concurrency model, and runs at nearly the speed of Java (Martin Odersky, the creator of Scala, also wrote the current Sun Java compiler). So if you should run into a concurrency bottleneck you might want to sprinkle a bit of Scala in with your Java.

Jim Ferrans
I appreciate the suggestion about Scala; I was concerned about its stability and had eliminated it from the short list. But Java + Scala seems like a winning combination. I have been building the front-end GUI using GWT (JavaScript), using JSON to communicate with the back end. I chose GWT + JSON specifically to eliminate any hard dependency on a specific backend technology. I am thinking that porting my Java JSON servlet over to Scala would be easier than writing it in Haskell.
@user193116: Scala seems pretty stable bug-wise, but it's still not quite finished evolving: the new Scala 2.8 collections implementation has a few small incompatibilities, and upgrading to 2.8 requires recompilation (http://www.scala-lang.org/node/2060).
Jim Ferrans