views:

465

answers:

6

I've seen this argument in a few places, and recently I saw it again in a Reddit post. This is by no means a flame against either language. I am just puzzled by Python's bad reputation for not being scalable.
I'm a Python guy who is getting started with Java, and I just want to understand what makes Java so scalable and whether the Python setup I have in mind is a good way to scale large Python apps.

Now back to my idea of scaling a Python app. Let's say you code it using Django. Django can run its apps in FastCGI mode. So suppose you have a front Nginx server and, behind it, as many other servers as needed, each running your Django app in FastCGI mode. The front Nginx server then load balances across the backend Django FastCGI servers. Django also supports multiple databases, so you could write to one master DB and read from many slaves, again for load balancing. Throw a memcached server into the mix and there you go, you have scalability. Don't you?

Is this a viable setup? What does Java do better? How do you scale a Java app?
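
To make the "write to one master, read from many slaves" part concrete, here is a minimal sketch of a Django database router. The class name and the aliases `default`, `replica1` and `replica2` are assumptions for illustration; they would have to match whatever keys exist in `settings.DATABASES`, and the router would be enabled by listing its dotted path in the `DATABASE_ROUTERS` setting. It also assumes replication between the databases is already set up outside Django.

```python
import random


class PrimaryReplicaRouter:
    """Send writes to the master and spread reads across the slaves."""

    # Hypothetical aliases; they must match keys in settings.DATABASES.
    primary = "default"
    replicas = ("replica1", "replica2")

    def db_for_read(self, model, **hints):
        # Pick a read replica at random as a crude form of load balancing.
        return random.choice(self.replicas)

    def db_for_write(self, model, **hints):
        # Every write goes to the single master database.
        return self.primary

    def allow_relation(self, obj1, obj2, **hints):
        # All aliases hold the same data, so relations between objects are fine.
        return True

    def allow_migrate(self, db, app_label, model_name=None, **hints):
        # Only migrate the master; replicas get schema changes via replication.
        return db == self.primary
```

Note that replication lag means a read issued right after a write may not see that write, so a real setup would probably pin some reads to the master.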

+3  A: 

Without getting into a flamewar, consider how Python handles multi-threaded apps compared to Java. For example, what global locks are in place in each language that hurt concurrency? (Hint: Python's GIL, the Global Interpreter Lock.)

Michael Goldshteyn
+15  A: 

Scalability is a very overloaded term these days. The comments probably refer to in-process vertical scalability.

Python has a global interpreter lock (GIL) that severely limits its ability to scale up to many threads. Native code can release the lock while it runs (reacquiring it before returning to the interpreter), but this still requires careful design when trying to write scalable software in Python.
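
As a rough illustration of the point about the lock being released, the sketch below runs several blocking calls in threads. It assumes CPython and uses `time.sleep` as a stand-in for a blocking socket or disk call, which releases the GIL while waiting; the function names are made up for this example. Because the waits overlap, the total time is close to one delay rather than the sum of all of them. A CPU-bound pure-Python function in place of `fake_io` would not show the same speedup.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io(delay):
    # Stand-in for a blocking I/O call; the GIL is released while sleeping.
    time.sleep(delay)
    return delay


def run_overlapped(delays):
    # Run one thread per delay and measure the wall-clock time for all of them.
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=len(delays)) as pool:
        results = list(pool.map(fake_io, delays))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    results, elapsed = run_overlapped([0.2] * 4)
    print(f"4 blocking calls of 0.2s finished in {elapsed:.2f}s")
```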

Marcelo Cantos
That said, some implementations like Stackless are better in the threading aspect -- much better. For example, Stackless Python is employed by the MMORPG Eve Online.
Tim Čas
But it should be noted that Stackless Python does not remove the GIL. It may make concurrent programming easier but it does not enable parallel execution. PyPy and Unladen Swallow both have removal of the GIL as one of their goals, but neither (as I recall) is there yet. IronPython and Jython are the only serious, currently GIL-less contenders as far as I'm aware.
James Cunningham
@Tim: Stackless is single-threaded. It simulates threads to allow highly concurrent behavior, as long as everything is I/O-bound. But if you run a CPU-bound workload through Stackless on an 8-core system, you won't see more than about 12% utilization.
Marcelo Cantos
Yes, I thought it was - nevertheless, I think it adds plenty to the scalability. Just my opinion, you don't have to agree with it.
Tim Čas
+9  A: 

While I don't agree with the statement, I suppose they think Java is more scalable because it runs a lot faster. The JVM is very efficient (except perhaps in memory usage). Also Python's GIL (Global Interpreter Lock) doesn't allow "real" threading, while Java doesn't have a GIL and has true multithreading.

alpha123
What do you mean by "the JVM"? It's just a spec that might be implemented in very different ways, leading to very different performance results. Not trying to be a dick, really just curious.
darren
HotSpot, which is what most people use.
alpha123
But Python runs on the JVM, and consequently obviously also on HotSpot, so if scalability is all about HotSpot, and both languages run on HotSpot, then why exactly is one "more scalable" than the other?
Jörg W Mittag
Most people use CPython and not Jython. Jython does indeed eliminate all the scaling problems of Python (GIL etc).
alpha123
A: 

Hmmm - scalable could mean many things - scalable by distributed architecture, scalable by speed?

On the scalable by speed front, Java generally can process instructions faster than Python - for the right kind of problem, much faster (I guess the main reason for that is that Java is compiled whereas Python is interpreted). From that point of view, Java can generally do more with less, and so is more scalable.

My experimental information comes from two sources: http://mrpointy.wordpress.com/2007/11/06/java-vs-python-performance/ and http://blog.dhananjaynene.com/2008/07/performance-comparison-c-java-python-ruby-jython-jruby-groovy/

Brabster
On the number crunching front, you are technically correct. Except that, any serious number crunching in Python is done using the [numpy](http://numpy.scipy.org/) library, which is C/Fortran based and typically performs better than Java libraries.
Muhammad Alkarouri
Sorry, when I say number crunching, I'm not specifically referring to linear algebra and the like. I'll rephrase.
Brabster
@Brabster 1) Python is not interpreted, it is compiled. That's what the downvote is about. 2) most number crunching reduces to linear algebra "and the like."
aaronasterling
@aaronasterling "Python is an interpreted language, as opposed to a compiled one" - from http://docs.python.org/glossary.html - if docs.python.org is wrong, I give up... I also already rephrased "number crunching" to "processing instructions", which is what I actually meant, for better accuracy.
Brabster
@Brabster, it's common knowledge that they're using a weird definition on that point. They view anything that isn't compiled to native code as interpreted. Java is interpreted by that view.
aaronasterling
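
For readers following the compiled-versus-interpreted exchange above, a small sketch that may help: in CPython, source code is compiled to bytecode, which the interpreter loop then executes. The function `add` is just an illustration, and the exact opcode names differ between CPython versions, so no particular listing is shown here.

```python
import dis


def add(a, b):
    return a + b


# The compiled form of the function body is stored as raw bytes.
bytecode = add.__code__.co_code

# dis decodes those bytes into named virtual machine instructions.
instructions = [ins.opname for ins in dis.get_instructions(add)]

if __name__ == "__main__":
    print(len(bytecode), "bytes of bytecode")
    print(instructions)
```

This is bytecode for a virtual machine, not native machine code, which is roughly where Java source also ends up before the JVM's JIT compiler gets involved.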
+3  A: 

I think this article sums up many of the arguments about scaling and dynamic languages:

http://blogs.tedneward.com/2008/01/24/Can+Dynamic+Languages+Scale.aspx

It's worth noting its two definitions for scaling...

  1. Size of project, as in lines-of-code (LOC)
  2. Capacity handling, as in "it needs to scale to 100,000 requests per second"

One often-used argument against dynamic languages scaling is that as the codebase grows it becomes harder to refactor without IDE support. Because type information is not available at compile time, this support is often impossible to implement for dynamic languages.

Pablojim
I'm calling bullshit on this. Automatic Refactoring IDEs were *invented* in dynamic languages and the refactoring support in dynamic language IDEs such as VisualWorks and co. is still way ahead of anything I have seen for statically typed languages.
Jörg W Mittag
No need to curse! - I'm a big fan of dynamic languages but it is a fact that many types of refactoring are a lot harder to implement in dynamic languages. http://beust.com/weblog/2006/10/01/dynamic-language-refactoring-ide-pick-one/
Pablojim
A: 

I get mad when I see arguments like this. Not because I get all butthurt about haters harshing my Python mellow, but because to my mind, saying "X doesn't scale" is meaningless. It is necessary to specify a dimension, at the very least.

People are reluctant to do this, as it often reveals the fact that they don't have a good handle on the problem that they're speaking with confidence about. The global interpreter lock is a good touchstone here: threads are not the only way to perform concurrent operations.
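
One way to read "threads are not the only way" is a single-threaded event loop; separate processes (for example via the `multiprocessing` module) are another. Below is a minimal sketch of the event-loop approach using the standard `asyncio` module. The names `handle_request`, `serve_all` and `timed_run` are invented for this example, and `asyncio.sleep` stands in for waiting on a network or database call.

```python
import asyncio
import time


async def handle_request(request_id, delay):
    # Stand-in for awaiting a network or database response.
    await asyncio.sleep(delay)
    return request_id


async def serve_all(count, delay):
    # Start all requests at once; gather returns results in submission order.
    return await asyncio.gather(
        *(handle_request(i, delay) for i in range(count))
    )


def timed_run(count=100, delay=0.1):
    # Run the event loop in the current thread and measure wall-clock time.
    start = time.perf_counter()
    results = asyncio.run(serve_all(count, delay))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    results, elapsed = timed_run()
    print(f"{len(results)} simulated requests in {elapsed:.2f}s on one thread")
```

All the waits overlap on one thread, so the GIL never becomes a point of contention; this helps I/O-bound work only, not CPU-bound work.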

Robert Rossney