I'm hoping someone can provide some insight as to what's fundamentally different about the Java Virtual Machine that allows it to implement threads nicely without the need for a Global Interpreter Lock (GIL), while Python necessitates such an evil.

+113  A: 

Python (the language) doesn't need a GIL, which is why it can be implemented perfectly well on the JVM (Jython) and .NET (IronPython), and those implementations multithread freely. CPython (the popular implementation) has always used a GIL for ease of coding (especially the coding of the garbage collection mechanisms) and of integration of non-thread-safe C-coded libraries (there used to be a ton of those around ;-).

The Unladen Swallow project, among other ambitious goals, does plan a GIL-free virtual machine for Python -- to quote that site, "In addition, we intend to remove the GIL and fix the state of multithreading in Python. We believe this is possible through the implementation of a more sophisticated GC system, something like IBM's Recycler (Bacon et al, 2001)."

Alex Martelli
That makes sense ... thanks Alex.
AgentLiquid
Alex, what about the old attempts to remove the GIL, wasn't there a ton of overhead with that (a factor of 2 is what I recall)?
Bartosz Radaczyński
Yes, Bartosz, Greg Stein did measure that in 1999. Garbage collection by reference counting was the killer, forcing the huge overhead of fine-grained locking. That's why a more advanced GC is crucial there.
Alex Martelli
The Unladen Swallow team has given up on removing the GIL: http://code.google.com/p/unladen-swallow/wiki/ProjectPlan#Global_Interpreter_Lock
Seun Osewa
@Alex - excellent!
orokusaki
+12  A: 

The JVM (at least HotSpot) does have a similar concept to the "GIL"; it's just much finer in its lock granularity. Most of this comes from the GCs in HotSpot, which are more advanced.

In CPython it's one big lock (probably not strictly true, but good enough for argument's sake); in the JVM the locking is more spread out, with different concepts depending on where it is used.
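
To make the granularity difference concrete, here is a minimal sketch in Python. It is only an illustration of coarse versus fine-grained locking in general, not of how CPython or HotSpot are implemented; the class names `CoarseCounters` and `FineCounters` and the helper `run_workers` are made up for this example.

```python
import threading
from contextlib import ExitStack


class CoarseCounters:
    """Hypothetical example: every counter is guarded by one shared lock."""

    def __init__(self, names):
        self._lock = threading.Lock()
        self._values = {name: 0 for name in names}

    def increment(self, name):
        with self._lock:  # all updates serialize here, related or not
            self._values[name] += 1

    def total(self):
        with self._lock:
            return sum(self._values.values())


class FineCounters:
    """Hypothetical example: each counter has its own lock."""

    def __init__(self, names):
        self._locks = {name: threading.Lock() for name in names}
        self._values = {name: 0 for name in names}

    def increment(self, name):
        with self._locks[name]:  # only updates to the same counter contend
            self._values[name] += 1

    def total(self):
        # Acquire every lock in a fixed (sorted) order to avoid deadlock.
        with ExitStack() as stack:
            for name in sorted(self._locks):
                stack.enter_context(self._locks[name])
            return sum(self._values.values())


def run_workers(counters, names, per_thread=1000):
    """Start two threads per counter name and return the final total."""

    def work(name):
        for _ in range(per_thread):
            counters.increment(name)

    threads = [threading.Thread(target=work, args=(n,))
               for n in names for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counters.total()
```

Both variants give the same answer. The fine-grained one lets unrelated updates proceed independently, at the cost of more locks to manage and a lock-ordering rule to follow, which is roughly the trade-off described above.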

Take a look at, for example, vm/runtime/safepoint.hpp in the HotSpot code, which is effectively a barrier. Once at a safepoint, the entire VM has stopped with regard to Java code, much like the Python VM stops at the GIL.

In the Java world such VM-pausing events are known as "stop-the-world". At these points only native code that is bound to certain criteria is free-running; the rest of the VM has been stopped.

Also, the lack of a coarse lock in Java makes JNI much more difficult to write, as the JVM makes fewer guarantees about its environment for FFI calls. This is one of the things that CPython makes fairly easy (although not as easy as using ctypes).

Greg Bowyer
+2  A: 

There is a comment further down in this blog post, http://www.grouplens.org/node/244, that hints at why it was so easy to dispense with a GIL for IronPython or Jython: CPython uses reference counting, whereas the other two VMs have garbage collectors.

I don't get the exact mechanics of why this is so, but it does sound like a plausible reason.
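
A small sketch may help with the mechanics, assuming CPython, where `sys.getrefcount` reports an object's reference count (the helper name `refcount_deltas` is made up for illustration). It shows that simply binding another name or storing the object in a container modifies the count stored on the object itself. Without a global lock, each of those tiny updates would have to be made thread-safe individually.

```python
import sys


def refcount_deltas():
    """Return how the reference count of one object changes, relative to
    its starting value, as references are added and then dropped."""
    obj = object()
    base = sys.getrefcount(obj)

    alias = obj                      # a second name for the same object
    after_alias = sys.getrefcount(obj)

    holder = [obj]                   # a container now refers to it as well
    after_list = sys.getrefcount(obj)

    del alias                        # drop the extra name
    holder.clear()                   # drop the container's reference
    after_release = sys.getrefcount(obj)

    return (after_alias - base, after_list - base, after_release - base)


if __name__ == "__main__":
    print(refcount_deltas())
```

The absolute numbers from `sys.getrefcount` vary between CPython versions, which is why the sketch reports only differences. A tracing garbage collector, as used by the JVM and .NET, does not need to touch a per-object counter on every assignment like this.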

When you're promiscuously sharing objects between threads, working out when nobody has a reference to a particular object any more is moderately awkward. Reference counting with a global lock is one (expensive) way. A different way to solve it would have been to only let one thread at a time hold references to the object, which would make most activity be thread-local at a cost of making inter-thread communications more awkward. Personally, I think it's telling that HPC uses message-passing between processors and not shared memory, and that it does so for scalability reasons...
Donal Fellows
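
The message-passing style mentioned in the comment above can be sketched in Python with `queue.Queue` from the standard library. This is a hedged illustration only: the function name `sum_of_squares` and the worker count are assumptions for the example, and the workers communicate solely through the two queues instead of mutating shared state.

```python
import queue
import threading


def sum_of_squares(numbers, workers=4):
    """Square each number in worker threads, exchanging data via queues."""
    numbers = list(numbers)
    tasks = queue.Queue()
    results = queue.Queue()

    def worker():
        while True:
            item = tasks.get()
            if item is None:         # sentinel: no more work for this worker
                break
            results.put(item * item)

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for n in numbers:
        tasks.put(n)
    for _ in threads:
        tasks.put(None)              # one sentinel per worker
    for t in threads:
        t.join()

    # Every task produced exactly one result, so collect that many.
    return sum(results.get() for _ in numbers)


if __name__ == "__main__":
    print(sum_of_squares(range(10)))
```

Each value is handed from one thread to another through a queue, so at any moment only one thread is working with it. That makes most activity thread-local, at the cost of explicit communication.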