Hi friends,

Does the presence of the Python GIL mean that running the same operation in multiple threads is not much different from repeating it in a single thread?

For example, if I need to upload two files, what is the advantage of uploading them in two threads instead of one after another?

I tried a big math operation both ways, but the two versions seem to take almost the same time to complete.

This is unclear to me. Can someone help me with this? Thanks.

+3  A: 

It depends on the native code module that's executing. Native modules can release the GIL and then go off and do their own thing, allowing another thread to acquire the GIL. The GIL is normally held while code, both Python and native, is operating on Python objects. If you want more detail you'll probably need to go and read quite a bit about it. :)

See: "What is a global interpreter lock (GIL)?" and "Thread State and the Global Interpreter Lock"
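
A minimal sketch of the "big math operation" case from the question, assuming a standard (GIL-enabled) CPython build. The names busy_sum, run_in_threads and run_in_sequence are made up for illustration. Because the loop is pure Python and holds the GIL while it runs, the threaded version would be expected to take roughly as long as the sequential one; exact timings depend on the machine and interpreter, so none are claimed here.

```python
import threading
import time


def busy_sum(n, results, index):
    """Pure-Python arithmetic loop; it operates on Python objects the whole time."""
    total = 0
    for i in range(n):
        total += i * i
    results[index] = total


def run_in_threads(n, count):
    """Run `count` copies of busy_sum in separate threads and time them."""
    results = [None] * count
    threads = [
        threading.Thread(target=busy_sum, args=(n, results, i))
        for i in range(count)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, time.perf_counter() - start


def run_in_sequence(n, count):
    """Run the same work one call after another in the current thread."""
    results = [None] * count
    start = time.perf_counter()
    for i in range(count):
        busy_sum(n, results, i)
    return results, time.perf_counter() - start


if __name__ == "__main__":
    _, threaded = run_in_threads(2_000_000, 2)
    _, sequential = run_in_sequence(2_000_000, 2)
    print(f"threads: {threaded:.2f}s  sequential: {sequential:.2f}s")
```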

Qberticus
+1  A: 

It really depends on the library you're using. The GIL is meant to prevent Python objects and the interpreter's internal data structures from being changed by several threads at the same time. If you're doing an upload, the library you use to do the actual upload might release the GIL while it's waiting for the actual HTTP request to complete (I would assume that is the case with the HTTP modules in the standard library, but I didn't check).

As a side note, if you really want to have things running in parallel, just use multiple processes. It will save you a lot of trouble and you'll end up with better code (more robust, more scalable, and most probably better structured).
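
A rough sketch of the multiple-processes approach, using the standard library's multiprocessing.Pool. The prime-counting function and the name count_primes_parallel are stand-ins invented for this example, not anything from the question. Each worker is a separate interpreter process with its own GIL, so CPU-bound work can run on several cores at once.

```python
import math
from multiprocessing import Pool


def count_primes(limit):
    """Count primes below `limit` by trial division (deliberately CPU-heavy)."""
    count = 0
    for n in range(2, limit):
        # n is prime if no divisor in [2, sqrt(n)] divides it evenly
        if all(n % d for d in range(2, math.isqrt(n) + 1)):
            count += 1
    return count


def count_primes_parallel(limits, workers=2):
    """Run count_primes on each limit in its own worker process."""
    with Pool(processes=workers) as pool:
        return pool.map(count_primes, limits)


if __name__ == "__main__":
    # The guard matters: on platforms that spawn new interpreters,
    # child processes re-import this module.
    print(count_primes_parallel([50_000, 50_000]))
```

The arguments and return values are pickled to cross the process boundary, so this tends to pay off when each task does a lot of computation relative to the data it sends back and forth.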

ionut bizau
+1 for that last paragraph -- exactly!
Kevin Little
Should also mention that multiprocessing is the gateway to scaling across multiple machines anyway. Threads cannot get you there.
cjrh
+5  A: 

Python's threads get a slightly worse rap than they deserve. There are three (well, 2.5) cases where they actually get you benefits:

  • If non-Python code (e.g. a C library, the kernel, etc.) is running, other Python threads can continue executing. It's only pure Python code that can't run in two threads at once. So if you're doing disk or network I/O, threads can indeed buy you something, as most of the time is spent outside of Python itself.

  • The GIL is not actually part of Python; it's an implementation detail of CPython (the "reference" implementation that the core Python devs work on, and that you usually get if you just run "python" on your Linux box or something).

    Jython and IronPython, for example, do not have a GIL, so multiple pure-Python threads can execute simultaneously there. Other reimplementations vary (PyPy, for one, does have a GIL).

  • The 0.5 case: Even if you're entirely pure-Python and see little or no performance benefit from threading, some problems are really convenient in terms of developer time and difficulty to solve with threads. This depends in part on the developer, too, of course.
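
To illustrate the first case with the two-file upload from the question, here is a small sketch. It assumes that time.sleep is a fair stand-in for waiting on the network (like blocking I/O, it releases the GIL while it waits); fake_upload and the two timing helpers are hypothetical names, and no real upload happens.

```python
import threading
import time


def fake_upload(name, seconds, finished):
    """Stand-in for a network upload: sleeping releases the GIL, as blocking I/O does."""
    time.sleep(seconds)
    finished.append(name)


def timed_sequential(names, seconds):
    """Upload one file after another; total time is about len(names) * seconds."""
    finished = []
    start = time.perf_counter()
    for name in names:
        fake_upload(name, seconds, finished)
    return time.perf_counter() - start, finished


def timed_threaded(names, seconds):
    """Upload each file in its own thread; the waits overlap."""
    finished = []
    threads = [
        threading.Thread(target=fake_upload, args=(name, seconds, finished))
        for name in names
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start, finished


if __name__ == "__main__":
    seq_time, _ = timed_sequential(["a.txt", "b.txt"], 0.5)
    thr_time, _ = timed_threaded(["a.txt", "b.txt"], 0.5)
    print(f"sequential: {seq_time:.2f}s  threaded: {thr_time:.2f}s")
```

With real uploads the gain depends on where the bottleneck is: if a single upload already saturates the link, overlapping two of them will not make the pair finish sooner.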

Nicholas Knight
Sure, the GIL is only a problem in CPython. Though I can't really imagine somebody moving to Jython or IronPython just because of that when there are better ways to get around it.
ionut bizau
There's also Unladen Swallow, which is Google's project to build a fast replacement for CPython.
Adam Luchjenbroers