Hi,

I'm considering learning Python, but I have one concern before I invest more time. I'll phrase it as a statement followed by a concern, so others can comment in case the assumptions in the statement are invalid:

I have read about the GIL, and the consensus seems to be that if you need concurrency in Python, your best bet is to fork a new process to avoid the GIL.

My concern is that if I split a problem into N*2 pieces across N processors (for example, a single server running a *nix OS with 8 cores), I will incur context-switching penalties between processes rather than between threads. Process switches are more costly, which will limit performance.

I ask because other languages claim to excel in this scenario, and I wonder whether Python is appropriate for this arena.

+1  A: 

In my limited experience, the "context switch cost" is overrated as a performance limitation.

I/O bandwidth and memory are the most common limiting factors. Python's I/O is comparable to many other languages, since it simply uses the standard C libraries pretty directly.

Your actual problem may not be typical. However, many problems work out really well in multi-processing mode because they're actually I/O bound. Often it's filesystem access, web page reads or database operations that limit performance long before context switches do.

S.Lott
You specifically mention using *nix, in which case you should be fine. The context switch cost is minimal; apparently it's a bigger issue on Windows. I routinely run systems with thousands of processes (not all Python, of course, but that shouldn't matter (TM)) without any problems. I should qualify that: these are headless Linux systems configured as servers, with no X11, Gnome, KDE, etc.
CyberED
+5  A: 

multiprocessing can get around the GIL, but it introduces its own issues such as communication between the processes.
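As a rough illustration of the N*2-pieces-across-N-cores idea from the question, here is a minimal sketch using the standard library's multiprocessing.Pool. The function names (partial_sum, split_range, parallel_sum_of_squares) and the sum-of-squares workload are made up for the example, and the "fork" start method is an assumption that matches the *nix setup described in the question. Note that each chunk and each result is pickled and sent between processes, which is the communication cost mentioned above.

```python
import multiprocessing as mp
import os


def partial_sum(bounds):
    """CPU-bound work on one piece: sum of squares over [start, stop)."""
    start, stop = bounds
    return sum(i * i for i in range(start, stop))


def split_range(total, pieces):
    """Split range(total) into `pieces` contiguous (start, stop) chunks."""
    step, extra = divmod(total, pieces)
    chunks, start = [], 0
    for k in range(pieces):
        # The first `extra` chunks get one additional element each.
        stop = start + step + (1 if k < extra else 0)
        chunks.append((start, stop))
        start = stop
    return chunks


def parallel_sum_of_squares(total, workers=None):
    """Split the work into workers * 2 pieces and sum them in a process pool."""
    workers = workers or os.cpu_count() or 1
    chunks = split_range(total, workers * 2)
    # Assumption: a *nix system, so the "fork" start method is available.
    ctx = mp.get_context("fork")
    with ctx.Pool(processes=workers) as pool:
        # Each chunk is pickled to a worker; each result is pickled back.
        return sum(pool.map(partial_sum, chunks))


if __name__ == "__main__":
    print(parallel_sum_of_squares(1_000_000))
```

Keeping the per-piece arguments and results small (here, two integers in and one integer out) keeps the inter-process traffic cheap relative to the computation.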

Ignacio Vazquez-Abrams
+1  A: 

If you're considering learning Python for addressing this problem, I might suggest taking a look at Erlang instead. It has excellent support for very lightweight processes, and built-in primitives for IPC.

Not to discourage you from learning Python, of course, just suggesting there might be a better tool for this particular task.

TMN
+5  A: 

Python is not very good for CPU-bound concurrent programming. The GIL will (in many cases) make your program run as if it were running on a single core, or even worse. Even Unladen Swallow will (probably) not solve that problem (quote from their project plan: "we are no longer as optimistic about our chances of removing the GIL completely").

As you already stated, other languages claim to be better at concurrent programming. Haskell, for example, has built-in functionality for programming concurrent applications. You could also try C++ with OpenMP, which I think makes parallelization very simple.

If your application is I/O-bound, Python may be a serious solution as the GIL is normally released while doing blocking calls.
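To make the I/O-bound case concrete, here is a small sketch with ordinary threads. It uses time.sleep as a stand-in for a blocking call (sleep releases the GIL, much as a blocking read or socket call would); the names blocking_call and run_concurrently are invented for the example, and the delays are arbitrary.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_call(delay):
    """Stand-in for blocking I/O; time.sleep releases the GIL while waiting."""
    time.sleep(delay)
    return delay


def run_concurrently(delays):
    """Run one blocking call per thread; return the results and elapsed time."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=len(delays)) as pool:
        results = list(pool.map(blocking_call, delays))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    results, elapsed = run_concurrently([0.5] * 4)
    print(f"{len(results)} calls finished in {elapsed:.2f}s")
```

Because the threads wait at the same time, the total should be close to the longest single delay, not the sum of all delays. Replacing the sleep with a pure-Python computation would not overlap in the same way, since the threads would then compete for the GIL.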

AndiDog
OK, thanks AndiDog, interesting. Glenn Maynard, I note your answer too. I reckon I may learn this as a successor to Perl, as my org makes Python available on most servers and it seems expressive. However, I probably won't spend ages learning the finer points, and will spend that time on a functional language that scales better, e.g. Haskell or Erlang.
Ben
If you need raw CPU performance that badly, high-level scripting languages tend to be the wrong approach to begin with.
Glenn Maynard
+1  A: 

Also, if you are looking at sharing objects between Python processes, I suggest you look at the answer by Alex in this question.
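The linked answer is not reproduced here, so the following is not Alex's approach, just one possible sketch of sharing state between processes with the standard library. It uses a multiprocessing Value (a C int in shared memory) guarded by its built-in lock. The names add_many and count_in_processes are hypothetical, and the "fork" start method again assumes a *nix system.

```python
import multiprocessing as mp


def add_many(counter, times):
    """Increment the shared counter `times` times, holding its lock each time."""
    for _ in range(times):
        with counter.get_lock():
            counter.value += 1


def count_in_processes(workers, times):
    """Start `workers` processes that all update one shared integer."""
    # Assumption: a *nix system, so the "fork" start method is available.
    ctx = mp.get_context("fork")
    counter = ctx.Value("i", 0)  # shared C int, initial value 0
    procs = [
        ctx.Process(target=add_many, args=(counter, times))
        for _ in range(workers)
    ]
    for proc in procs:
        proc.start()
    for proc in procs:
        proc.join()
    return counter.value


if __name__ == "__main__":
    print(count_in_processes(4, 1000))
```

Without the lock, the read-modify-write in `counter.value += 1` could interleave across processes and lose updates. For richer objects such as lists or dicts, multiprocessing.Manager is the usual standard-library option, at the cost of proxying every access through a server process.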

shibin