Aside from the ease of use of the multiprocessing module when it comes to hooking up processes with communication resources, are there any other differences between spawning multiple processes using multiprocessing and using subprocess to launch separate Python VMs?

+2  A: 

If you ignore any communication issues (i.e., if the separate Python VMs do not communicate among themselves, or communicate only through other mechanisms that are explicitly established), there are no other substantial differences. (I believe multiprocessing, under certain conditions (Unix-like platforms, in particular), can use the more efficient fork rather than the fork-exec pair always implied by subprocess, but that's not "substantial" when just a few processes are involved; in other words, the performance difference on startup will not be material to the performance of the whole system.)

Alex Martelli
Thanks again Alex!
jldupont
+4  A: 

On POSIX platforms, multiprocessing primitives essentially wrap an os.fork(). This means that at the point you spawn a process with multiprocessing, the code already imported and initialized remains so in the child process.

This can be a boon if you have a lot of things to initialize and then each subprocess essentially performs operations on (copies of) those initialized objects, but not all that helpful if the thing you run in the subprocess is completely unrelated.
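To illustrate the point about already-initialized state, here is a minimal sketch, assuming a Unix-like platform where the "fork" start method is available. The names CACHE, child and run_forked are made up for the example; they are not part of multiprocessing.

```python
import multiprocessing as mp

# Hypothetical module-level state, filled in by the parent before forking.
CACHE = {}


def child(queue):
    # Under fork the child starts with a copy of the parent's memory,
    # so CACHE is already populated when this function runs.
    queue.put(CACHE.get("answer"))


def run_forked():
    # Explicitly request the fork start method (Unix-like platforms only).
    ctx = mp.get_context("fork")
    CACHE["answer"] = 42  # initialized in the parent only
    queue = ctx.Queue()
    proc = ctx.Process(target=child, args=(queue,))
    proc.start()
    value = queue.get()  # read the result before joining
    proc.join()
    return value


if __name__ == "__main__":
    print(run_forked())
```

The child never sets CACHE itself; it only sees the value because it began life as a copy of the parent.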

With multiprocessing on a Unix-like platform there are also implications for resources such as file handles, sockets, etc., since a forked child inherits the ones the parent has open.
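A rough sketch of what this means for file descriptors, again assuming the "fork" start method on a Unix-like platform. The helper names writer and run_inherited_pipe are invented for the example. The pipe is opened in the parent before the child exists, and the child can still write to it by number.

```python
import multiprocessing as mp
import os


def writer(write_fd):
    # The descriptor number is valid here because fork duplicated
    # the parent's table of open descriptors into the child.
    os.write(write_fd, b"hello from child")
    os.close(write_fd)


def run_inherited_pipe():
    ctx = mp.get_context("fork")  # Unix-like platforms only
    read_fd, write_fd = os.pipe()  # opened in the parent, before the fork
    proc = ctx.Process(target=writer, args=(write_fd,))
    proc.start()
    os.close(write_fd)  # the parent no longer needs its copy of the write end
    data = os.read(read_fd, 1024)
    proc.join()
    os.close(read_fd)
    return data


if __name__ == "__main__":
    print(run_inherited_pipe())
```

The flip side is that both processes now hold copies of every descriptor that was open at fork time, so each side has to close the ends it does not use.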

Meanwhile, when using subprocess, you are creating an entirely new program/interpreter each time you Popen a new process. This means there can be less shared memory between them, but it also means you can Popen into a completely separate program, or a new entry-point into the same program.
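For contrast, a small sketch of the subprocess side. MARKER and run_fresh_interpreter are hypothetical names for the example. The child is a brand new interpreter started from sys.executable, so nothing defined in the parent exists there.

```python
import subprocess
import sys

# Hypothetical state that exists only in the parent interpreter.
MARKER = "set in parent"


def run_fresh_interpreter():
    # The child runs this snippet in a new interpreter with a fresh
    # __main__ namespace, so it has no knowledge of MARKER.
    code = "print('MARKER' in globals())"
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    print(run_fresh_interpreter())
```

Replacing the argument list with any other command line is what lets Popen start a completely separate program instead of another Python interpreter.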

On Windows, the differences between multiprocessing and subprocess are smaller, because Windows does not provide fork().

Crast
good points... thanks!
jldupont