I've been reading about Python's multiprocessing module. I still don't think I have a very good understanding of what it can do.

Let's say I have a quadcore processor and I have a list with 1,000,000 integers and I want the sum of all the integers. I could simply do:

list_sum = sum(my_list)

But this only sends it to one core.

Is it possible, using the multiprocessing module, to divide the list up and have each core get the sum of its part and return the value so the total sum may be computed?

Something like:

core1_sum = sum(my_list[0:500000])          # goes to core 1
core2_sum = sum(my_list[500000:1000000])    # goes to core 2
all_core_sum = core1_sum + core2_sum        # core 3 does final computation

Any help would be appreciated.

+5  A: 

Welcome to the world of concurrent programming.

What Python can (and can't) do depends on two things.

  1. What the OS can (and can't) do. Most OSes allocate processes to cores. To use 4 cores, you need to break your problem into four processes. This is easier than it sounds. Sometimes.

  2. What the underlying C libraries can (and can't) do. If the C libraries expose features of the OS AND the OS exposes features of the hardware, you're solid.

Breaking a problem into multiple processes -- especially on GNU/Linux -- is easy. Break it into a multi-step pipeline.

In the case of summing a million numbers, think of the following shell pipeline, assuming some hypothetical sum.py program that sums either a range of numbers given as arguments or a list of numbers read from stdin.

( sum.py 0 500000 & sum.py 500000 1000000 ) | sum.py

This would have 3 concurrent processes: two are summing a lot of numbers each, and the third is summing two numbers.
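
The sum.py program here is hypothetical; the answer only names it. As a minimal sketch, assuming it sums the integers in a start/end range given as arguments and otherwise sums whitespace-separated numbers read from stdin, it could look like this:

#!/usr/bin/env python
# sum.py -- hypothetical helper assumed by the pipeline above: with two
# arguments, print the sum of the integers in [start, end); with no
# arguments, print the sum of whitespace-separated numbers read from stdin.
import sys

def main():
    if len(sys.argv) == 3:
        start, end = int(sys.argv[1]), int(sys.argv[2])
        print sum(xrange(start, end))
    else:
        print sum(int(tok) for tok in sys.stdin.read().split())

if __name__ == '__main__':
    main()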

Since the GNU/Linux shells and the OS already handle some parts of concurrency for you, you can write simple (very, very simple) programs that read from stdin, write to stdout, and do small parts of a large job.

You can try to reduce the overhead by using subprocess to build the pipeline instead of handing the job to the shell. You may find, however, that the shell builds pipelines very, very quickly. (It was written directly in C and makes direct OS API calls for you.)
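
As a rough sketch of that idea -- still assuming the hypothetical sum.py above sits in the current directory and is run with the python on your PATH -- the same pipeline could be wired up with subprocess like this:

# Sketch: build the ( sum.py A B & sum.py B C ) | sum.py pipeline with
# subprocess instead of the shell. sum.py is the hypothetical helper above.
import subprocess

def piped_sum():
    # Last stage: reads the two partial sums from its stdin.
    final = subprocess.Popen(['python', 'sum.py'],
                             stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    # First two stages: write their partial sums straight into the last
    # stage's stdin.
    p1 = subprocess.Popen(['python', 'sum.py', '0', '500000'],
                          stdout=final.stdin)
    p2 = subprocess.Popen(['python', 'sum.py', '500000', '1000000'],
                          stdout=final.stdin)
    p1.wait()
    p2.wait()
    # communicate() closes our copy of final.stdin, which delivers EOF to
    # the last stage, then collects its output.
    out, _ = final.communicate()
    print out.strip()

if __name__ == '__main__':
    piped_sum()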

S.Lott
I felt this answer showed a lot of ingenuity. No problem in CS can't be solved by simply adding a layer of indirection.
earino
@earino: OTOH, it did not answer the OP's question at all, which was specifically about "how do I use the multiprocessing module".
Martin v. Löwis
@Martin v. Löwis: True. IMO the larger issue (using all cores) is often more important than the question asked (using subprocess to use all cores). In some cases the question asked reflects a number of poor assumptions.
S.Lott
+7  A: 

Yes, it's possible to do this summation over several processes, very much like doing it with multiple threads:

from multiprocessing import Process, Queue

def do_sum(q, l):
    # Each worker puts the sum of its sublist on the shared queue.
    q.put(sum(l))

def main():
    my_list = range(1000000)

    q = Queue()

    # Split the list in half and give each half to its own process.
    p1 = Process(target=do_sum, args=(q, my_list[:500000]))
    p2 = Process(target=do_sum, args=(q, my_list[500000:]))
    p1.start()
    p2.start()
    # Collect the two partial sums and combine them.
    r1 = q.get()
    r2 = q.get()
    print r1 + r2

if __name__ == '__main__':
    main()

However, doing it with multiple processes is likely to be slower than doing it in a single process, as copying the data back and forth is more expensive than summing it right away.
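
As a minimal sketch of that trade-off, the copying can be avoided here by handing each worker only a start/end pair and letting it rebuild its own chunk -- which only works because the "data" is just the integers 0..999999:

# Variant sketch: pass only (start, end) indices so no large list is copied
# between processes. This relies on each worker being able to regenerate
# its own chunk, which holds here because the data is just range(1000000).
from multiprocessing import Process, Queue

def do_range_sum(q, start, end):
    q.put(sum(xrange(start, end)))

def main():
    q = Queue()
    p1 = Process(target=do_range_sum, args=(q, 0, 500000))
    p2 = Process(target=do_range_sum, args=(q, 500000, 1000000))
    p1.start()
    p2.start()
    total = q.get() + q.get()   # drain the queue before joining
    p1.join()
    p2.join()
    print total

if __name__ == '__main__':
    main()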

Martin v. Löwis
@Martin, I believe this deadlocks, per http://docs.python.org/library/multiprocessing.html#multiprocessing-programming : "a process that has put items in a queue will wait before terminating until all the buffered items are fed by the “feeder” thread to the underlying pipe" -- the example of deadlock the docs give is very similar to your code (it's a single subprocess in the start, join, get sequence) and two subprocesses instead of one don't help. Swap the joins and gets, or just remove the joins.
Alex Martelli
"It worked for me", probably because the data simply fit into the pipe. In any case, I have removed the joins.
Martin v. Löwis
Are you running this on Linux?
Casey
Yes, I did run that on Linux.
Martin v. Löwis
doesn't run on windows unless you put if __name__ == "__main__"
Casey
Thanks, I have this fixed now.
Martin v. Löwis
+2  A: 

Sure, for example:

from multiprocessing import Process, Queue

thelist = range(1000 * 1000)

def f(q, sublist):
    # Each worker puts the sum of its sublist on the shared queue.
    q.put(sum(sublist))

def main():
    start = 0
    chunk = 500 * 1000
    queue = Queue()
    NP = 0
    subprocesses = []
    # Hand out one chunk of the list to each subprocess.
    while start < len(thelist):
        p = Process(target=f, args=(queue, thelist[start:start + chunk]))
        NP += 1
        print 'delegated %s:%s to subprocess %s' % (start, start + chunk, NP)
        p.start()
        start += chunk
        subprocesses.append(p)
    # Collect one partial sum per subprocess and combine them.
    total = 0
    for i in range(NP):
        total += queue.get()
    print "total is", total, '=', sum(thelist)
    # Join the workers only after draining the queue, to avoid the
    # put/join deadlock described in the multiprocessing docs.
    while subprocesses:
        subprocesses.pop().join()

if __name__ == '__main__':
    main()

results in:

$ python2.6 mup.py 
delegated 0:500000 to subprocess 1
delegated 500000:1000000 to subprocess 2
total is 499999500000 = 499999500000

Note that this granularity is too fine to be worth spawning processes for: the overall summing task is small (which is why I can recompute the sum in main as a check;-), and too much data is being moved back and forth (in fact the subprocesses wouldn't need to get copies of the sublists they work on -- indices would suffice). So it's a "toy example" where multiprocessing isn't really warranted. With a different architecture (use a pool of subprocesses that receive multiple tasks to perform from a queue, minimize data movement back and forth, etc.) and on less granular tasks you could actually get performance benefits, however.
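
As a minimal sketch of that pool-of-workers shape, assuming (as noted above) that index ranges are enough and no sublists need to be shipped to the workers:

# Sketch of the "pool + index ranges" architecture: workers receive
# (start, end) tasks instead of copies of sublists, so the data moved per
# task is tiny.
from multiprocessing import Pool

def sum_range(bounds):
    start, end = bounds
    return sum(xrange(start, end))

def main():
    chunk = 250 * 1000
    tasks = [(s, s + chunk) for s in range(0, 1000 * 1000, chunk)]
    pool = Pool()               # defaults to one worker process per core
    total = sum(pool.map(sum_range, tasks))
    pool.close()
    pool.join()
    print "total is", total

if __name__ == '__main__':
    main()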

Alex Martelli