So I knocked up some test code to see how the multiprocessing module scales on CPU-bound work compared to threading. On Linux I get the performance increase I'd expect:
Linux (dual quad-core Xeon):
serialrun took 1192.319 ms
parallelrun took 346.727 ms
threadedrun took 2108.172 ms
My dual-core MacBook Pro shows the same behavior:

OS X (dual-core MacBook Pro):
serialrun took 2026.995 ms
parallelrun took 1288.723 ms
threadedrun took 5314.822 ms
I then tried it on a Windows machine and got very different results.

Windows (i7 920):
serialrun took 1043.000 ms
parallelrun took 3237.000 ms
threadedrun took 2343.000 ms
Why, oh why, is the multiprocessing approach so much slower on Windows?
Here's the test code:
#!/usr/bin/env python

import multiprocessing
import threading
import time

def print_timing(func):
    def wrapper(*arg):
        t1 = time.time()
        res = func(*arg)
        t2 = time.time()
        print '%s took %0.3f ms' % (func.func_name, (t2 - t1) * 1000.0)
        return res
    return wrapper

def counter():
    for i in xrange(1000000):
        pass

@print_timing
def serialrun(x):
    for i in xrange(x):
        counter()

@print_timing
def parallelrun(x):
    proclist = []
    for i in xrange(x):
        p = multiprocessing.Process(target=counter)
        proclist.append(p)
        p.start()

    for i in proclist:
        i.join()

@print_timing
def threadedrun(x):
    threadlist = []
    for i in xrange(x):
        t = threading.Thread(target=counter)
        threadlist.append(t)
        t.start()

    for i in threadlist:
        i.join()

def main():
    serialrun(50)
    parallelrun(50)
    threadedrun(50)

if __name__ == '__main__':
    main()
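One thing I've been wondering about: on Windows there is no fork(), so multiprocessing has to spawn a fresh Python interpreter for every Process, and a big chunk of parallelrun's time might just be process creation and teardown. Here's a minimal sketch I could run to check that; the noop and startup_overhead names are made up for this check and aren't part of the test code above:

#!/usr/bin/env python
# Minimal sketch: time 50 processes that do no work at all, to see how much
# of parallelrun's time is plain process creation/teardown.
# noop and startup_overhead are hypothetical names, not from the test code above.
import multiprocessing
import time

def noop():
    pass

def startup_overhead(x):
    t1 = time.time()
    proclist = []
    for i in xrange(x):
        p = multiprocessing.Process(target=noop)
        proclist.append(p)
        p.start()
    for p in proclist:
        p.join()
    t2 = time.time()
    print 'starting/joining %d empty processes took %0.3f ms' % (x, (t2 - t1) * 1000.0)

if __name__ == '__main__':
    startup_overhead(50)

If that number on Windows is close to the parallelrun time, the gap is mostly startup cost rather than the counting work itself.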
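I also sketched a variant that pays the startup cost only once per worker by pushing the same amount of counting work through a fixed multiprocessing.Pool. This is just a sketch, not part of my original test; poolrun is a made-up name, and counter takes a dummy argument here only so it can be used with pool.map:

#!/usr/bin/env python
# Sketch: same total counting work as parallelrun(50), but run through a
# fixed pool of workers so each process is started only once.
# poolrun is a hypothetical name, not part of the original test code.
import multiprocessing
import time

def counter(_):
    # one unit of work; dummy argument so it can be passed to pool.map
    for i in xrange(1000000):
        pass

def poolrun(x):
    t1 = time.time()
    pool = multiprocessing.Pool(processes=multiprocessing.cpu_count())
    pool.map(counter, xrange(x))
    pool.close()
    pool.join()
    t2 = time.time()
    print 'poolrun took %0.3f ms' % ((t2 - t1) * 1000.0)

if __name__ == '__main__':
    poolrun(50)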