I want to use multiprocessing to make my script faster... I'm still new to this. The Python docs assume you already understand threading and what-not.

So...

I have code that looks like this:

from itertools import izip
from multiprocessing import Pool

p = Pool()
# hugeseta, hugesetb, and number_crunching are defined elsewhere
for i, j in izip(hugeseta, hugesetb):
    p.apply_async(number_crunching, (i, j))

Which gives me great speed!

However, hugeseta and hugesetb are really huge. Pool keeps all of the _i_s and _j_s in memory after they've finished their job (which is basically printing output to stdout). Is there any way to `del` _i_ and _j_ after they complete?

A: 

The `del` statement deletes object references, so it can free up memory when the garbage collector runs.

from itertools import izip
from multiprocessing import Pool

p = Pool()
for i, j in izip(hugeseta, hugesetb):
    p.apply_async(number_crunching, (i, j))

del i, j  # drop the last references once the loop finishes
zdav
Where would I put `del`? I tried checking the pool for dead workers, but there are never any more workers than cores. So where are all the _i_s and _j_s being stored?
Austin
@Austin Just `del` i and j as soon as you are done with them (see the sketch after this thread).
zdav
@zdav I meant while the loop is running. If I run without Pool, memory use flattens out. If I run it with Pool, old _i_s and _j_s don't get garbage collected.
Austin
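
For reference, a sketch of the placement zdav describes, reusing the question's names (hugeseta, hugesetb, and number_crunching are assumed to be defined as in the question):

from itertools import izip
from multiprocessing import Pool

p = Pool()
for i, j in izip(hugeseta, hugesetb):
    p.apply_async(number_crunching, (i, j))
    # drop this iteration's references as soon as the task is submitted;
    # the loop rebinds i and j on the next pass anyway
    del i, j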
A: 

Not really an answer, but I used `Pool.imap()` instead:

# imap feeds the izip'd pairs to the workers and yields results in order
for i in p.imap(do, izip(Fastitr(seqsa, filetype='fastq'),
        Fastitr(seqsb, filetype='fastq'))):
    pass

Which works beautifully and garbage-collects as expected; however, it feels funny having a `for` loop with nothing but `pass` actually do something useful (see the note after this answer).

Austin
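
A standard way to consume an iterator without the bare `pass` loop is `collections.deque` with `maxlen=0`, which pulls and discards each item as soon as it arrives. A minimal, self-contained sketch (the `do` function and the inputs below are toy stand-ins, not the answer's actual code):

from collections import deque
from itertools import izip
from multiprocessing import Pool

def do(pair):
    # stand-in for the real per-pair work
    a, b = pair
    return a + b

if __name__ == '__main__':
    p = Pool()
    # maxlen=0 makes deque exhaust the iterator, discarding every
    # result immediately, so nothing accumulates in memory
    deque(p.imap(do, izip(xrange(10), xrange(10))), maxlen=0)
    p.close()
    p.join()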