views: 72
answers: 2

I was trying to find the quickest way to count the number of items in a list that match a specific filter; in this case, how many odd numbers there are in the list.

While doing this, I was surprised by the results of comparing a list comprehension vs the equivalent generator expression:

python -m timeit -s "L = xrange(1000000)" "sum([1 for i in L if i & 1])"
10 loops, best of 3: 109 msec per loop

python -m timeit -s "L = xrange(1000000)" "sum(1 for i in L if i & 1)"
10 loops, best of 3: 125 msec per loop

I have also tried with L being a regular list, and with different sizes, but in all cases the list comprehension wins.

What is the genexp doing that causes it to be slower than the listcomp, given that the listcomp has to build a brand-new list of half a million items...?

(Btw, the fastest way I found was: x = 1; len(filter(x.__and__, L)). And yes, I know writing code like that kills kittens; I'm doing it for the fun of it.)
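
The same comparison can also be driven from the timeit module directly rather than the shell; a minimal sketch (Python 2, matching the commands above; the exact numbers will of course vary by machine):

import timeit

setup = "L = xrange(1000000)"
candidates = [
    ("listcomp", "sum([1 for i in L if i & 1])"),
    ("genexp",   "sum(1 for i in L if i & 1)"),
    ("filter",   "x = 1; len(filter(x.__and__, L))"),
]
for name, stmt in candidates:
    # best of 3 runs of 10 loops each, as "python -m timeit" reports
    best = min(timeit.repeat(stmt, setup=setup, repeat=3, number=10))
    print "%-8s %.0f msec per loop" % (name, best / 10 * 1000)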

+11  A: 

When essentially unlimited memory is available (which is invariably the case in tiny benchmarks, although often not in real-world problems!), lists will tend to outperform generators because they can be allocated just once, in one "big bunch" (no memory fragmentation, etc.). A generator, by contrast, requires extra effort internally to avoid that "big bunch" approach: it must preserve its stack-frame state between items so that execution can be suspended and resumed.
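
To make the "preserving the stack-frame state" point concrete, a small sketch (CPython 2.x): a generator object carries a live frame that is suspended and resumed on every next() call, which is exactly the per-item overhead a list comprehension avoids.

def odd_ones(L):
    for i in L:
        if i & 1:
            yield 1

g = odd_ones(xrange(10))
print g.gi_frame.f_lasti    # -1: the frame exists but has not started executing yet
print g.next()              # resumes the frame, runs up to the yield, suspends again
print g.gi_frame.f_lasti    # now the bytecode offset of the suspended yield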

Whether a list approach or a generator approach will be faster in a real program depends on the exact memory situation, including fragmentation, which is nearly impossible to reproduce accurately in a micro-benchmark. IOW, in the end, if you truly care about performance, you must carefully benchmark (and, separately, profile) your actual program(s), not just "toy" micro-benchmarks.
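
(For the profiling half, the stdlib's cProfile is one readily available option; a minimal example:)

import cProfile
# profile the real call in context, not an isolated micro-benchmark
cProfile.run("sum(1 for i in xrange(1000000) if i & 1)")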

Alex Martelli
+1. It may also be noted that in many cases generators use less memory because of their stream-like nature. Consider reading every line of a file into a list, and compare that against reading each line, working with it, and discarding it.
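
For example (a sketch; the helper names and the 80-character threshold are made up for illustration):

def count_long_lines_eager(path):
    # .readlines() materializes every line of the file in memory at once
    lines = open(path).readlines()
    return len([1 for line in lines if len(line) > 80])

def count_long_lines_streaming(path):
    # iterating the file object yields one line at a time; nothing
    # beyond the current line needs to stay in memory
    return sum(1 for line in open(path) if len(line) > 80)
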
Skurmedel
+2  A: 

From what I remember, a generator's frame has to be reactivated for each result, whereas a list comprehension runs in a single activation frame (see the dis sketch below). The incremental cost of the list comprehension is the added memory it uses: references to int, in your case. The relation may well reverse if each item is a new instance and uses more memory.

Update: after testing, it did reverse:

~% python -m timeit -s "L = xrange(1000000);oint=type('intEx', (int,),{})" "sum([oint(1) for i in L if i & 1])" 
10 loops, best of 3: 414 msec per loop

~% python -m timeit -s "L = xrange(1000000);oint=type('intEx', (int,),{})" "sum(oint(1) for i in L if i & 1)" 
10 loops, best of 3: 392 msec per loop
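
One way to see the extra machinery behind that per-result frame activation (a sketch, CPython 2.x specific): dis shows the genexp compiled to a separate code object that has to be created, called, and resumed, while the listcomp's loop is emitted inline in the surrounding bytecode.

import dis
dis.dis(compile("sum([1 for i in L if i & 1])", "<listcomp>", "eval"))
dis.dis(compile("sum(1 for i in L if i & 1)", "<genexp>", "eval"))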
Shane Holloway