This question concerns the following Python code:

for i in object:
     doSomething(i)

versus

map(doSomething, object)

Both are short and easy to understand, but is there any speed difference? If doSomething had a return value we needed to check, map would return the values as a list, whereas in the for loop we could either build our own list or check each value as we go.

for i in object:
     returnValue = doSomething(i)
     doSomethingWithReturnValue(returnValue)

versus

returnValue = map(doSomething, object)
map(doSomethingWithReturnValue, returnValue)

Here I feel the two diverge a little. The doSomethingWithReturnValue functions may need to differ, because checking values on the fly inside the loop and checking them all at once at the end can produce different results. It also seems that the for loop would always work, though perhaps more slowly, whereas map would only work in certain scenarios. Of course, we could contort either approach to make it work, but the whole point is to avoid that kind of effort.
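
The ordering difference described above can be made concrete with a small sketch. The functions `do_something` and `check` and the shared `log` list are hypothetical stand-ins, not code from the question. The batch version builds the whole list before checking anything, which is how `map` behaved in Python 2.

```python
# Hypothetical stand-ins for doSomething / doSomethingWithReturnValue.
log = []

def do_something(i):
    log.append(("do", i))
    return i * 10

def check(value):
    log.append(("check", value))

# For loop: each value is checked right after it is produced.
for i in range(2):
    check(do_something(i))
interleaved = list(log)

# Batch style: produce every value first, then check them all.
log.clear()
values = [do_something(i) for i in range(2)]
for value in values:
    check(value)
batched = list(log)

print(interleaved)  # do, check, do, check
print(batched)      # do, do, check, check
```

If `check` can abort early, or `do_something` depends on state that `check` changes, the two orderings are no longer interchangeable.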

What I'm looking for is a scenario where map truly shines compared to a well-written for loop in performance, readability, maintainability, or speed of implementation. If the answer is that there really isn't a big difference, then I'd like to know when people use one or the other in practice, or whether it's completely arbitrary and set by the coding standards of your institution.

Thanks!

+4  A: 

Map is useful when you want to apply a function to every item of an iterable and get back a list of the results. This is simpler and more concise than using a for loop and constructing a list.
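
A minimal sketch of that comparison, assuming a hypothetical function `double` and a small input list. Note that on Python 3 `map` returns an iterator, so it is wrapped in `list()` here to get an actual list.

```python
def double(x):
    """Hypothetical example function applied to every item."""
    return x * 2

values = [1, 2, 3]

# For loop: create the list and append to it by hand.
loop_result = []
for v in values:
    loop_result.append(double(v))

# map: one expression. list() is needed on Python 3,
# where map returns a lazy iterator instead of a list.
map_result = list(map(double, values))

print(loop_result, map_result)
```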

A for loop is often more readable in other situations. In Lisp there were lots of iteration constructs that were written basically using macros and map. So, in cases where map doesn't fit, use a for loop.

In theory, if we had a compiler/interpreter that was smart enough to make use of multiple cpus/processors, then map could be implemented faster as the different operations on each item could be done in parallel. I don't think this is the case at present, however.
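
Although the interpreter does not parallelize the built-in `map` automatically, the standard library offers an explicit pooled map. The sketch below uses `multiprocessing.pool.ThreadPool`, a thread-based pool chosen here only so that the snippet runs without a `__main__` guard or picklable functions; `square` and `pooled_map` are hypothetical names. For CPU-bound work, `multiprocessing.Pool` exposes the same `map` method but uses worker processes.

```python
from multiprocessing.pool import ThreadPool

def square(n):
    """Hypothetical side-effect-free function."""
    return n * n

def pooled_map(func, items, workers=4):
    # pool.map blocks until every call has finished and
    # returns the results in input order, like map does.
    with ThreadPool(processes=workers) as pool:
        return pool.map(func, items)

print(pooled_map(square, range(5)))
```

This is only safe when the function has no side effects that depend on call order, since the calls may run concurrently.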

Larry Watanabe
PLINQ (C#) can do that though.
Hamish Grubijan
Why the past tense? Lisp is alive and kicking.
Svante
Actually `map` beats `for` in performance even in single thread because the loop is written in C. See my post for speed test.
Iamamac
In general, map cannot be parallelized automatically, as each function call might have a global side effect. A parallel map already exists in Python 2.6+: multiprocessing.Pool.map().
EOL
@Iamamac that's true, but a list comprehension often beats map.
TM
+7  A: 

Just use list comprehensions: they're more Pythonic. They also have syntax similar to generator expressions, which makes it easy to switch from one to the other. And you don't need to change anything when converting your code to py3k, whereas map returns an iterator in py3k and you'd have to adjust code that relies on it.

If you don't care about the return values, just don't name the new list. If you need the return values only once in your code, you might switch to generator expressions with a single list comprehension at the end.
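
A short sketch of how close the two syntaxes are, assuming a hypothetical `do_something` function. Only the brackets change between the list comprehension and the generator expression.

```python
def do_something(x):
    """Hypothetical function standing in for doSomething."""
    return x + 1

items = range(5)

# List comprehension: builds the whole list immediately.
as_list = [do_something(i) for i in items]

# Generator expression: same syntax with parentheses, evaluated lazily.
as_gen = (do_something(i) for i in items)
total = sum(as_gen)

# Intermediate results used once: chain a generator expression
# and materialize a single list at the end.
final = [v * 2 for v in (do_something(i) for i in items) if v % 2 == 0]

print(as_list, total, final)
```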

SilentGhost
A: 

EDIT: I didn't realize that map behaves like itertools.imap as of Python 3.0, so the conclusion here may not be correct. I'll re-run the test on Python 2.6 tomorrow and post the result.

If doSomething is very "tiny", map can be a lot faster than a for loop or a list comprehension:

# Python 3.1.1 (r311:74483, Aug 17 2009, 17:02:12) [MSC v.1500 32 bit (Intel)] on win32

from timeit import timeit

do = lambda i: i+1

def _for():
  for i in range(1000):
    do(i)

def _map():
  map(do, range(1000))

def _list():
  [do(i) for i in range(1000)]

timeit(_for, number=10000)  # 2.5515936921388516
timeit(_map, number=10000)  # 0.010167432629884843
timeit(_list, number=10000) # 3.090125159839033

This is because map is written in C, while the for loop and the list comprehension run in the Python virtual machine.
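
As the EDIT at the top of this answer suspects, on Python 3 `map(do, range(1000))` only creates an iterator and never calls `do`, which would explain the tiny `_map` figure. The sketch below first counts calls to show the laziness, then times the three variants with `map` forced through `list()`. The call counter and the `bench_*` names are additions for illustration, and the counter makes `do` slower than the original lambda, so the printed numbers are not comparable to the ones above.

```python
from timeit import timeit

calls = 0

def do(i):
    """Same work as the lambda above, plus a call counter."""
    global calls
    calls += 1
    return i + 1

# On Python 3, map is lazy: nothing is called until it is consumed.
lazy = map(do, range(1000))
calls_before = calls      # still 0 here
results = list(lazy)      # consuming the iterator runs do 1000 times
calls_after = calls

def bench_for():
    for i in range(1000):
        do(i)

def bench_map():
    list(map(do, range(1000)))  # list() forces every call to happen

def bench_list():
    [do(i) for i in range(1000)]

for name, fn in (("for", bench_for), ("map", bench_map), ("list", bench_list)):
    print(name, timeit(fn, number=200))
```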

Iamamac
I don't know where you're getting your numbers from, but in my case (Python 2.6) the for loop is faster, by about 5%. Your code isn't even correct since _list does fewer iterations. The huge differences you got indicate that something is seriously wrong with your setup.
interjay
this is just ridiculous. your codes are not equivalent at all. read my answer. even w/o map object vs. list dichotomy your code is different, it's a shame really that you don't see it
SilentGhost
Sorry, there was a typo when I copied and formatted the code. I run Python 3.1.1 on my Core Duo 4300 PC, and `map` beats the other two significantly.
Iamamac
I ran this code in python 2.6.4 for number=100000 iterations, and I get about 15 for _for, 18 for _map, and 17 for _list.
indiv
A: 

Are you familiar with the timeit module? Below are some timings. -s performs a one-time setup, and then the command is looped and the best time recorded.

1> python -m timeit -s "L=[]; M=range(1000)" "for m in M: L.append(m*2)"
1000 loops, best of 3: 432 usec per loop

2> python -m timeit -s "M=range(1000);f=lambda x: x*2" "L=map(f,M)"
1000 loops, best of 3: 449 usec per loop

3> python -m timeit -s "M=range(1000);f=lambda x:x*2" "L=[f(m) for m in M]"
1000 loops, best of 3: 483 usec per loop

4> python -m timeit -s "L=[]; A=L.append; M=range(1000)" "for m in M: A(m*2)"
1000 loops, best of 3: 287 usec per loop    

5> python -m timeit -s "M=range(1000)" "L=[m*2 for m in M]"
1000 loops, best of 3: 174 usec per loop

Note they are all similar except for the last two. It is the function calls (L.append, or f(x)) that severely affect the timing. In #4 the L.append lookup has been done once in setup. In #5 a list comp with no function calls is used.
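
The same comparison can be sketched from inside Python with `timeit.repeat`, assuming the function names below (they are not from the commands above). Unlike commands #1 and #4, each call here starts with a fresh list rather than one that keeps growing across loops, so absolute numbers will differ.

```python
from timeit import repeat

M = list(range(1000))

def append_lookup():
    # Looks up L.append on every iteration (like #1).
    L = []
    for m in M:
        L.append(m * 2)
    return L

def append_hoisted():
    # Looks up L.append once, then calls the bound method (like #4).
    L = []
    A = L.append
    for m in M:
        A(m * 2)
    return L

def comprehension():
    # No explicit function call per item (like #5).
    return [m * 2 for m in M]

for fn in (append_lookup, append_hoisted, comprehension):
    best = min(repeat(fn, number=1000, repeat=3))
    print(fn.__name__, best)
```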

Mark Tolonen
I think you are referring to my post. Yes, I found the serious problem that `map` returns an iterator in py3k, but I don't think there is anything wrong with my use of `timeit`: `range` is lazy in py3k, so there is little impact from not putting it in the setup phase.
Iamamac
> python3 -m timeit "[m for m in range(1000)]"
10000 loops, best of 3: 114 usec per loop
> python3 -m timeit -s M=list(range(1000)) "[m for m in M]"
10000 loops, best of 3: 83 usec per loop
There is a significant difference in building the list only once.
Mark Tolonen