Why are you optimizing this? Have you written working, tested code, then examined your algorithm profiled your code and found that optimizing this will have an effect? Are you doing this in a deep inner loop where you found you are spending your time? If not, don't bother.
You'll only know which works fastest for you by timing it. To time it in a useful way, you'll have to specialize it to your actual use case. For example, you can get noticeable performance differences between a function call in a list comprehension versus an inline expression; it isn't clear whether you really wanted the former or if you reduced it to that to make your cases similar.
You say that it doesn't matter whether you end up with a numpy array or a list
, but if you're doing this kind of micro-optimization it does matter, since those will perform differently when you use them afterward. Putting your finger on that could be tricky, so hopefully it will turn out the whole problem is moot as premature.
It is typically better to simply use the right tool for the job for clarity, readability, and so forth. It is rare that I would have a hard time deciding between these things.
- If I needed numpy arrays, I would use them. I would use these for storing large homogeneous arrays or multidimensional data. I use them a lot, but rarely where I think I'd want to use a list.
- If I was using these, I'd do my best to write my functions already vectorized so I didn't have to use
numpy.vectorize
. For example, times_five
below can be used on a numpy array with no decoration.
- If I didn't have cause to use numpy, that is to say if I wasn't solving numerical math problems or using special numpy features or storing multidimensional arrays or whatever...
- If I had an already-existing function, I would use
map
. That's what it's for.
- If I had an operation that fit inside a small expression and I didn't need a function, I'd use a list comprehension.
- If I just wanted to do the operation for all the cases but didn't actually need to store the result, I'd use a plain for loop.
- In many cases, I'd actually use
map
and list comprehensions' lazy equivalents: itertools.imap
and generator expressions. These can reduce memory usage by a factor of n
in some cases and can avoid performing unnecessary operations sometimes.
If it does turn out this is where performance problems lie, getting this sort of thing right is tricky. It is very common that people time the wrong toy case for their actual problems. Worse, it is extremely common people make dumb general rules based on it.
Consider the following cases (timeme.py is posted below)
python -m timeit "from timeme import x, times_five; from numpy import vectorize" "vectorize(times_five)(x)"
1000 loops, best of 3: 924 usec per loop
python -m timeit "from timeme import x, times_five" "[times_five(item) for item in x]"
1000 loops, best of 3: 510 usec per loop
python -m timeit "from timeme import x, times_five" "map(times_five, x)"
1000 loops, best of 3: 484 usec per loop
A naïve obsever would conclude that map is the best-performing of these options, but the answer is still "it depends". Consider the power of using the benefits of the tools you are using: list comprehensions let you avoid defining simple functions; numpy lets you vectorize things in C if you're doing the right things.
python -m timeit "from timeme import x, times_five" "[item + item + item + item + item for item in x]"
1000 loops, best of 3: 285 usec per loop
python -m timeit "import numpy; x = numpy.arange(1000)" "x + x + x + x + x"
10000 loops, best of 3: 39.5 usec per loop
But that's not all—there's more. Consider the power of an algorithm change. It can be even more dramatic.
python -m timeit "from timeme import x, times_five" "[5 * item for item in x]"
10000 loops, best of 3: 147 usec per loop
python -m timeit "import numpy; x = numpy.arange(1000)" "5 * x"
100000 loops, best of 3: 16.6 usec per loop
Sometimes an algorithm change can be even more effective. This will be more and more effective as the numbers get bigger.
python -m timeit "from timeme import square, x" "map(square, x)"
10 loops, best of 3: 41.8 msec per loop
python -m timeit "from timeme import good_square, x" "map(good_square, x)"
1000 loops, best of 3: 370 usec per loop
And even now, this all may have little bearing on your actual problem. It looks like numpy is so great if you can use it right, but it has its limitations: none of these numpy examples used actual Python objects in the arrays. That complicates what must be done; a lot even. And what if we do get to use C datatypes? These are less robust than Python objects. They aren't nullable. The integers overflow. You have to do some extra work to retrieve them. They're statically typed. Sometimes these things prove to be problems, even unexpected ones.
So there you go: a definitive answer. "It depends."
# timeme.py
x = xrange(1000)
def times_five(a):
return a + a + a + a + a
def square(a):
if a == 0:
return 0
value = a
for i in xrange(a - 1):
value += a
return value
def good_square(a):
return a ** 2