I'm running a model in Python and I'm trying to speed up the execution time. Through profiling the code I've found that a huge amount of the total processing time is spent in the cell_in_shadow function below. I'm wondering if there is any way to speed it up?

The aim of the function is to provide a boolean response stating whether the specified cell in the NumPy array is shadowed by another cell (in the x direction only). It does this by stepping backwards along the row, checking each cell against the height it would need to put the given cell in shadow. The values in shadow_map are calculated by another function not shown here - for this example, take shadow_map to be an array with values similar to:

[0] = 0 (not used)
[1] = 3
[2] = 7
[3] = 18

The add_x function is used to ensure that the array indices loop around (using clock-face arithmetic), as the grid has periodic boundaries (anything going off one side will re-appear on the other side).

def cell_in_shadow(x, y):
    """Returns True if the specified cell is in shadow, False if not."""

    # Get the global variables we need
    global grid
    global shadow_map
    global x_len

    # Record the original position and move one cell to the left
    orig_x = x
    x = add_x(x, -1)

    while x != orig_x:
        # Get the height needed from the shadow_map (the array index is
        # the clock-face distance from the original cell)
        height_needed = shadow_map[(x - orig_x) % x_len]
        if grid[y, x] - grid[y, orig_x] >= height_needed:
            return True

        # Go to the cell to the left
        x = add_x(x, -1)

    return False

def add_x(a, b):
    """Adds the two numbers using clock-face arithmetic with x_len"""
    global x_len

    return (a + b) % x_len
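The wrap-around behaviour is easy to sanity-check in isolation; a minimal example, assuming a hypothetical x_len of 4:

```python
x_len = 4  # hypothetical grid width for this example

def add_x(a, b):
    """Adds the two numbers using clock-face arithmetic with x_len."""
    return (a + b) % x_len

# Stepping left from x = 0 wraps around to the far edge of the grid
print(add_x(0, -1))  # 3
print(add_x(3, 1))   # 0
```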
+2  A: 

If you're not limited to strict Python, I'd suggest using Cython for this. It allows static typing of the indices and efficient, direct access to a NumPy array's underlying data buffer at C speed.

Check out a short tutorial/example at http://wiki.cython.org/tutorials/numpy

In that example, which does operations very similar to what you're doing (incrementing indices, accessing individual elements of NumPy arrays), adding type information to the index variables cut the time in half compared to the original. Adding efficient indexing into the NumPy arrays by giving them type information cut the time to about 1% of the original.

Most Python code is already valid Cython, so you can just use what you have and add annotations and type information where needed to give you some speed-ups.

I suspect you'd get the most out of adding type information to your indices x, y and orig_x, and to the NumPy arrays.
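For illustration only, a rough, untested sketch of what that typing might look like applied to your function (the double dtype and the choice to pass the arrays in as arguments are my assumptions):

```cython
# Sketch of a typed Cython version of cell_in_shadow, in the style
# of the wiki tutorial linked above. Untested.
import numpy as np
cimport numpy as np

def cell_in_shadow(int x, int y,
                   np.ndarray[np.double_t, ndim=2] grid,
                   np.ndarray[np.double_t, ndim=1] shadow_map,
                   int x_len):
    cdef int orig_x = x
    cdef double orig_height = grid[y, orig_x]
    x = (x - 1) % x_len
    while x != orig_x:
        if grid[y, x] - orig_height >= shadow_map[(x - orig_x) % x_len]:
            return True
        x = (x - 1) % x_len
    return False
```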

Sancho
+3  A: 

I do agree with Sancho that Cython will probably be the way to go, but here are a couple of small speed-ups:

A. Store grid[y, orig_x] in some variable before you start the while loop and use that variable instead. This will save a bunch of look-up calls to the grid array.

B. Since you are basically just starting at x_len - 1 in shadow_map and working down to 1, you can avoid using the modulus so much. Basically, change:

while x != orig_x:
    height_needed = shadow_map[( (x - orig_x) % x_len)]

to

for i in xrange(x_len-1,0,-1):
    height_needed = shadow_map[i]

or just get rid of the height_needed variable all together with:

    if grid[y, x] - grid[y, orig_x] >= shadow_map[i]:

These are small changes, but they might help a little bit.
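Putting A and B together, the loop might look something like the sketch below (my own illustration, using made-up data matching the shadow_map sketch in the question; note that the grid position has to advance along with the loop index):

```python
import numpy as np

# Illustrative data only: a single 4-cell row
x_len = 4
shadow_map = [0, 3, 7, 18]
grid = np.array([[0.0, 20.0, 1.0, 2.0]])

def cell_in_shadow(x, y):
    """True if cell (x, y) is shadowed by another cell in the x direction."""
    orig_height = grid[y, x]           # A: look up the target cell once
    for i in range(x_len - 1, 0, -1):  # B: walk shadow_map directly
        # the cell at clock-face distance i from the target cell
        if grid[y, (x + i) % x_len] - orig_height >= shadow_map[i]:
            return True
    return False

print(cell_in_shadow(0, 0))  # True: the cell at distance 1 is 20 units taller
print(cell_in_shadow(1, 0))  # False: nothing overtops the tallest cell
```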

Also, if you plan on going the Cython route, I would consider having your function do this process for the whole grid, or at least a row at a time. That will save a lot of the function call overhead. However, you might not be able to really do this depending on how you are using the results.
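Even without Cython, a whole row can be done in one shot with NumPy broadcasting. A sketch of the idea (my own, not the poster's code; it trades memory, an x_len-by-x_len temporary, for speed):

```python
import numpy as np

def row_in_shadow(row, shadow_map):
    """Boolean shadow flag for every cell in one grid row at once.

    Sketch only: compares each cell against every other cell in the
    row via broadcasting.
    """
    x_len = len(row)
    idx = np.arange(x_len)
    # dist[orig, x] = clock-face distance from cell orig to cell x
    dist = (idx[np.newaxis, :] - idx[:, np.newaxis]) % x_len
    tall_enough = row[np.newaxis, :] - row[:, np.newaxis] >= shadow_map[dist]
    # dist == 0 is the cell itself; exclude it from the comparison
    return np.any(tall_enough & (dist > 0), axis=1)

# Same illustrative data as the shadow_map sketch in the question
shadow = np.array([0, 3, 7, 18])
row = np.array([0.0, 20.0, 1.0, 2.0])
print(row_in_shadow(row, shadow))  # [ True False  True  True]
```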

Lastly, have you tried Psyco? It takes less work than Cython, though it probably won't give you as big a speed boost. I would certainly try it first.

Justin Peel
+1  A: 

The following guide compares several different approaches to optimising numerical code in Python:

Scipy PerformancePython

It is a bit out of date, but still helpful. Note that it refers to Pyrex, which has since been forked to create the Cython project, as mentioned by Sancho.

Personally I prefer f2py, because I think that Fortran 90 has many of the nice features of NumPy (e.g. adding two arrays together in one operation), but with the full speed of compiled code. On the other hand, if you don't know Fortran then this may not be the way to go.

I briefly experimented with Cython, and the trouble I found was that by default it generates code that can handle arbitrary Python types, which is still very slow. You then have to spend time adding all the necessary Cython declarations to make the code specific and fast, whereas with C or Fortran you tend to get fast code straight out of the box. Again, this is biased by my already being familiar with those languages; Cython may be more appropriate if Python is the only language you know.

DaveP