views:

412

answers:

4

There has to be a faster way of doing this.

There is a lot going on here, but it's fairly straightforward to unpack.

Here is the relevant python code (from scipy import *)

for i in arange(len(wav)):
    result[i] = sum(laser_flux * exp(-(wav[i] - laser_wav)**2) )

There are a bunch of arrays.

  • result -- array of length (wav)
  • laser_flux -- array of length (laser)
  • wav -- array of length (wav)
  • laser_wav -- array of length (laser)

Yes, within the exponential, I am squaring (element by element) the difference between the scalar value and the array of laser_wav.

Everything works as expected (including SLOWLY) any help you can give me to eliminate this for-loop would be much appreciated!

A: 

For one thing, it seems to be slightly quicker to multiply a variable by itself than to use the ** power operator:

~$ python -m timeit -n 100000 -v "x = 4.1; x * x"
raw times: 0.0904 0.0513 0.0493
100000 loops, best of 3: 0.493 usec per loop
~$ python -m timeit -n 100000 -v "x = 4.1; x**2"
raw times: 0.101 0.147 0.118
100000 loops, best of 3: 1.01 usec per loop
Mark Rushakoff
If you're doing optimizations at this level, you're really close to where you should be writing a native module to handle the math. There might be higher-level optimizations to be made, though (waiting on sample data before I look)...
Glenn Maynard
+2  A: 

I'm new to Python, so this may not the most optimal in Python, but I'd use the same technique for Perl, Scheme, etc.

def func(x):
    delta = x - laser_wav
    sum(laser_flux * exp(-delta * delta))
result = map(func, wav)
Chris Jester-Young
wouldn't the function call to sq(x) add some time? it might be more to write but if you're really worried about speed it seems like actually writing out the x*x would be better than calling the function once.
Casey
I believe it's generally considered more pythonic thse days to use a list comprehension instead of `map()`. But I can't really see either solution providing a _significant_ (or even measurable?) speed increase..
John Fouhy
@Casey: I've removed sq, for that reason.@John: You're probably right, but I'm really unfamiliar with list comprehensions, because I'm much more used to languages with map (Perl, Scheme, etc.). But thanks for the tip!
Chris Jester-Young
result = [func(x) for x in wav]
Casey
+10  A: 

You're going to want to use Numpy arrays (if you're not already) to store your data. Then, you can take advantage of array broadcasting with np.newaxis. For each value in wav, you're going to want to compute a difference between that value and each value in laser_wav. That suggests that you'll want a two-dimensional array, with the two dimensions being the wav dimension and the laser dimension.

In the example below, I'll pick the first index as the laser index and the second index as the wav index. With sample data, this becomes:

import numpy as np

LASER_LEN  = 5
WAV_LEN    = 10
laser_flux = np.arange(LASER_LEN)
wav        = np.arange(WAV_LEN)
laser_wav  = np.array(LASER_LEN)

# Tile wav into LASER_LEN rows and tile laser_wav into WAV_LEN columns
diff    = wav[np.newaxis,:] - laser_wav[:,np.newaxis]
exp_arg = -diff ** 2
sum_arg = laser_flux[:,np.newaxis] * np.exp(exp_arg)

# Now, the resulting array sum_arg should be of size (LASER_LEN,WAV_LEN)
# Since your original sum was along each element of laser_flux/laser_wav, 
# you'll need to sum along the first axis.
result = np.sum(sum_arg, axis=0)

Of course, you could just condense this down into a single statement:

result = np.sum(laser_flux[:,np.newaxis] * 
                np.exp(-(wav[np.newaxis,:]-laser_wav[:,np.newaxis])**2),axis=0)

Edit:

As noted in the comments to the question, you can take advantage of the "sum of multiplications" inherent in the definition of linear-algebra style multiplications. This then becomes:

result = np.dot(laser_flux, 
    np.exp(-(wav[np.newaxis,:] - laser_wav[:,np.newaxis])**2))
Tim Whitcomb
It's also worth noting that using numpy will be **much** faster than any pure Python alternative. numpy executes its calculations in C.
ire_and_curses
That is *definitely* true - I noticed an enormous speedup when I made this type of change in one of my applications.
Tim Whitcomb
A: 

If raw performance is an issue, you might benefit from rewriting to take advantage of multiple cores, if you have them.

from multiprocessing import Pool
p = Pool(5) # about the number of cores you have

def f(i):
    delta = wav[i] - laser_wav
    return sum(laser_flux * exp(-delta*delta) )

result = p.map(f, arange(len(wav)) )
TokenMacGuy