I'm performing a nested loop in Python, included below. It serves as a basic way of searching through existing financial time series for periods that match certain characteristics. In this case there are two separate, equally sized arrays representing the 'close' (i.e. the price of an asset) and the 'volume' (i.e. the amount of the asset that was exchanged over the period). For each period in time I would like to look forward at all future intervals with lengths between 1 and INTERVAL_LENGTH and see if any of those intervals have characteristics that match my search (in this case, the ratio of the close values is greater than 1.0001 and less than 1.5, and the summed volume is greater than 100).

My understanding is that one of the major reasons for the speedup when using NumPy is that the interpreter doesn't need to type-check the operands each time it evaluates something, so long as you're operating on the array as a whole (e.g. numpy_array * 2) - but the code below is obviously not taking advantage of that. Is there a way to replace the internal loop with some kind of window function that could produce a speedup, or any other way to use numpy/scipy to speed this up substantially over native Python?

Alternatively, is there a better way to do this in general (e.g. will it be much faster to write this loop in C++ and use weave)?

import numpy as np

ARRAY_LENGTH = 500000
INTERVAL_LENGTH = 15
close = np.array( xrange(ARRAY_LENGTH) )
volume = np.array( xrange(ARRAY_LENGTH) )
close, volume = close.astype('float64'), volume.astype('float64')

results = []
for i in xrange(len(close) - INTERVAL_LENGTH):
    for j in xrange(i+1, i+INTERVAL_LENGTH):
        ret = close[j] / close[i]
        vol = sum( volume[i+1:j+1] )
        if ret > 1.0001 and ret < 1.5 and vol > 100:
            results.append( [i, j, ret, vol] )
print results
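For reference, the sort of whole-array trick I have in mind: a prefix sum (np.cumsum) makes any interval's summed volume available as a single subtraction. A tiny illustrative sketch (`csum` is just a name I made up here):

```python
import numpy as np

volume = np.arange(20, dtype='float64')

# Prefix sums turn any slice sum into one subtraction:
# volume[a:b].sum() == csum[b] - csum[a]
csum = np.concatenate(([0.0], volume.cumsum()))

i, j = 3, 9
window_sum = csum[j + 1] - csum[i + 1]  # equals volume[i+1:j+1].sum()
```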
+3  A: 

One speedup would be to remove the `sum` call: in this implementation it slices and re-sums a list whose length grows from 1 up to INTERVAL_LENGTH - 1 on every iteration. Instead, just add the next volume element to the previous iteration's value of `vol`, so you're adding two numbers each time instead of slicing AND summing an entire list. Also, instead of starting by doing `sum(volume[i+1:j+1])`, just initialize `vol = volume[i+1]`, since `j` starts at `i+1` and that initial slice contains only the one element.
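On a small input, the running-sum idea looks like this (a sketch, written so that `vol` always equals the original slice sum `volume[i+1:j+1].sum()` - note it starts at 0 and adds `volume[j]`, per the indexing discussion in the comments):

```python
import numpy as np

close = np.arange(1.0, 41.0)
volume = np.arange(1.0, 41.0)
INTERVAL_LENGTH = 5

results = []
for i in range(len(close) - INTERVAL_LENGTH):
    vol = 0.0                            # running total of volume[i+1:j+1]
    for j in range(i + 1, i + INTERVAL_LENGTH):
        vol += volume[j]                 # one addition replaces the slice-and-sum
        ret = close[j] / close[i]
        if 1.0001 < ret < 1.5 and vol > 100:
            results.append([i, j, ret, vol])
```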

Another speedup would be to use .extend instead of .append, as the python implementation has extend running significantly faster.
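That claim is easy to measure directly with `timeit` (a sketch; absolute numbers vary by Python version and machine, so measure on your own setup). Keep in mind that `extend` unpacks its argument, so it stores four separate values where `append` stores one list:

```python
import timeit

# Compare the two calls as they would appear in the loop body.
# Semantic difference: append stores the row as a single element,
# while extend flattens it into four elements.
t_append = timeit.timeit('r.append([1, 2, 1.2, 150.0])',
                         setup='r = []', number=200000)
t_extend = timeit.timeit('ex([1, 2, 1.2, 150.0])',
                         setup='r = []; ex = r.extend', number=200000)
print('append: %.4fs  extend: %.4fs' % (t_append, t_extend))
```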

You could also break up the final if statement so as to only do certain computation if required. For instance, you know if vol <= 100, you don't need to compute ret.

This doesn't answer your problem exactly, but I think especially with the sum issue that you should see significant speedups with these changes.

Edit - you also don't need `len`, since you already know the length of the list (unless that was just for the example). Using the number directly avoids the lookup and call.

Edit - implementation (this is untested):

import numpy as np

ARRAY_LENGTH = 500000
INTERVAL_LENGTH = 15
close = np.array( xrange(ARRAY_LENGTH) )
volume = np.array( xrange(ARRAY_LENGTH) )
close, volume = close.astype('float64'), volume.astype('float64')

results = []
ex = results.extend
for i in xrange(ARRAY_LENGTH - INTERVAL_LENGTH):
    vol = volume[i+1]
    for j in xrange(i+1, i+INTERVAL_LENGTH):
        vol += volume[j+1]
        if vol > 100:
            ret = close[j] / close[i]
            if 1.0001 < ret < 1.5:
                ex( [[i, j, ret, vol]] )  # wrapped so extend adds one row, like append
print results
nearlymonolith
Another speedup would be defining `extend_results = results.extend` once (before the loop) and then using `extend_results([i, j, ret, vol])` inside the loop to avoid lookups. But always measure (with the `timeit` module) when optimizing!
ChristopheD
Interesting! How significant is the lookup time, generally? Is it usually a useful speedup, or is this more because of the magnitude of this particular loop?
nearlymonolith
@Anthony Morelli: Local variable lookups are a lot less time consuming than global or built-in variable lookups, since the 'compiler' optimizes function bodies so local variables don't need dictionary lookups (also see: http://www.python.org/doc/essays/list2str.html). But in general benchmarking is always necessary, since (unsuccessful) scope lookup times are influenced by the size of the items to consider, etc. But with a loop this large, it's a safe bet (I think) to consider this technique worthwhile (not tested, but it should at least be faster to some extent).
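A small sketch of the effect (the default-argument binding is one common way to make a builtin local; timings vary, so treat this as a recipe for measuring, not a result):

```python
import timeit

def use_builtin(n):
    total = 0
    for i in range(n):
        total += abs(-i)      # 'abs' looked up in globals, then builtins, every pass
    return total

def use_local(n, _abs=abs):   # bind the builtin to a local name once, at def time
    total = 0
    for i in range(n):
        total += _abs(-i)     # locals are resolved by index, with no dict lookup
    return total

t_builtin = timeit.timeit(lambda: use_builtin(10000), number=50)
t_local = timeit.timeit(lambda: use_local(10000), number=50)
```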
ChristopheD
@Anthony Morelli: What ChristopheD says PLUS: (1) all of your variables are GLOBAL; wrap the code in a function and call it (2) use `1.0001 < ret < 1.5` (3) `results.extend` statement wrongly indented
John Machin
Thanks! I made the changes - although `1.0001 < ret < 1.5` just expands to `ret > 1.0001 and ret < 1.5` when compiled, anyway (according to what I've read/remembered). Edit: although I guess the variable `ret` would only have to be looked up once, which is conceivably faster.
nearlymonolith
I'm pretty certain that there is a problem with the calculation of `vol` here. The slice is from i+1:j+1 in the initial code and j starts at i+1. To me, this means that `vol` should be initialized to 0 and `volume[j]` added for each iteration in the loop rather than `volume[j+1]`.
Justin Peel
+4  A: 

Update: (almost) completely vectorized version below in "new_function2"...

I'll add comments to explain things in a bit.

It gives a ~50x speedup, and a larger speedup is possible if you're okay with the output being numpy arrays instead of lists. As is:

In [86]: %timeit new_function2(close, volume, INTERVAL_LENGTH)
1 loops, best of 3: 1.15 s per loop

You can replace your inner loop with a call to np.cumsum()... See my "new_function" function below. This gives a considerable speedup...

In [61]: %timeit new_function(close, volume, INTERVAL_LENGTH)
1 loops, best of 3: 15.7 s per loop

vs

In [62]: %timeit old_function(close, volume, INTERVAL_LENGTH)
1 loops, best of 3: 53.1 s per loop

It should be possible to vectorize the entire thing and avoid for loops entirely, though... Give me a minute, and I'll see what I can do...

import numpy as np

ARRAY_LENGTH = 500000
INTERVAL_LENGTH = 15
close = np.arange(ARRAY_LENGTH, dtype=np.float64)
volume = np.arange(ARRAY_LENGTH, dtype=np.float64)

def old_function(close, volume, INTERVAL_LENGTH):
    results = []
    for i in xrange(len(close) - INTERVAL_LENGTH):
        for j in xrange(i+1, i+INTERVAL_LENGTH):
            ret = close[j] / close[i]
            vol = sum( volume[i+1:j+1] )
            if (ret > 1.0001) and (ret < 1.5) and (vol > 100):
                results.append( (i, j, ret, vol) )
    return results


def new_function(close, volume, INTERVAL_LENGTH):
    results = []
    for i in xrange(close.size - INTERVAL_LENGTH):
        vol = volume[i+1:i+INTERVAL_LENGTH].cumsum()
        ret = close[i+1:i+INTERVAL_LENGTH] / close[i]

        filter = (ret > 1.0001) & (ret < 1.5) & (vol > 100)
        j = np.arange(i+1, i+INTERVAL_LENGTH)[filter]

        tmp_results = zip(j.size * [i], j, ret[filter], vol[filter])
        results.extend(tmp_results)
    return results

def new_function2(close, volume, INTERVAL_LENGTH):
    vol, ret = [], []
    I, J = [], []
    for k in xrange(1, INTERVAL_LENGTH):
        start = k
        end = volume.size - INTERVAL_LENGTH + k
        vol.append(volume[start:end])
        ret.append(close[start:end])
        J.append(np.arange(start, end))
        I.append(np.arange(volume.size - INTERVAL_LENGTH))

    vol = np.vstack(vol)
    ret = np.vstack(ret)
    J = np.vstack(J)
    I = np.vstack(I)

    vol = vol.cumsum(axis=0)
    ret = ret / close[:-INTERVAL_LENGTH]

    filter = (ret > 1.0001) & (ret < 1.5) & (vol > 100)

    vol = vol[filter]
    ret = ret[filter]
    I = I[filter]
    J = J[filter]

    output = zip(I.flat,J.flat,ret.flat,vol.flat)
    return output

results = old_function(close, volume, INTERVAL_LENGTH)
results2 = new_function(close, volume, INTERVAL_LENGTH)
results3 = new_function2(close, volume, INTERVAL_LENGTH)

# Using sets to compare, as the output 
# is in a different order than the original function
print set(results) == set(results2)
print set(results) == set(results3)
Joe Kington
A: 

Why don't you try to generate the result as a single list (much faster than appending or extending), something like:

results = [ t for t in ( (i, j, close[j]/close[i], sum(volume[i+1:j+1]))
                         for i in xrange(len(close)-INT_LEN)
                             for j in xrange(i+1, i+INT_LEN)
                       )
            if t[3] > 100 and 1.0001 < t[2] < 1.5
          ]
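On a small input you can check that the comprehension agrees with the original nested loop (a sketch using Python 3's `range`; `INT_LEN` abbreviates `INTERVAL_LENGTH` as in the snippet above):

```python
import numpy as np

close = np.arange(1.0, 31.0)
volume = np.arange(1.0, 31.0) * 10
INT_LEN = 5

# Single-pass comprehension over a generator of candidate tuples.
comp = [t for t in ((i, j, close[j] / close[i], volume[i+1:j+1].sum())
                    for i in range(len(close) - INT_LEN)
                    for j in range(i + 1, i + INT_LEN))
        if t[3] > 100 and 1.0001 < t[2] < 1.5]

# Reference: the original nested-loop formulation.
loop = []
for i in range(len(close) - INT_LEN):
    for j in range(i + 1, i + INT_LEN):
        ret = close[j] / close[i]
        vol = volume[i+1:j+1].sum()
        if ret > 1.0001 and ret < 1.5 and vol > 100:
            loop.append((i, j, ret, vol))
```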
Nas Banov