In a pylab program (which could probably be a matlab program as well) I have a numpy array of numbers representing distances: d[t] is the distance at time t (and the timespan of my data is len(d) time units).
The events I'm interested in are when the distance is below a certain threshold, and I want to compute the duration of these events. It's easy to get an array of booleans with b = d<threshold, and the problem comes down to computing the sequence of the lengths of the True-only words in b (that is, the runs of consecutive True values). But I do not know how to do that efficiently (i.e. using numpy primitives), so I resorted to walking the array and doing manual change detection (i.e. initialize a counter when the value goes from False to True, increase the counter as long as the value is True, and output the counter to the sequence when the value goes back to False). But this is tremendously slow.
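To make the expected output concrete, here is a tiny sketch on a seven-element array. The helper name run_lengths_reference and the sample values are made up for illustration. It relies on itertools.groupby, which is still a Python-level loop, so it only pins down the result I want and is not the efficient numpy solution I am asking about.

```python
from itertools import groupby

import numpy as np


def run_lengths_reference(b):
    """Lengths of each run of consecutive True values in b, in order.

    Hypothetical pure-Python reference, only meant to define the
    expected output; it is not a fast numpy implementation.
    """
    # groupby yields one (value, group) pair per run of equal values;
    # keep only the True runs and count their elements.
    return [sum(1 for _ in group) for value, group in groupby(b) if value]


# Made-up sample distances, small enough to check by hand.
d = np.array([9.0, 3.0, 2.0, 8.0, 1.0, 4.0, 5.0])
threshold = 7
b = d < threshold  # [False, True, True, False, True, True, True]
durations = run_lengths_reference(b)
print(durations)   # [2, 3]
```

So for this input the two events last 2 and 3 time units, and an array with no value below the threshold should give an empty list.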
How can I efficiently detect this sort of sequence in numpy arrays?
Below is some Python code that illustrates my problem: the fourth dot takes a very long time to appear (if it does not, increase the size of the array).
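from pylab import *

threshold = 7
print('.')
d = 10*rand(10000000)
print('.')
b = d<threshold
print('.')
durations=[]
for i in range(len(b)):
    # a run starts: True preceded by False (or at index 0)
    if b[i] and (i==0 or not b[i-1]):
        counter=1
    # the run continues
    if i>0 and b[i-1] and b[i]:
        counter+=1
    # the run ends: True -> False transition, or the array ends while still True
    # (the i>0 guard avoids b[-1] wrapping around to the last element)
    if (i>0 and b[i-1] and not b[i]) or (i==len(b)-1 and b[i]):
        durations.append(counter)
print('.')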