tags:

views:

133

answers:

2

Let's say I have an array of zeros:

a = numpy.zeros(1000)

I then introduce some repetitive 'events':

a[range(0, 1000, 30)] = 1

Question is, how do I detect the 'signal' there? Because it's far from the ideal signal if I do the 'regular' FFT I don't get a clear indication of where my 'true' signal is:

f = abs(numpy.fft.rfft(a))

Is there a method to detect these repetitions with some degree of certainty? Especially if I have few of those mixed in, for example here:

a[range(0, 1000, 30)] = 1
a[range(0, 1000, 110)] = 1
a[range(0, 1000, 48)] = 1

I'd like to get three 'spikes' on the resulting data...

+3  A: 

Have you considered using autocorrelation?

Eli Bendersky
not really, good point, i'll investigate. thanks!
pulegium
A: 

As an analytical technique acf/pacf/ccf are used to identify periodicity in time-dependent signals, hence the correlogram, the graphical display of acf or pacf, displays self-similarity in a signal as a function of different lags. (So for instance, if you see values on the y axis peak at a lag of 12, and if your date is in months, that's evidence of annual periodicity.)

To calculate and plot 'similarity' versus lag, if you don't want to roll your own, i am not aware of a native Numpy/Scipy option; I also couldn't find one in the 'time series' scikit (one of the libraries in the Scipy 'Scikits', domain-specific modules not included in the standard Scipy distribution) but it's worth checking again. The other option is to install Python bindings to R (RPy2, available on SourceForge) which will allow you to access the relevant R functions, including 'acf' which will calculate and plot the correlogram just by passing in your time series and calling the function.

On the other hand, if you want to identify continuous (unbroken) streams of a given type in your signal, then "run-length encoding" is probably what you want:

import numpy as NP
signal = NP.array([3,3,3,3,3,3,3,3,3,0,0,0,0,0,0,0,0,0,0,7,7,7,7,7,4,4,1,1,1,1,1,1,1])
px, = NP.where(NP.ediff1d(signal) != 0)
px = NP.r_[(0, px+1, [len(signal)])]
# collect the run-lengths for each unique item in the signal
rx = [ (m, n, signal[m]) for (m, n) in zip(px[:-1], px[1:]) ]

# returns [(0, 9, 3), (9, 19, 0), (19, 24, 7), (24, 26, 4), (26, 33, 1)]
# so w/r/t first 3-tuple: '3' occurs continuously in the half-open interval 0 and 9, and so forth
doug
well, but i actually want to detect the periodicity. so if I have [0,0,0,1,0,0,0,1,0,0,0,1] and the sampling rate is 10Hz the result I want is 2.5Hz... similarly if the two 'signals' are there (like [0,0,0,1,1,0,0,1,0,1,0,1] where I added another signal at 5, 10, etc) I want to get 2.5Hz and 2Hz. Don't really need specific values just peaks on a histogram or something like that...
pulegium
ok, editing my answer in light of your clarification
doug