ansaurus

Question

Detecting periodic repetitions in the data stream

Answer 1

+3 A:

Have you considered using autocorrelation?

Eli Bendersky 2010-04-08 09:53:49

Not really. Good point, I'll investigate. Thanks!

pulegium 2010-04-08 10:00:56

Answer 2

A:

As analytical techniques, the ACF, PACF, and CCF (autocorrelation, partial autocorrelation, and cross-correlation functions) are used to identify periodicity in time-dependent signals. The correlogram, the graphical display of the ACF or PACF, shows self-similarity in a signal as a function of lag. (So for instance, if you see values on the y axis peak at a lag of 12, and your data is in months, that's evidence of annual periodicity.)

To calculate and plot 'similarity' versus lag, if you don't want to roll your own, I am not aware of a ready-made correlogram function in NumPy/SciPy. I also couldn't find one in the 'time series' scikit (one of the SciPy 'Scikits', domain-specific modules not included in the standard SciPy distribution), but it's worth checking again. The other option is to install Python bindings to R (RPy2, available on SourceForge), which give you access to the relevant R functions, including 'acf', which calculates and plots the correlogram when you pass in your time series.
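If you do roll your own, here is a minimal sketch of the calculation (no plotting), assuming the plain biased sample autocorrelation built on `numpy.correlate`. The helper name `autocorrelation`, the test signal, and the 10 Hz sample rate are illustrative assumptions, not part of any library's correlogram API.

```python
import numpy as np

def autocorrelation(x):
    """Biased sample autocorrelation for lags 0..len(x)-1, scaled so lag 0 is 1."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                        # remove the mean (DC offset)
    full = np.correlate(x, x, mode="full")  # covers lags -(n-1)..(n-1)
    acf = full[len(x) - 1:]                 # keep the non-negative lags
    return acf / acf[0]

# Hypothetical test data: one pulse every 4 samples, sampled at 10 Hz
sample_rate_hz = 10.0
signal = np.tile([0, 0, 0, 1], 6)

acf = autocorrelation(signal)
best_lag = int(np.argmax(acf[1:])) + 1      # strongest peak, skipping lag 0
frequency_hz = sample_rate_hz / best_lag
print(best_lag, frequency_hz)               # 4 2.5
```

Because this estimator is not rescaled by the number of overlapping samples, peaks at multiples of the period shrink as the lag grows, so the largest peak after lag 0 lands on the fundamental period. With several periodic components mixed together you would look for several peaks rather than a single argmax.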

On the other hand, if you want to identify continuous (unbroken) streams of a given type in your signal, then "run-length encoding" is probably what you want:

import numpy as NP
signal = NP.array([3,3,3,3,3,3,3,3,3,0,0,0,0,0,0,0,0,0,0,7,7,7,7,7,4,4,1,1,1,1,1,1,1])
px, = NP.where(NP.ediff1d(signal) != 0)
px = NP.r_[(0, px+1, [len(signal)])]
# collect the (start, stop, value) triple for each run in the signal
rx = [ (m, n, signal[m]) for (m, n) in zip(px[:-1], px[1:]) ]

# rx == [(0, 9, 3), (9, 19, 0), (19, 24, 7), (24, 26, 4), (26, 33, 1)]
# so w/r/t the first 3-tuple: '3' occurs continuously over the half-open interval [0, 9), and so forth

doug 2010-04-08 11:49:10

Well, but I actually want to detect the periodicity. So if I have [0,0,0,1,0,0,0,1,0,0,0,1] and the sampling rate is 10 Hz, the result I want is 2.5 Hz. Similarly, if two 'signals' are present (like [0,0,0,1,1,0,0,1,0,1,0,1], where I added another signal at 5, 10, etc.), I want to get 2.5 Hz and 2 Hz. I don't really need specific values, just peaks on a histogram or something like that...

pulegium 2010-04-08 12:13:37

ok, editing my answer in light of your clarification

doug 2010-04-08 12:55:36
