We use a data acquisition card to take readings from a device that increases its signal to a peak and then falls back to near the original value. To find the peak value we currently search the array for the highest reading and use its index to determine the timing of the peak, which is used in our calculations.

This works well if the highest value is the peak we are looking for, but if the device is not working correctly we can see a second peak, which can be higher than the initial peak. We take 10 readings a second from 16 devices over a 90 second period.

My initial thought is to cycle through the readings, checking whether the previous and next points are less than the current one to find a peak, and to construct an array of peaks. Maybe we should be looking at an average of a number of points either side of the current position to allow for noise in the system. Is this the best way to proceed, or are there better techniques?
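For illustration, here is a rough sketch of what I have in mind (Python/NumPy stand-ins just for illustration; the names are made up):

    import numpy as np

    def local_maxima(readings):
        """Return the indices where a reading is higher than both neighbours."""
        r = np.asarray(readings, dtype=float)
        return [i for i in range(1, len(r) - 1) if r[i - 1] < r[i] > r[i + 1]]

    # 10 readings a second over 90 seconds gives 900 samples per device
    # peaks = local_maxima(device_readings)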

+4  A: 

You could try signal averaging, i.e. for each point, average the value with the surrounding 3 or more points. If the noise blips are huge, then even this may not help.
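A rough sketch of that kind of averaging (Python/NumPy just for illustration; the 3-point half-window is simply the figure mentioned above):

    import numpy as np

    def smooth(readings, half_window=3):
        """Average each point with `half_window` points either side of it."""
        kernel = np.ones(2 * half_window + 1)
        kernel /= kernel.sum()
        return np.convolve(np.asarray(readings, dtype=float), kernel, mode="same")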

I realise that this question was language-agnostic, but guessing that you are using LabVIEW, there are lots of pre-packaged signal processing VIs that come with LabVIEW that you can use to do smoothing and noise reduction. The NI forums are a great place to get more specialised help on this sort of thing.

Brendan
A: 

You could apply some standard deviation to your logic and take notice of peaks over x%.
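Roughly, something like this (a Python/NumPy sketch; the three-sigma cut-off stands in for whatever x% you settle on):

    import numpy as np

    def peaks_over_threshold(readings, n_sigma=3.0):
        """Flag readings more than n_sigma standard deviations above the mean."""
        r = np.asarray(readings, dtype=float)
        return np.flatnonzero(r > r.mean() + n_sigma * r.std())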

GateKiller
A: 

I think you want to cross-correlate your signal with an expected, exemplar signal. But, it has been such a long time since I studied signal processing and even then I didn't take much notice.
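If you do go that way, a very rough sketch of the idea (Python/NumPy; the exemplar pulse is something you would have to supply, e.g. from a known-good run):

    import numpy as np

    def best_match_offset(signal, exemplar):
        """Cross-correlate the signal with a zero-mean, scaled exemplar and
        return the offset where the match is strongest."""
        s = np.asarray(signal, dtype=float)
        e = np.asarray(exemplar, dtype=float)
        e = (e - e.mean()) / (e.std() * len(e) + 1e-12)
        corr = np.correlate(s - s.mean(), e, mode="valid")
        return int(np.argmax(corr))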

Anthony Cramp
A: 

I don't know very much about instrumentation, so this might be totally impractical, but then again it might be a helpful different direction. If you know how the readings can fail, and there is a certain interval between peaks given such failures, why not do gradient descent at each interval. If the descent brings you back to an area you've searched before, you can abandon it. Depending upon the shape of the sampled surface, this also might help you find peaks faster than search.
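A rough sketch of that idea (plain Python; when you are after maxima, the "descent" on sampled data amounts to stepping towards the higher neighbour, and the climb interval below is entirely up to you):

    def hill_climb(readings, start):
        """Step towards the higher neighbour until neither side goes up."""
        i = start
        while True:
            left = readings[i - 1] if i > 0 else float("-inf")
            right = readings[i + 1] if i < len(readings) - 1 else float("-inf")
            if left > readings[i] and left >= right:
                i -= 1
            elif right > readings[i]:
                i += 1
            else:
                return i

    # start a climb every `interval` samples and keep the distinct end points
    # peaks = {hill_climb(readings, s) for s in range(0, len(readings), interval)}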

John the Statistician
A: 

We do use LabVIEW and I have checked the LAVA forums and there are a number of interesting examples. This is part of our test software and we are trying to avoid using too many non-standard VI libraries so I was hoping for feedback on the process/algorithms involved rather than specific code.

Swinders
+2  A: 

This problem has been studied in some detail.

There is a set of very up-to-date implementations in the TSpectrum* classes of ROOT (a nuclear/particle physics analysis tool). The code works on one- to three-dimensional data.

The ROOT source code is available, so you can grab this implementation if you want.

From the TSpectrum class documentation:

The algorithms used in this class have been published in the following references:

[1] M. Morhac et al.: Background elimination methods for multidimensional coincidence gamma-ray spectra. Nuclear Instruments and Methods in Physics Research A 401 (1997) 113-132.

[2] M. Morhac et al.: Efficient one- and two-dimensional Gold deconvolution and its application to gamma-ray spectra decomposition. Nuclear Instruments and Methods in Physics Research A 401 (1997) 385-408.

[3] M. Morhac et al.: Identification of peaks in multidimensional coincidence gamma-ray spectra. Nuclear Instruments and Methods in Physics Research A 443 (2000) 108-125.

The papers are linked from the class documentation for those of you who don't have a NIM online subscription.


The short version of what is done is that the histogram is flattened to eliminate noise, and then local maxima are detected by brute force in the flattened histogram.
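A much cruder sketch of those same two steps (Python/NumPy; a running-minimum baseline stands in for the Morhac background-elimination algorithms, which are far more sophisticated):

    import numpy as np

    def crude_peak_search(hist, baseline_window=20):
        """Subtract a crude running-minimum baseline, then take local maxima."""
        h = np.asarray(hist, dtype=float)
        baseline = np.array([h[max(0, i - baseline_window):i + baseline_window + 1].min()
                             for i in range(len(h))])
        flat = h - baseline
        return [i for i in range(1, len(flat) - 1)
                if flat[i - 1] < flat[i] > flat[i + 1]]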

dmckee
A: 

This method is basically from David Marr's book "Vision".

Gaussian-blur your signal with the expected width of your peaks. This gets rid of noise spikes while leaving your phase data undamaged.

Then edge-detect (a Laplacian of Gaussian, LoG, will do).

Your edges are then the edges of features (like peaks). Look between the edges for peaks, sort the peaks by size, and you're done.
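A rough sketch of that pipeline using SciPy's standard filters (Python; the peak width of 5 samples is a stand-in for whatever your real peaks look like, and valleys between crossings simply end up at the bottom of the sorted list):

    import numpy as np
    from scipy.ndimage import gaussian_filter1d, gaussian_laplace

    def marr_style_peaks(signal, peak_width=5.0):
        """Blur at the expected peak width, edge-detect with a LoG, then look
        between the zero crossings for local maxima, biggest first."""
        s = np.asarray(signal, dtype=float)
        blurred = gaussian_filter1d(s, sigma=peak_width)
        log = gaussian_laplace(s, sigma=peak_width)
        # zero crossings of the LoG mark the feature edges
        edges = np.flatnonzero(np.diff(np.sign(log)) != 0)
        peaks = [lo + int(np.argmax(blurred[lo:hi + 1]))
                 for lo, hi in zip(edges[:-1], edges[1:])]
        return sorted(peaks, key=lambda i: blurred[i], reverse=True)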

I have used variations on this and they work very well.

Tim Williscroft
A: 

Is there a qualitative difference between the desired peak and the unwanted second peak? If both peaks are "sharp" -- i.e. short in time duration -- then when looking at the signal in the frequency domain (by doing an FFT) you'll get energy in most bands. But if the "good" peak reliably has energy present at frequencies that are absent from the "bad" peak, or vice versa, you may be able to automatically differentiate them that way.
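A hedged sketch of that comparison (Python/NumPy; the 10 Hz sample rate comes from the question, while the 1 Hz band split is purely illustrative):

    import numpy as np

    def high_band_fraction(segment, sample_rate=10.0, split_hz=1.0):
        """Fraction of a segment's spectral energy above `split_hz`."""
        seg = np.asarray(segment, dtype=float)
        spectrum = np.abs(np.fft.rfft(seg - seg.mean())) ** 2
        freqs = np.fft.rfftfreq(len(seg), d=1.0 / sample_rate)
        total = spectrum.sum()
        return spectrum[freqs > split_hz].sum() / total if total else 0.0

    # compare this ratio for the window around each candidate peak; a "good"
    # and a "bad" peak may separate cleanly on it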

Adam Hollidge
+6  A: 

There are lots and lots of classic peak detection methods, any of which might work. You'll have to see what, in particular, bounds the quality of your data. Here are basic descriptions:

  1. Between any two points in your data, (x(0),y(0)) and (x(n),y(n)), add up |y(i+1)-y(i)| for 0 <= i < n and call this T ("travel"), and set R ("rise") to y(n)-y(0) + k for suitably small k. T/R > 1 indicates a peak. This works OK if the extra travel due to noise is small or if the noise distributes symmetrically around a base curve shape. For your application, accept the earliest peak with a score above a given threshold, or analyze the curve of travel-per-rise values for more interesting properties. (There is a small sketch of this after the list.)

  2. Use matched filters to score similarity to a standard peak shape (essentially, use a normalized dot-product against some shape to get a cosine metric of similarity).

  3. Deconvolve against a standard peak shape and check for high values (though I often find 2 to be less sensitive to noise for simple instrumentation output).

  4. Smooth the data and check for triplets of equally spaced points where, if x0 < x1 < x2, y1 > 0.5*(y0+y2), or check Euclidean distances like this: D((x0,y0),(x1,y1)) + D((x1,y1),(x2,y2)) > D((x0,y0),(x2,y2)), which relies on the triangle inequality. Using simple ratios will again provide you with a scoring mechanism.

  5. Fit a very simple 2-gaussian mixture model to your data (for example, Numerical Recipes has a nice ready-made chunk of code). Take the earlier peak. This will deal correctly with overlapping peaks.

  6. Find the best match in the data to a simple Gaussian, Cauchy, Poisson, or what-have-you curve. Evaluate this curve over a broad range and subtract it from a copy of the data after noting its peak location. Repeat. Take the earliest peak whose model parameters (stddev probably, but some applications might care about kurtosis or other features) meet some criterion. Best match might be determined by the kind of match scoring suggested in #2 above.
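To make point 1 concrete, a small sketch (Python/NumPy; k = 0.1 is an arbitrary choice for the small constant):

    import numpy as np

    def travel_over_rise(y, k=0.1):
        """Point 1's score: total up-and-down travel divided by the net rise."""
        y = np.asarray(y, dtype=float)
        travel = np.abs(np.diff(y)).sum()   # T
        rise = y[-1] - y[0] + k             # R, with k keeping it away from zero
        return travel / rise

    # a score well above 1 over a segment suggests the segment contains a peak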

I've done what you're doing before: finding peaks in DNA sequence data, finding peaks in derivatives estimated from measured curves, and finding peaks in histograms.

I encourage you to attend carefully to proper baselining. Wiener filtering or other filtering or simple histogram analysis is often an easy way to baseline in the presence of noise.

Finally, if your data are typically noisy and you're getting data off the card as unreferenced single-ended output (or even referenced, just not differential), and if you're averaging lots of observations into each data point, try sorting those observations, throwing away the first and last quartile, and averaging what remains. There are a host of such outlier elimination tactics that can be really useful.
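For the last point, a small sketch of that interquartile mean (Python/NumPy):

    import numpy as np

    def interquartile_mean(observations):
        """Sort, drop the lowest and highest quartile, and average what remains."""
        obs = np.sort(np.asarray(observations, dtype=float))
        q = len(obs) // 4
        return obs[q:len(obs) - q].mean()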

Thomas Kammeyer
Wow, thanks for all this info. I have been using a number of these tactics (poorly) to try and solve my peak detection problem, but I am going to take a better look at point 1. Thanks for this huge amount of data. John.
John Ballinger
This answer has much better language-agnostic information - in particular, I know the Numerical Recipes code that fits a Gaussian works well, as we use it here at work on FFT output.
Brendan