Have done an FFT (see earlier posting if you are interested!) and got a result, which helps me. Would like to analyse the noisiness / spikiness of an array (actually a VB.NET collection of Single). Um, how to explain ...

When the signal is good, the FFT power result is 512 data points (frequency buckets) with low values in all but maybe 2 or 3 array entries, and a decent range (i.e. the peak is high relative to the noise value in the nearly empty buckets). So when graphed, we have a nice big spike in the values in those few buckets.

When the signal is poor/noisy, the spread of data values (max to min) is low, and there's proportionally higher noise in many more buckets.

What's a good, computationally non-intensive way of analysing the noisiness of this data set? Would some kind of statistical method, standard deviations or something, help?

A: 

Calculate the signal-to-noise ratio: http://en.wikipedia.org/wiki/Signal-to-noise_ratio

You could also check the stdev for each point, and if it's under some level you choose then the signal is good, else it's not.

Mladen
A: 

Thing is, I think a set of data with a range of say 250 max, 0 min, with one point at 250, and the rest at between 0 and 5 - wouldn't the spike be treated as a noise glitch in SNR, an outlier to be discarded, as it were ?

WaveyDavey
+1  A: 

The key is defining what is noise and what is signal, for which modelling assumptions must be made. Often an assumption is made of white noise (constant power per frequency band) or noise of some other power spectrum, and that model is fitted to the data. The signal to noise ratio can then be used to measure the amount of noise.

Fitting a noise model depends on the nature of your data: if you know that the real signal will have no power in the high frequency components, you can look there for an indication of the noise level, and use the model to predict what the noise will be at the lower frequency components where there is both signal and noise. Alternatively, if your signal is constant in time, taking multiple FFTs at different points in time and comparing them to get a standard deviation for each frequency band can give the level of noise present.
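As a rough illustration of the first approach, here is a minimal Python sketch (not VB.NET) that estimates a flat noise floor from the upper frequency buckets and compares the largest bucket against it. It assumes white noise and assumes the real signal has no power in the bins from `noise_start` upwards; the function name, the `noise_start` parameter and its default of the upper half are all hypothetical choices, not something from the original posts.

```python
from statistics import median


def peak_to_noise_ratio(power, noise_start=None):
    """Ratio of the largest bucket to an estimated flat noise floor.

    power:       non-negative FFT power values, one per frequency bucket.
    noise_start: index of the first bucket assumed to hold noise only
                 (hypothetical parameter; defaults to the upper half).
    """
    if noise_start is None:
        noise_start = len(power) // 2
    noise_bins = power[noise_start:]
    # The median is less disturbed than the mean by a stray peak
    # that happens to fall inside the "noise only" region.
    noise_floor = median(noise_bins)
    if noise_floor <= 0:
        return float("inf")
    return max(power) / noise_floor
```

A clean sample should give a large ratio and a noisy one a ratio close to 1; the cut-off between "good" and "poor" would still have to be chosen by experiment.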

I hope I'm not patronising you to mention the issues inherent with windowing functions when performing FFTs: these can have the effect of introducing spurious "noise" into the frequency spectrum which is in fact an artifact of the periodic nature of the FFT. There's a tradeoff between getting sharp peaks and 'sideband' noise - more here www.ee.iitm.ac.in/~nitin/_media/ee462/fftwindows.pdf
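For illustration, a window is just a set of per-sample weights applied before the FFT. The sketch below uses the Hann window as one common choice (the post does not say which window to use, so that is an assumption), written in plain Python with hypothetical function names.

```python
import math


def hann_window(n):
    """Return the n coefficients of a symmetric Hann window."""
    if n <= 1:
        return [1.0] * n
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]


def apply_window(samples):
    """Multiply each time-domain sample by its window coefficient."""
    window = hann_window(len(samples))
    return [s * w for s, w in zip(samples, window)]
```

The windowed samples, rather than the raw ones, are then passed to the FFT. The taper to zero at both ends reduces the sideband leakage at the cost of slightly wider peaks.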

Chris Johnson
A: 

Chris, I know so little about this subject, it would be virtually impossible to patronise me! FFT's make my brain bleed, and I'm now just trying to get my program to decide whether it had a clean sample, or a poor one to be ignored. Add to this the fact I'm writing in an unfamiliar language vb.net, using unfamiliar components (directSound) and you have a real recipe for confusion!

WaveyDavey
A: 

Calculate a standard deviation and then you decide the threshold that will indicate noise. In practice this is usually easy and allows you to easily tweak the "noise level" as needed.

There is a nice single-pass stddev algorithm in Knuth. Here is a link that describes an implementation.

Standard Deviation
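A minimal sketch of a single-pass standard deviation, using the running-mean update usually attributed to Welford, is shown below in Python. This may differ in detail from the linked implementation, and the function name is hypothetical.

```python
import math


def running_stddev(values):
    """Return (mean, sample standard deviation) computed in one pass."""
    count = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared differences from the current mean
    for x in values:
        count += 1
        delta = x - mean
        mean += delta / count
        m2 += delta * (x - mean)  # uses the already-updated mean
    if count < 2:
        return mean, 0.0
    return mean, math.sqrt(m2 / (count - 1))
```

The threshold that separates "noisy" from "clean" is then compared against the returned standard deviation and tuned by hand.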

Kevin Gale
A: 

wouldn't the spike be treated as a noise glitch in SNR, an outlier to be discarded, as it were?

If it's clear from the time-domain data that there are such spikes, then they will certainly create a lot of noise in the frequency spectrum. Choosing to ignore them is a good idea, but unfortunately the FFT can't accept data with 'holes' in it where the spikes have been removed. There are two techniques to get around this. The 'dirty trick' method is to set the outlier sample to be the average of the two samples on either side, and compute the FFT with a full set of data.

The harder but more-correct method is to use a Lomb Normalised Periodogram (see the book 'Numerical Recipes' by W.H.Press et al.), which does a similar job to the FFT but can cope with missing data properly.
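The 'dirty trick' could be sketched like this in Python. How an outlier is detected is not specified above, so the simple absolute `threshold` used here is an assumption, as is the function name.

```python
def patch_outliers(samples, threshold):
    """Return a copy with spikes replaced by the mean of their two neighbours.

    threshold is a hypothetical, caller-chosen limit on abs(sample).
    The first and last samples are left untouched because they have
    only one neighbour.
    """
    patched = list(samples)
    for i in range(1, len(samples) - 1):
        if abs(samples[i]) > threshold:
            # Neighbours are read from the original data, so two adjacent
            # spikes would still contaminate each other.
            patched[i] = (samples[i - 1] + samples[i + 1]) / 2.0
    return patched
```

For example, `patch_outliers([1.0, 2.0, 100.0, 4.0, 5.0], 50.0)` replaces the 100.0 with 3.0 and leaves the other samples alone.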

Chris Johnson
No, it's the opposite - a few big spikes are a good thing! Spikes are high values in the FFT buckets for a given frequency. It's noisy data with lots of spikes that indicates a poor sample.
WaveyDavey
A: 

I think maybe it'd be better if I provided some graphical example of what I mean. Should I post links to the graphs of the fft results I'm getting ? This will show good samples, and bad samples, and illustrate what I'm trying to discard when I've done the fft of a sample.

WaveyDavey
Posting a link is a good idea. It's not really clear what you want to do with the results. Are you trying to denoise the data? Or are you just analysing signals? It might be easier to answer the question if we know what tradeoffs are acceptable.
Mendelt