I need to find the frequency of a sample, stored (in VB) as an array of bytes. The sample is a sine wave of known frequency (so I can check the result), but the numbers are a bit odd, and my maths-fu is weak. The full range of values is 0-255. 99% of the numbers are in the range 235 to 245, but the remaining 1% are outliers, down to 0 and 1 and up to 255. How do I normalise this to remove the outliers (calculating the 235-245 interval, as it may change with different samples), and how do I then calculate zero crossings to get the frequency? Apologies if this description is rubbish!
Use the Fourier transform. It is much less sensitive to noise than counting zero crossings.
Edit: @WaveyDavey
I found an F# library to do an FFT. From its page:
As it turns out, the best free implementation that I've found for F# users so far is still the fantastic FFTW library. Their site has a precompiled Windows DLL. I've written minimal bindings that allow thread-safe access to FFTW from F#, with both guru and simple interfaces. Performance is excellent, 32-bit Windows XP Pro is only up to 35% slower than 64-bit Linux.
Now I'm sure you can call an F# library from VB.NET, C#, etc.; that should be covered in their docs.
The standard method to attack this problem is to take one block of data, hopefully sampled at a rate of at least twice the actual frequency and spanning several periods (taking more data isn't bad, so it's good to overestimate a bit), then take the FFT and guess that the frequency corresponds to the largest magnitude in the resulting FFT spectrum.
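For anyone who wants to see the "largest number in the spectrum" idea in code, here is a minimal sketch in Python. It uses a plain (slow) discrete Fourier transform rather than a real FFT library, which gives the same spectrum but takes longer. The function name `dominant_frequency`, the 8000 Hz sample rate and the 440 Hz test tone are all made up for illustration.

```python
import cmath
import math

def dominant_frequency(samples, sample_rate):
    """Estimate the dominant frequency (Hz) from the largest DFT magnitude."""
    n = len(samples)
    mean = sum(samples) / n
    centred = [s - mean for s in samples]  # remove the constant offset so bin 0 cannot win
    best_bin, best_mag = 0, 0.0
    # For a real-valued signal only bins 1 .. n//2 carry independent information.
    for k in range(1, n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(centred))
        if abs(acc) > best_mag:
            best_bin, best_mag = k, abs(acc)
    # Bin k corresponds to k * sample_rate / n Hz.
    return best_bin * sample_rate / n

# Hypothetical test signal: a 440 Hz sine sitting around 240, like the byte data described.
rate = 8000
signal = [int(240 + 5 * math.sin(2 * math.pi * 440 * i / rate)) for i in range(400)]
estimate = dominant_frequency(signal, rate)
```

The answer can only be as fine as the bin spacing, `sample_rate / n` (20 Hz in this example), which is one reason taking a longer block helps.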
By the way, very similar questions have been asked here before; you could search for those answers as well.
But finding info on "how to do FFT as a noob in a cruddy language like vb for dummies 101" is quite hard! Some dive into abstruse math that leaves my (undergrad, 1982) math abilities floundering. Was hoping for a Howto for Noobs, to be honest.
The FFT is probably the best answer, but if you really want to do it by your method, try this:
To normalize, first make a histogram that counts the occurrences of each value from 0 to 255. Then throw out X percent of the values from each end with something like:
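// N = number of samples, X = percent to discard from each end
for (i = 0, lower = 0; i + count[lower] <= N * X / 100.0; lower++)
    i += count[lower];   // skip bins until the next one would exceed the budget
for (i = 0, upper = 255; i + count[upper] <= N * X / 100.0; upper--)
    i += count[upper];   // same thing, walking down from the top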
Now normalize with
A[i] = 255*(A[i]-lower)/(upper-lower)-128
Throw away results outside the -128..127 range.
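The histogram trimming and rescaling above can be sketched as follows. This is Python rather than VB, and the function names `trim_bounds` and `normalise` are invented for the example. It assumes the trim percentage is well below 50 and that the surviving range is wider than a single value (otherwise `upper - lower` would be zero).

```python
def trim_bounds(data, percent):
    """Return (lower, upper) after discarding about `percent` % of samples from each end."""
    count = [0] * 256
    for v in data:
        count[v] += 1
    budget = len(data) * percent / 100.0
    lower, seen = 0, 0
    while seen + count[lower] <= budget:   # skip low bins while they fit in the budget
        seen += count[lower]
        lower += 1
    upper, seen = 255, 0
    while seen + count[upper] <= budget:   # same from the top
        seen += count[upper]
        upper -= 1
    return lower, upper

def normalise(data, lower, upper):
    """Map lower..upper onto -128..127 and drop anything that lands outside."""
    out = []
    for v in data:
        scaled = 255 * (v - lower) / (upper - lower) - 128
        if -128 <= scaled <= 127:
            out.append(scaled)
    return out

# Hypothetical data: two outliers plus values spread over 235..245.
data = [0, 255] + [235 + (i % 11) for i in range(198)]
lower, upper = trim_bounds(data, 1)
clean = normalise(data, lower, upper)
```

Note that dropping samples shortens the buffer, so when converting a crossing count to a frequency it is safer to divide by the original number of samples.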
Now you can count zero crossings. To make sure you are not fooled by noise, you might want to keep track of the slope over the last several points, and only count crossings when the average slope is going the right way.
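One possible reading of the slope check, as a rough Python sketch: count only upward crossings, and accept one only if the signal has risen on average over the last few points. The function name `count_rising_crossings` and the default window of 4 samples are assumptions; the window needs to be much shorter than half a period.

```python
def count_rising_crossings(values, window=4):
    """Count negative-to-non-negative crossings whose recent average slope is upward."""
    crossings = 0
    for i in range(1, len(values)):
        if values[i - 1] < 0 <= values[i]:
            start = max(0, i - window)
            # Average slope over the last `window` steps (fewer near the start).
            avg_slope = (values[i] - values[start]) / (i - start)
            if avg_slope > 0:
                crossings += 1
    return crossings
```

A blip that pops above zero while the wave is actually falling has a negative average slope and is ignored. This will not catch every kind of noise, so treat it as a starting point.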
I googled for "basic fft" and found this: Visual Basic FFT. Your question screams FFT, but be careful: using an FFT without understanding even a little bit about DSP can lead to results that you don't understand or whose origin you can't explain.
If I understood your description correctly, what you have is a signal which is a combination of a sine plus a constant plus some random glitches. Say, like
x[n] = A*sin(f*n + phi) + B + N[n]
where N[n] is the "glitch" noise you want to get rid of.
If the glitches are one sample long, you can remove them with a median filter. The filter window has to extend beyond the glitch on both sides, so for glitches of length 1 a median over 3 samples is enough.
y[n] = median3(x[n])
The median is computed as follows: take the samples of x you want to filter (x[n-1], x[n], x[n+1]), sort them, and your output is the middle one.
Now that the noise is gone, get rid of the constant signal. I understand the buffer is of a limited and known length, so you can just compute the mean of the whole buffer and subtract it from every sample.
Now you have your pure sine signal. You can now compute the fundamental frequency by counting zero crossings. Count the number of samples above 0 whose preceding sample was below 0. The period (in samples) is the total number of samples in your buffer divided by this count, and the frequency is the reciprocal (1/x) of the period, in cycles per sample; multiply by the sample rate to get Hz.
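The mean removal and crossing count can be sketched like this in Python. The function name `frequency_from_crossings`, the 8000 Hz sample rate and the 440 Hz test tone are invented for the example, and the input is assumed to be glitch-free already. I count a sample that is exactly 0 as "above", so a crossing that lands exactly on zero is not missed.

```python
import math

def frequency_from_crossings(samples, sample_rate):
    """Estimate frequency (Hz) from upward zero crossings after removing the mean."""
    mean = sum(samples) / len(samples)
    centred = [s - mean for s in samples]          # get rid of the constant part
    rising = sum(1 for prev, cur in zip(centred, centred[1:]) if prev < 0 <= cur)
    if rising == 0:
        return 0.0                                 # no full cycle seen
    period = len(samples) / rising                 # samples per cycle
    return sample_rate / period                    # cycles per second

# Hypothetical test signal: 0.1 s of a 440 Hz sine, quantised like byte data.
rate = 8000
signal = [round(240 + 5 * math.sin(1.0 + 2 * math.pi * 440 * i / rate)) for i in range(800)]
estimate = frequency_from_crossings(signal, rate)
```

The estimate is only as good as the count: being off by one crossing in a buffer holding N cycles changes the result by roughly 1/N, so longer buffers give better answers.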
Man, I love you guys! I have a complaint, also: how long has stackoverflow.com been here, AND WHY WASN'T I TOLD ABOUT IT! This place is a programmers' mecca. Do you guys never sleep? It's like entering a small place in programmers' heaven. I'll set the accepted answer when I've worked through the above replies.
Although I would go with the majority and say that it seems like what you want is an FFT solution (the FFT algorithm is pretty quick), if the FFT is not the answer for whatever reason, you may want to try fitting a sine curve to the data using a fitting program and reading off the fitted frequency.
Using Fityk, you can load the data and fit it to a*sin(b*x-c)
where b/(2*pi)
will give you the frequency after fitting (2*pi/b is the period).
Fityk can be used from a GUI or from the command line for scripting, and it has a C++ API, so it could be included in your programs directly.
Get the Frequency Analyzer at http://www.relisoft.com/Freeware/index.htm, run it, and look at the code.