Hello,

Sorry to ask a question similar to the one I asked before (FFT Problem (Returns random results)), but I've looked up pitch detection and autocorrelation and have found some code for pitch detection using autocorrelation.

I'm trying to do pitch detection of a user's singing. The problem is that it keeps returning random results. I've got some code from http://code.google.com/p/yaalp/ which I've converted to C++ and modified (below). My sample rate is 2048, and data size is 1024. I'm detecting the pitch of both a sine wave and mic input. The frequency of the sine wave is 726.0, and it's detecting it as 722.950820 (which I'm OK with), but it's detecting the pitch of the mic as a random number from around 100 to around 1050.

I'm now using a high-pass filter to remove the DC offset, but it's not working. Am I doing it right, and if so, what else can I do to fix it? Any help would be greatly appreciated!

(Fixed)

Thanks,

Niall.

Edit: Changed the code to implement a high-pass filter with a cutoff of 30 Hz (from What Are High-Pass and Low-Pass Filters?; can anyone tell me how to convert the low-pass filter using convolution to a high-pass one?), but it's still returning random results. Plugging it into a VST host and using VST plugins to compare spectrums isn't an option for me, unfortunately.

Edit: Fixed. Thanks for everyone's help, but I never got it to work; I'm now using new code.

A: 

I don't see the problem in your code, but I'm no good at C. But I'd try the following to find the problem:

  • run it with data where the result is known, e.g. with sin(x) as input
  • run it with a small data size (e.g. 2)

Compare the results with known correct ones. You should be able to find those on the internet, or do them by hand.

If random means same input, different output, you most probably have a bug in the initialisation of variables. Use a debugger and known input to check that all variables, especially all elements of arrays, are properly initialized.
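
One way to run the known-input check is a small harness like the sketch below. It is not the asker's code: makeSine and detectPitchAutocorr are hypothetical names, the 44100 Hz sample rate is an assumption taken from the other answers, and the detector is a plain brute-force autocorrelation over integer lags.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: fill a buffer with a sine wave of known frequency.
std::vector<float> makeSine(double freqHz, double sampleRate, std::size_t n) {
    const double kTwoPi = 6.283185307179586;
    std::vector<float> out(n);
    for (std::size_t i = 0; i < n; ++i) {
        const double t = static_cast<double>(i) / sampleRate;
        out[i] = static_cast<float>(std::sin(kTwoPi * freqHz * t));
    }
    return out;
}

// Hypothetical detector: try every integer lag whose frequency lies in
// [minHz, maxHz] and return the frequency of the lag with the largest
// autocorrelation sum. Every lag is summed over the same number of
// samples so that the sums are directly comparable.
double detectPitchAutocorr(const std::vector<float>& x, double sampleRate,
                           double minHz, double maxHz) {
    const std::size_t minLag = static_cast<std::size_t>(sampleRate / maxHz);
    const std::size_t maxLag = static_cast<std::size_t>(sampleRate / minHz);
    assert(minLag >= 1 && minLag <= maxLag && maxLag < x.size());

    const std::size_t window = x.size() - maxLag;  // i + lag stays in range
    std::size_t bestLag = minLag;
    double bestSum = 0.0;
    for (std::size_t lag = minLag; lag <= maxLag; ++lag) {
        double sum = 0.0;
        for (std::size_t i = 0; i < window; ++i)
            sum += static_cast<double>(x[i]) * x[i + lag];
        if (lag == minLag || sum > bestSum) {
            bestSum = sum;
            bestLag = lag;
        }
    }
    return sampleRate / static_cast<double>(bestLag);
}
```

With a 726 Hz sine at the assumed 44100 Hz, detectPitchAutocorr(makeSine(726.0, 44100.0, 4096), 44100.0, 400.0, 1000.0) picks a lag of 61 samples and returns 44100/61, about 722.95. The search range stops short of a lag of two periods (about 121 samples), so a multiple of the period cannot be picked by mistake.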

Jens Schauder
Input from a sine wave gives a more or less accurate result, but input from the mic gives a random result from about 100 to about 1050. But I've checked that the data from the mic is correct.
Niall
+1  A: 

The problem is in your findBestCandidates() function:

Inside this function you access the 'inputs' array from 0 up to 'length - 1'. When you call this function inside the detectPitchCalculation() function, 'inputs' is 'results' and 'length' is 'nHiPeriodInSamples'. But 'results' is only allocated and filled up to 'nHiPeriodInSamples - nLowPeriodInSamples - 1'. So if 'nLowPeriodInSamples' is greater than 0, you access unallocated and random memory inside the findBestCandidates() function!

EDIT:

Another bug is that you fill only every 'nResolution'-th entry of the 'results' array in the detectPitchCalculation() function but access every entry in the findBestCandidates() function (via the 'inputs' argument). But since you call detectPitchCalculation() with 'nResolution=1', this does not explain your specific problem, so I will look a little bit more. It would definitely be a problem if you called it with higher resolutions.
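
The sizing rule can be sketched as follows. This is not the asker's code: bestPeriodInSamples and indexOfLargest are hypothetical stand-ins for detectPitchCalculation() and findBestCandidates(), and std::vector is used so that the length being scanned is always the allocated length.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the scan in findBestCandidates(): it only
// looks at elements that exist, because the length comes from the vector.
std::size_t indexOfLargest(const std::vector<double>& inputs) {
    assert(!inputs.empty());
    std::size_t best = 0;
    for (std::size_t i = 1; i < inputs.size(); ++i)
        if (inputs[i] > inputs[best]) best = i;
    return best;
}

// Hypothetical stand-in for detectPitchCalculation(): one result per
// candidate period in [nLowPeriodInSamples, nHiPeriodInSamples).
std::size_t bestPeriodInSamples(const std::vector<float>& samples,
                                std::size_t nLowPeriodInSamples,
                                std::size_t nHiPeriodInSamples) {
    assert(nLowPeriodInSamples >= 1);
    assert(nLowPeriodInSamples < nHiPeriodInSamples);

    // Exactly nHi - nLow entries; valid indexes are 0 .. nHi - nLow - 1.
    std::vector<double> results(nHiPeriodInSamples - nLowPeriodInSamples);
    for (std::size_t p = nLowPeriodInSamples; p < nHiPeriodInSamples; ++p) {
        double sum = 0.0;
        for (std::size_t i = 0; i + p < samples.size(); ++i)
            sum += static_cast<double>(samples[i]) * samples[i + p];
        results[p - nLowPeriodInSamples] = sum;  // shift by nLow on write
    }
    // Shift back by nLow when turning an index into a period.
    return nLowPeriodInSamples + indexOfLargest(results);
}
```

For an impulse train with one spike every 8 samples, bestPeriodInSamples(samples, 5, 12) scans 7 entries (periods 5 to 11) and returns 8.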

rstevens
I've changed nHiPeriodInSamples to nHiPeriodInSamples - nLowPeriodInSamples - 1, but it's still returning random values for mic input.
Niall
Leave out the '- 1' because the length is 'nHiPeriodInSamples - nLowPeriodInSamples', which means that you can access indexes from 0 up to 'nHiPeriodInSamples - nLowPeriodInSamples - 1'. But this won't solve your random problem; I will take another look at your program.
rstevens
+2  A: 

I am no sound expert, but if you are sampling at 44100 (I guess samples per second) and use 1024 data points, you are working with about 1/40th of a second's worth of data. It doesn't surprise me that the current pitch varies a lot, depending on which piece you pick. If you want to find the average or main pitch of a voice, I'd expect you to need about 1 second's worth of data.

Jens Schauder
So, would more samples per second give a more or less accurate result?
Niall
@Niall: it seems to me that @Jens would be correct in suggesting that you need a lot more samples than 1024. If I'm not mistaken, you received a similar indication from @avakar on your previous question: http://stackoverflow.com/questions/1351381/fft-problem-returns-random-results/1351398#1351398
Miky Dinescu
+1  A: 

At 44.1 kHz sampling frequency, 1024 samples is only a little bit over 23 ms worth of data. Isn't it possible that this is simply insufficient data in order to compute the pitch of a human singer?

I mean, a sound I can make that lasts for 23 ms is probably not something I have a lot of pitch control over; I would expect this kind of measurement to be done over slightly longer periods of time.
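
The arithmetic behind that figure can be sketched in a few lines. The function names are hypothetical, and the 100 Hz pitch in the note below is only an illustrative value taken from the low end of the range the asker reported.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Length of an analysis window in milliseconds: samples / (samples per second).
double windowMilliseconds(std::size_t sampleCount, double sampleRateHz) {
    return 1000.0 * static_cast<double>(sampleCount) / sampleRateHz;
}

// How many full cycles of a given pitch fit into the window.
double periodsInWindow(std::size_t sampleCount, double sampleRateHz,
                       double pitchHz) {
    return pitchHz * static_cast<double>(sampleCount) / sampleRateHz;
}
```

windowMilliseconds(1024, 44100.0) is about 23.2 ms, and periodsInWindow(1024, 44100.0, 100.0) is about 2.3, so a 100 Hz note completes only a little over two cycles in the buffer. Keeping the sample rate fixed and raising the sample count lengthens the window; raising the sample rate with the same sample count shortens it.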

unwind
So, would more samples per second give a more or less accurate result?
Niall
My dear, you should have at least a cursory understanding before trying to codify something! More samples -> longer time; more samples per second -> less time for the same number of samples. Fewer samples per second -> more time for a given number of samples.
gimpf
A: 

Can anyone please tell me what the input data (passed as the argument to the DetectPitch() function) is in the program above?

And can you please write the main() function? Without that, it can't be seen how the program works.

harsh
This is not an answer - it should be a comment or a separate question.
Paul R