As a follow-up to my previous question: I want my smartphone application to detect a certain musical note. I only need to know whether the incoming sound is that note or not, with a certain amount of fuzziness so that the note can be off-key by x cents.

Given that, is one method superior to the others for speed and accuracy? That is, knowing that the note you are looking for is, say, C#3, how best to tell whether that note is present or not? I'm assuming that looking for a single note would be easier than separating out all the waveforms and then looking through the results for the fundamental frequency.

In the responses to my original question, one respondent suggested that autocorrelation might work well if you know that the notes are within a certain range. I wonder if autocorrelation would then work even better, if you only have to check for the presence or absence of a certain note (+/- x cents).

The candidate methods are:

  • Kiss FFT
  • FFTW
  • Discrete Wavelet Transform (DWT)
  • autocorrelation
  • zero crossing analysis
  • octave-spaced filters

Any thoughts would be appreciated.

+1  A: 

As you describe it, you just need to determine if a particular pitch is present. A very simple (fast) detector would just record the equivalent of one period of the waveform, then record another period and correlate them, like an oversimplified (single-lag) autocorrelation. If there's a high match, you know the waveform being recorded is repeating at around the same period, or a harmonic of it.

For instance, to detect 1 kHz, record 1 ms of audio (48 samples at 48 kHz), then record another 1 ms, and compare them (correlate = multiply corresponding samples and sum the products). If they line up (correlation above some threshold), then you're listening to 1 kHz, 2 kHz, 3 kHz, or some other multiple. Comparing several periods instead of one would give you more confidence in the match.
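
The two-block comparison described above could be sketched roughly as follows in Python. The function names (`period_match`, `tone`), the normalization by block energy, and the synthetic sine test signals are my own assumptions for illustration, not part of the original suggestion; real microphone input would need a threshold chosen by experiment.

```python
import math

def period_match(samples, sample_rate, target_hz, periods=1):
    """Correlate one block of audio with the block that follows it.

    Each block spans `periods` periods of the target frequency. The result
    is normalized to [-1, 1]; values near 1 mean the signal repeats at the
    target period (or at a harmonic of the target frequency).
    """
    n = round(periods * sample_rate / target_hz)   # samples per block
    if len(samples) < 2 * n:
        raise ValueError("need at least two blocks of audio")
    a = samples[:n]
    b = samples[n:2 * n]
    dot = sum(x * y for x, y in zip(a, b))          # multiply and sum
    energy = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return dot / energy if energy else 0.0

def tone(freq_hz, sample_rate, count):
    """Hypothetical test signal: a pure sine wave."""
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate)
            for i in range(count)]

RATE = 48000
for freq in (1000, 2000, 1500):
    score = period_match(tone(freq, RATE, 96), RATE, 1000)
    print(freq, "Hz ->", round(score, 3))
```

With these synthetic tones, 1 kHz and its harmonic 2 kHz both score close to 1, while 1.5 kHz scores strongly negative because the second block is half a cycle out of step with the first. Raising `periods` compares longer blocks, which is the "several periods" idea mentioned above.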

A true autocorrelation would tell you which harmonic, specifically, if that's important to you.
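
As a rough illustration of that point, the sketch below computes a naive normalized autocorrelation over a range of lags and reports the shortest lag that gives a strong local peak. The helper names, the 0.9 threshold, and the "first strong peak" rule are assumptions made for this example; real recordings usually need a more careful peak picker.

```python
import math

def autocorrelation(samples, max_lag):
    """Naive normalized autocorrelation for lags 0..max_lag."""
    n = len(samples) - max_lag          # same window length for every lag
    if n <= 0:
        raise ValueError("not enough samples for max_lag")
    head = samples[:n]
    head_energy = sum(x * x for x in head)
    result = []
    for lag in range(max_lag + 1):
        shifted = samples[lag:lag + n]
        dot = sum(x * y for x, y in zip(head, shifted))
        energy = math.sqrt(head_energy * sum(y * y for y in shifted))
        result.append(dot / energy if energy else 0.0)
    return result

def fundamental_hz(samples, sample_rate, min_hz, max_hz, threshold=0.9):
    """Frequency of the shortest lag with a strong local peak, or None."""
    max_lag = int(sample_rate / min_hz) + 1
    min_lag = max(1, int(sample_rate / max_hz))
    r = autocorrelation(samples, max_lag)
    for lag in range(min_lag, max_lag):
        if r[lag] >= threshold and r[lag] >= r[lag - 1] and r[lag] >= r[lag + 1]:
            return sample_rate / lag
    return None

def tone(partials, sample_rate, count):
    """Hypothetical test signal: sum of (frequency, amplitude) sine partials."""
    return [sum(amp * math.sin(2 * math.pi * f * i / sample_rate)
                for f, amp in partials)
            for i in range(count)]

RATE = 48000
pure_2k = tone([(2000, 1.0)], RATE, 577)
rich_1k = tone([(1000, 1.0), (2000, 0.5)], RATE, 577)
print(fundamental_hz(pure_2k, RATE, 500, 4000))
print(fundamental_hz(rich_1k, RATE, 500, 4000))
```

The single-lag check at the 1 kHz period accepts both of these signals, whereas scanning the lags separates the pure 2 kHz tone (first strong peak at 24 samples) from the 1 kHz tone with a 2 kHz overtone (first strong peak at 48 samples).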

endolith
This sounds like a fast way to do it, but I would like to test for any of 50 or so notes over 3 or 4 octaves. Actually, I would like to have some level of "fuzziness" as set by the user, so that the notes could be off by some number of cents. Does that mean it might be better to just do an FFT and look at the resultant frequencies, rather than use autocorrelation?
mahboudz
Autocorrelation would be better, I think, since it matches the entire wave shape. With an FFT you need to identify which of the maxima corresponds to the fundamental frequency of the wave. For large autocorrelations (matching low frequencies), you can actually speed up the autocorrelation by doing it via the FFT. :) But I think for low numbers of samples, a "naive" implementation can be fast.
endolith
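The FFT shortcut mentioned in this comment can be sketched as follows: zero-pad the signal, take the FFT, square the magnitudes, and inverse-transform. The small pure-Python radix-2 FFT here is only a stand-in so the example is self-contained; it is an assumption of this sketch, and a real app would call an optimized library such as Kiss FFT or FFTW instead.

```python
import cmath
import math

def fft(values):
    """Recursive radix-2 FFT; len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2])
    odd = fft(values[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

def ifft(values):
    """Inverse FFT via the conjugate trick."""
    n = len(values)
    transformed = fft([v.conjugate() for v in values])
    return [v.conjugate() / n for v in transformed]

def autocorr_fft(samples):
    """Unnormalized linear autocorrelation for lags 0..len(samples)-1."""
    n = len(samples)
    size = 1
    while size < 2 * n:                  # pad so the circular result does not wrap
        size *= 2
    padded = list(samples) + [0.0] * (size - n)
    power = [abs(v) ** 2 for v in fft(padded)]
    return [v.real for v in ifft(power)][:n]

def autocorr_naive(samples):
    """Direct O(N^2) reference: sum of x[i] * x[i + lag] for each lag."""
    n = len(samples)
    return [sum(samples[i] * samples[i + lag] for i in range(n - lag))
            for lag in range(n)]

signal = [math.sin(2 * math.pi * 1000 * i / 48000) for i in range(200)]
fast = autocorr_fft(signal)
slow = autocorr_naive(signal)
print(max(abs(a - b) for a, b in zip(fast, slow)))   # tiny rounding error only
```

Both functions return the same values up to floating-point rounding; the FFT route costs on the order of N log N operations instead of N squared, which only starts to pay off for long windows.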
And the "fuzziness" is built-in. If you're looking for 100 Hz and the wave is 98 Hz, it will still match, just not as well.
endolith
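To make the "built-in fuzziness" concrete, the sketch below converts a detuning into cents and measures how the correlation at the target lag drops as a pure tone drifts away from the target. The helper names, the A4 = 440 Hz equal-temperament reference, the MIDI note numbering (C#3 = 49), and the synthetic sine input are assumptions of this example; the threshold that corresponds to a given tolerance in cents would have to be calibrated on real audio.

```python
import math

A4_HZ = 440.0   # assumed tuning reference (equal temperament)

def midi_to_hz(midi_note):
    """Frequency of a MIDI note number, with A4 = note 69."""
    return A4_HZ * 2.0 ** ((midi_note - 69) / 12.0)

def cents_off(freq_hz, target_hz):
    """Signed distance from target in cents (100 cents = one semitone)."""
    return 1200.0 * math.log2(freq_hz / target_hz)

def lag_score(samples, lag):
    """Normalized correlation of the signal with itself shifted by `lag`."""
    a = samples[:-lag]
    b = samples[lag:]
    dot = sum(x * y for x, y in zip(a, b))
    energy = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return dot / energy if energy else 0.0

def tone(freq_hz, sample_rate, count):
    """Hypothetical test signal: a pure sine wave."""
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate)
            for i in range(count)]

RATE = 48000
TARGET_HZ = 100.0
LAG = round(RATE / TARGET_HZ)            # one period of the target: 480 samples

print("C#3 is about", round(midi_to_hz(49), 2), "Hz")
for freq in (100.0, 98.0, 150.0):
    score = lag_score(tone(freq, RATE, 5 * LAG), LAG)
    print(freq, "Hz:", round(cents_off(freq, TARGET_HZ), 1), "cents,",
          "score", round(score, 3))
```

On these test tones the 98 Hz input (about 35 cents flat) still scores high but measurably below the exact 100 Hz match, while 150 Hz scores strongly negative. Using a lag of several target periods instead of one makes the score fall off faster with detuning, which is one way a user-set tolerance could be tightened.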