I want to detect not the pitch, but the pitch class of a sung note.
So, whether it is C4 or C5 is not important: they must both be detected as C.
Imagine the 12 semitones arranged on a clock face, with the needle pointing to the pitch class. That's what I'm after! Ideally, I would also like to be able to tell whether the sung note is spot-on or slightly off.
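In other words, once I have a frequency, the mapping I need is roughly the following. This is only a sketch, assuming equal temperament with A4 = 440 Hz; `pitch_class`, `NOTE_NAMES` and `a4_hz` are names I made up for illustration.

```python
import math

# Pitch classes in "clock face" order, starting from C.
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def pitch_class(freq_hz, a4_hz=440.0):
    """Return (note name, cents off) for a frequency, ignoring the octave.

    Assumes equal temperament tuned to a4_hz (an assumption, not a given).
    """
    # Fractional distance from A4 in semitones; 12 semitones per octave.
    semitones_from_a4 = 12.0 * math.log2(freq_hz / a4_hz)
    nearest = round(semitones_from_a4)
    # How far the sung note is from the nearest semitone, in cents.
    cents_off = 100.0 * (semitones_from_a4 - nearest)
    # A sits at index 9 when C is index 0; modulo 12 discards the octave.
    index = (nearest + 9) % 12
    return NOTE_NAMES[index], cents_off
```

With this, 261.63 Hz (C4) and 523.25 Hz (C5) should both come out as C, and the second return value is the "slightly off" reading: positive means sharp, negative means flat.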
This is not a duplicate of previously asked questions, as it introduces the constraints that:
- the sound source is a single human voice, hopefully with negligible background interference (although I may need to deal with this)
- the octave is not important, only the pitch class
I am contemplating first smoothing the microphone input signal, something like
ySmoothedNew = ySmoothedLast * 0.9 + newY * 0.1; ySmoothedLast = ySmoothedNew;
then counting zero crossings. Of course, I expect each period of the waveform to contain several crossings, but provided each period contains the same number of crossings, it shouldn't be that hard to figure out the periodicity.
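To make the idea concrete, here is a rough Python sketch of the smoothing and zero-crossing approach (not iPhone code). It makes the optimistic assumption that, after smoothing, there is exactly one upward crossing per period; if there were k upward crossings per period, it would report k times the true frequency. The names `smooth`, `estimate_frequency` and `alpha` are hypothetical, and a synthetic sine stands in for the microphone.

```python
import math

def smooth(samples, alpha=0.1):
    """One-pole low-pass: y[n] = (1 - alpha) * y[n-1] + alpha * x[n]."""
    out, last = [], 0.0
    for x in samples:
        last = last * (1.0 - alpha) + x * alpha
        out.append(last)
    return out

def estimate_frequency(samples, sample_rate):
    """Estimate frequency from the spacing of upward zero crossings.

    Assumes one upward crossing per period. Returns None if fewer than
    two crossings are found.
    """
    crossings = []
    for i in range(1, len(samples)):
        a, b = samples[i - 1], samples[i]
        if a < 0.0 <= b:
            # Linear interpolation gives a sub-sample crossing position.
            crossings.append(i - 1 + (-a) / (b - a))
    if len(crossings) < 2:
        return None
    # Average spacing (in samples) between consecutive upward crossings.
    period = (crossings[-1] - crossings[0]) / (len(crossings) - 1)
    return sample_rate / period

# Usage: a synthetic 220 Hz tone stands in for microphone input.
sample_rate = 44100
tone = [math.sin(2.0 * math.pi * 220.0 * n / sample_rate) for n in range(4410)]
freq = estimate_frequency(smooth(tone), sample_rate)
```

On a clean sine this should land very close to 220 Hz; a real voice, with its harmonics and noise, is exactly where I suspect the one-crossing-per-period assumption will break.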
But I feel sure I'm reinventing the wheel. Before I get sunk in a mass of floats, can anyone help steer me in a sensible direction?
PS: I would be very grateful if anyone could point me to some simple iPhone wrapper code that exposes the microphone byte stream.