Hello, I want to use either Sphinx4 or the HTK toolkit to build a speech recognition application that estimates a speaker's age from their voice. I understand, to a large extent, the statistical models involved in speech recognition. I am interested in Mel-frequency cepstral coefficients (MFCCs) and Gaussian mixture models (GMMs) because these two seem better suited to my problem domain. Do I have to use neural networks and feed in training data from the vectors derived from the Sphinx classifiers? I am not quite sure where to start with Sphinx or the HTK toolkit. I am new to Sphinx and speech recognition, and my application is only a prototype.
Can anyone please offer some guidance in this regard? Kind regards.