I want to build a speech recognition engine in Ruby. I know I'll never get there; I'm doing it just for fun. I need to get the frequency data of the sound stored in a WAV file so I can compare it with data I already have for the different sounds I want to recognize. I will write the code in Ruby, but I don't think there are any libraries for this written in Ruby, and they would probably be too slow if there were. The good thing about Ruby is that I'll be able to use libraries for .NET via IronRuby or Java via JRuby. How can I get the frequency data?
+2
A:
A WAV file is not too complicated; in essence it is just a header followed by a series of audio samples: http://www-mmsp.ece.mcgill.ca/Documents/AudioFormats/WAVE/WAVE.html
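To make the "series of samples" point concrete, here is a minimal sketch of pulling the samples out with plain Ruby. It assumes a 16-bit little-endian PCM file and rejects anything else; `read_wav_samples` is a name made up for this example, not part of any library.

```ruby
require "stringio"

# Hypothetical helper: parse the bytes of a WAV file into format info and samples.
# Assumption: 16-bit little-endian PCM. Other encodings raise an error.
def read_wav_samples(bytes)
  io = StringIO.new(bytes)
  io.binmode

  # RIFF header: "RIFF", total size, "WAVE"
  riff, _size, wave = io.read(12).to_s.unpack("a4Va4")
  raise ArgumentError, "not a RIFF/WAVE file" unless riff == "RIFF" && wave == "WAVE"

  format = nil
  samples = nil

  # The rest of the file is a sequence of chunks: 4-byte id, 4-byte size, body.
  until io.eof?
    header = io.read(8)
    break if header.nil? || header.bytesize < 8

    id, size = header.unpack("a4V")
    body = io.read(size) || ""
    io.read(1) if size.odd? # chunk bodies are padded to an even length

    case id
    when "fmt "
      audio_format, channels, sample_rate, _byte_rate, _block_align, bits =
        body.unpack("vvVVvv")
      format = {
        audio_format: audio_format,
        channels: channels,
        sample_rate: sample_rate,
        bits_per_sample: bits
      }
    when "data"
      # Signed 16-bit little-endian integers; channels are interleaved.
      samples = body.unpack("s<*")
    end
  end

  raise ArgumentError, "missing fmt or data chunk" unless format && samples
  unless format[:audio_format] == 1 && format[:bits_per_sample] == 16
    raise ArgumentError, "only 16-bit PCM is handled in this sketch"
  end

  { format: format, samples: samples }
end
```

Usage would look something like `wav = read_wav_samples(File.binread("sound.wav"))`, where `sound.wav` is a placeholder file name. For stereo files the samples come back interleaved (left, right, left, right, ...), so you would need to split or average the channels before analysing them.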
Once you can read the samples, the next step is to run them through an FFT (fast Fourier transform) to get the frequency content. There should be an open source implementation you can use, or you could implement one yourself.
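If you do want to implement one yourself, the following is a rough sketch of a recursive radix-2 Cooley-Tukey FFT in plain Ruby. The names `fft` and `spectrum` are invented for this example, the input length is assumed to be a power of two, and it is written for readability rather than speed.

```ruby
# Recursive radix-2 Cooley-Tukey FFT.
# Assumption: samples.length is a power of two (pad or truncate beforehand).
def fft(samples)
  n = samples.length
  return samples.map { |s| Complex(s) } if n <= 1
  raise ArgumentError, "length must be a power of two" unless (n & (n - 1)).zero?

  # Split into even-indexed and odd-indexed samples and transform each half.
  even_in, odd_in = samples.partition.with_index { |_, i| i.even? }
  even = fft(even_in)
  odd = fft(odd_in)

  half = n / 2
  result = Array.new(n)
  half.times do |k|
    # Twiddle factor e^(-2*pi*i*k/n) applied to the odd half.
    twiddle = Complex.polar(1.0, -2.0 * Math::PI * k / n) * odd[k]
    result[k] = even[k] + twiddle
    result[k + half] = even[k] - twiddle
  end
  result
end

# Pairs of [frequency in Hz, magnitude] for the bins below the Nyquist frequency.
def spectrum(samples, sample_rate)
  n = samples.length
  bins = fft(samples)
  (0...n / 2).map { |k| [k * sample_rate.to_f / n, bins[k].abs] }
end
```

As a sanity check, feeding `spectrum` 64 samples of a 1000 Hz sine wave sampled at 8000 Hz should put the largest magnitude in the 1000 Hz bin. For real speech you would normally run this over short overlapping windows of the signal rather than the whole file at once, and a native or JVM library will be much faster than this pure Ruby version.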
What you are trying to do requires some understanding of audio and the mathematics behind signal processing, so you may want to start with a book on the subject.
driis
2010-04-24 10:58:33
+1, especially for the last piece of advice
leonbloy
2010-04-25 02:26:23