I'm interested in learning about, and writing, a system that extracts features from audio files (MP3, WAV, etc.) so they can be used later for other purposes. Eventually I hope to use it to write software for music similarity.

Are there any libraries that can help? I know of libxtract, but haven't used it.

Also, are there any low-level C/C++ libraries that are good for dealing with audio streams? I have no experience in this area.

Thanks for the help,

Eric

A: 

First, read up on the FFT and digital signal processing. Next, get a textbook on speech recognition, since that field is built on exactly what you want to do: a speech recognition engine extracts "features" from audio in order to determine what is being spoken.

I've found that cepstral coefficients make great "features" in the machine-learning sense.
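
To make the idea concrete, here is a minimal sketch of the plain real cepstrum of one short audio frame, in pure Python. It assumes the frame is a list of real-valued samples, and the names `dft` and `real_cepstrum` are made up for illustration. It is not the mel-frequency variant (MFCC), which adds a mel filter bank and a DCT, and the naive O(N^2) transform stands in for a real FFT.

```python
import cmath
import math


def dft(x):
    """Naive O(N^2) discrete Fourier transform; only sensible for short frames."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def real_cepstrum(frame, floor=1e-12):
    """Real cepstrum: inverse DFT of the log magnitude spectrum.

    `floor` (an assumption of this sketch) keeps log() away from zero.
    """
    n = len(frame)
    log_mag = [math.log(max(abs(c), floor)) for c in dft(frame)]
    # For a real input frame the log magnitude spectrum is real and symmetric,
    # so its inverse DFT equals the forward DFT divided by n (real part only).
    return [c.real / n for c in dft(log_mag)]
```

In practice you would cut the signal into overlapping windowed frames and keep only the first dozen or so coefficients of each frame as the feature vector.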

dmazzoni
+2  A: 

Marsyas is a very complete framework that includes audio feature extraction.
It is written in C++ and offers a "patching" mechanism that lets you plug predefined components together.
The framework comes with several examples.
Take a look at the sources to learn how to create custom extractors.
The bextract command line tool that comes with Marsyas can extract:

  • MFCCs
  • Zero Crossing Rate
  • Spectral Centroid
  • ...
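
To give a feel for what two of these features measure, here is a rough pure-Python sketch of zero crossing rate and spectral centroid for a single frame. This is not how Marsyas or bextract compute them; the function names, the sign convention for zero, and the naive DFT are all assumptions made for illustration.

```python
import cmath
import math


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ (zero counts as positive)."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def spectral_centroid(frame, sample_rate):
    """Magnitude-weighted mean frequency in Hz, over bins 0..N/2."""
    n = len(frame)
    # Naive DFT magnitudes for the non-negative frequencies only.
    mags = [
        abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]
    total = sum(mags)
    if total == 0:
        return 0.0  # silent frame: centroid is undefined, return 0 by convention
    return sum(k * sample_rate / n * m for k, m in enumerate(mags)) / total
```

For example, a 500 Hz sine sampled at 8000 Hz and cut to 64 samples falls exactly on one frequency bin, so its centroid comes out at about 500 Hz.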

Marsyas supports several platforms, including Windows, Linux and Mac OS X. (I also saw an article mentioning that it works on the iPhone.)

weichsel