I want to get the timbre of some audio.

To do that I plan to implement the Mel-frequency cepstral coefficients (MFCC) algorithm.

The implementation looks simple (I have already done step 1):

1. Take the Fourier transform of (a windowed excerpt of) a signal.
2. Map the powers of the spectrum obtained above onto the mel scale, using triangular overlapping windows.
3. Take the logs of the powers at each of the mel frequencies.
4. Take the discrete cosine transform of the list of mel log powers, as if it were a signal.
5. The MFCCs are the amplitudes of the resulting spectrum.

In step 2 I know how to convert from frequency to the mel scale, but I don't know what "triangular overlapping windows" means.

How do I do this step correctly? What does triangular overlapping windows mean?

A: 

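Once you've done the conversion to the mel scale, apply a set of overlapping triangular filters spaced evenly along this scale (and therefore more closely spaced at the low frequencies when measured in Hz). Each filter is a triangular weighting that rises from zero at the centre of the previous filter to a peak at its own centre and falls back to zero at the centre of the next one; summing the weighted spectral powers gives one value per filter. That is, here you're going from the roughly continuous curve returned by the FFT to a set of 20-50 discrete values.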
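To make the filter shapes concrete, here is a minimal NumPy sketch of one way to build and apply such a filterbank. It is not taken from the linked PDFs; the function names, the 2595 * log10(1 + f/700) mel formula (one common variant among several), and the choices of 26 filters, a 512-point FFT, a 16 kHz sample rate and a Hann window are all assumptions for illustration.

```python
import numpy as np

def hz_to_mel(f_hz):
    # One common mel formula; other variants exist (assumption).
    return 2595.0 * np.log10(1.0 + np.asarray(f_hz, dtype=float) / 700.0)

def mel_to_hz(mel):
    # Inverse of hz_to_mel above.
    return 700.0 * (10.0 ** (np.asarray(mel, dtype=float) / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate, f_min=0.0, f_max=None):
    """Return an (n_filters, n_fft // 2 + 1) matrix of triangular weights."""
    if f_max is None:
        f_max = sample_rate / 2.0
    # n_filters + 2 points evenly spaced in mel: filter i uses points
    # i, i + 1, i + 2 as its left edge, peak and right edge, so each
    # triangle overlaps its neighbours by half.
    mel_points = np.linspace(hz_to_mel(f_min), hz_to_mel(f_max), n_filters + 2)
    hz_points = mel_to_hz(mel_points)
    # Frequency in Hz of each bin of a real-input FFT.
    bin_freqs = np.arange(n_fft // 2 + 1) * sample_rate / n_fft
    bank = np.zeros((n_filters, bin_freqs.size))
    for i in range(n_filters):
        left, centre, right = hz_points[i], hz_points[i + 1], hz_points[i + 2]
        rising = (bin_freqs - left) / (centre - left)
        falling = (right - bin_freqs) / (right - centre)
        # The triangle is the lower of the two slopes, clipped at zero.
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank

# Usage on one frame (a 440 Hz test tone stands in for real audio).
sample_rate, n_fft, n_filters = 16000, 512, 26
t = np.arange(n_fft) / sample_rate
frame = np.sin(2.0 * np.pi * 440.0 * t)
power = np.abs(np.fft.rfft(frame * np.hanning(n_fft))) ** 2  # step 1
bank = mel_filterbank(n_filters, n_fft, sample_rate)
mel_energies = bank @ power                                  # step 2
log_mel = np.log(mel_energies + 1e-10)                       # step 3
```

Each row of `bank` is one triangle, so `bank @ power` collapses the 257 FFT bins into 26 values. The small constant added before the log only guards against log(0); steps 4 and 5 would then apply a DCT to `log_mel`.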

I googled around for pictures of the filters and found a few (both in PDFs), here and here (p. 4). These also describe at some length other details of how the calculations are done.

tom10