I want to implement the mel-frequency cepstrum (MFCC) algorithm, but there are some things that I don't understand.

After the FFT is done, we need to "Map the powers of the spectrum obtained above onto the mel scale, using triangular overlapping windows."

I know how to calculate the triangles and I also know how to convert to the mel scale. I simply don't know what to do with them.

If the triangles are defined, how do I map the power of the spectrum obtained above onto the mel scale?

Is it like this:

1. Sum the frequencies inside the triangle and then convert the result to the mel scale?
2. Sum the frequencies inside the triangle according to a weight value (defined by the height of the triangle at that point) and then convert the result to the mel scale?
3. Convert all the frequencies inside the triangle to the mel scale according to the weight value?

Or something else?

Can anyone clarify this for me?

+1  A: 

I think this step of the process is a little weird and doesn't make complete sense (to me anyway). The centers of the filter bands are equally spaced along the mel scale, but the filters are triangles on the linear frequency scale, i.e. just like the figure here.
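A rough sketch of that layout in Python with NumPy, in case it helps. The conversion formula `2595 * log10(1 + f / 700)` is one common definition of the mel scale and is an assumption here, as are the function names and the parameters (20 filters, 512-point FFT, 16 kHz sample rate), which are purely illustrative:

```python
import numpy as np

def hz_to_mel(f):
    # One common mel formula (assumed here; other variants exist).
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    # Inverse of hz_to_mel above.
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def triangular_filterbank(n_filters, n_fft, sample_rate):
    # n_filters + 2 points equally spaced in mel: two outer edges plus the centers.
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    hz_points = mel_to_hz(mel_points)
    # Linear frequency (in Hz) of each bin of a real FFT of length n_fft.
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    bank = np.zeros((n_filters, freqs.size))
    for i in range(n_filters):
        left, center, right = hz_points[i], hz_points[i + 1], hz_points[i + 2]
        rising = (freqs - left) / (center - left)     # 0 at left edge, 1 at center
        falling = (right - freqs) / (right - center)  # 1 at center, 0 at right edge
        # The triangle is the lower of the two slopes, clipped at zero outside its support.
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank, hz_points

# Illustrative parameters only.
bank, hz_points = triangular_filterbank(n_filters=20, n_fft=512, sample_rate=16000)
```

The mel scale only enters through `hz_points`: the edges and centers are evenly spaced in mel, so they spread out toward higher frequencies in Hz, while each triangle itself is drawn on the linear frequency axis of the FFT bins.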

Then calculate the weighted sum using these triangles along the linear x-axis. (In this step, I think that some approaches normalize by the filter triangle's area and some don't. I'm honestly not sure about the final consequences, though I suspect it may not matter much except to modify the final interpretation, since the results are all relative comparisons anyway. One maintains total energy, and the other gives equally weighted contributions per band.) Then take the log of this (which converts the overall volume factor to an offset).

Edit: To be more clear on applying the filters... Each triangle represents a separate filter, producing a separate weighted sum. If there are twenty filters in your filter bank, there will be twenty triangles, and twenty weighted sums to calculate. To apply each filter, for each x-axis value multiply the filter value at that x-location by the function value at that x-location, and add this to the sum for that particular filter. Most x-axis values will have two filters present there, so each such x-location makes a contribution to two filters.
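Here is a minimal sketch of that procedure in Python with NumPy. The power values and the two small triangles are made-up toy data, and the function name `filterbank_energies` is just for illustration; a real filter bank would have more filters and many more FFT bins:

```python
import numpy as np

def filterbank_energies(power, filters):
    """Weighted sum of the power spectrum under each triangular filter."""
    energies = np.zeros(len(filters))
    for i, filt in enumerate(filters):      # one weighted sum per triangle
        for k, weight in enumerate(filt):   # walk along the linear x-axis
            energies[i] += weight * power[k]
    return energies

# Hypothetical toy data: 7 FFT bins and 2 overlapping triangles.
# Bin 3 lies under both triangles, so it contributes to both sums.
power = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0])
filters = np.array([
    [0.0, 0.5, 1.0, 0.5, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.5, 1.0, 0.5, 0.0],
])

energies = filterbank_energies(power, filters)  # one value per filter
same = filters @ power                          # same sums as a matrix product
log_energies = np.log(energies)                 # the log step

# Optional variant: divide each triangle by its area before summing.
normalized = filters / filters.sum(axis=1, keepdims=True)
area_normalized = normalized @ power

# Scaling the whole spectrum (overall volume) only shifts the log energies.
louder = np.log(filterbank_energies(10.0 * power, filters))
offset = louder - log_energies                  # every entry equals log(10)
```

The explicit double loop mirrors the description above; the matrix product `filters @ power` computes the same sums in one step and is what you would normally use once the idea is clear.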

tom10
I edited my answer to address your question, I hope. If it doesn't, please restate your question very carefully so I understand what's unclear.
tom10
Also, you can probably still accept my answer, even without 15 points. It's not like I'm dying to get the points here, it's honestly more to get you engaged, but see this... http://meta.stackoverflow.com/questions/8396/how-do-i-accept-an-answer-where-do-i-click But don't accept this for about a day anyway, so more people will see the question and maybe someone will have something illuminating to say.
tom10
Thanks again! Of course I accept your answer. You have been great to me. My final doubt is about the conversion from frequency to the mel scale. Is it done by applying the filters and taking the weighted sums? Or do I need to do something like this: http://en.wikipedia.org/wiki/Mel_scale anywhere?
aF
My understanding is that the mel scale is only used to determine the spacing of the filters in the filter bank. Once you have this spacing, the triangle filters and the weighted sums that they lead to are computed along the linear scale.
tom10
Yes, that is how I understand it now. And with more than 15 points I can vote. You earned it :P Thanks a lot!! :d
aF
Andre - I'm glad to help, and I hope you do something interesting with the MFCCs. Good luck with it.
tom10