Hey, I am currently developing an algorithm to decide whether a frame is voiced or unvoiced. I am trying to use the cepstrum to discriminate between these two situations. I use MATLAB for my implementation.

I am having trouble drawing any general conclusion about the frame. My current implementation looks like this (I'm aware that MATLAB has the function rceps, but that hasn't worked for me either):

ceps = abs(ifft(log10(abs(fft(frame.*window')).^2+eps)));

Can anybody give me a small demo that converts the frame to the power cepstrum, so that I get a single lollipop at the pitch frequency? For instance, use this code to generate the test signal:

fs = 8000;
timelength = 25e-3;
freq = 500;
k = 0:1/fs:timelength-(1/fs);
s = 0.8*sin(2*pi*freq*k);

Thanks.

A: 

According to Wikipedia, the power cepstrum is (deep breath) the magnitude squared of the Fourier transform of the log of the magnitude squared of the Fourier transform of the signal. So I think you're looking for

function c = ceps(frame, win)
    c = abs(fft(log10(abs(fft(frame.*win)).^2+eps))).^2;

Note that I changed one of your variable names because WINDOW is a predefined function in the Signal Processing Toolbox.

But ifft and fft only differ by a scale factor here, and the outer abs won't change the overall shape, so where's the lollipop, right? See further down on the Wikipedia page.
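The scale-factor claim can be checked numerically. This is only a minimal sketch, assuming an arbitrary random test frame (the frame and the variable names are illustrative, not from the question): for a real-valued log spectrum, abs(ifft(...)) should equal abs(fft(...)) divided by the frame length.

```matlab
% Sanity check: for a real input L, abs(fft(L)) and abs(ifft(L))
% differ only by the factor numel(L).
frame = randn(256, 1);                   % arbitrary test frame (assumption)
L = log10(abs(fft(frame)).^2 + eps);     % log power spectrum, real-valued
c_fft  = abs(fft(L));                    % "forward" version
c_ifft = abs(ifft(L));                   % "inverse" version
scaleErr = max(abs(c_fft - numel(L)*c_ifft));   % expected to be round-off sized
```

If scaleErr comes out at round-off level, the choice between fft and ifft only rescales the cepstrum and does not move any peak.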

A sinusoidal time input isn't going to give you an impulse in the cepstrum. The sine should yield an impulse in the spectrum, which will still be an impulse after the logmag operation, which will transform into a level shift in the cepstrum. To get something impulsive in the cepstrum, you need something periodic in the spectrum, which means you need something with multiple harmonic frequencies in the time domain. Consider, for instance, a square wave:

N = 1024;
h = hann(N, 'periodic');
f = 10;
x = sin(2*pi*f*((1:N)'-1)/N); % sine with f cycles per frame
s = 2*(x > 0) - 1; % square wave
cx = ceps(x, h);
cs = ceps(s, h);

cs will have your longed-for lollipop, not cx.

There seems to always be a large component in the 0th cepstral bin. I guess this is because the logarithm operation always makes the input to the second FFT have a big level shift? Also, I don't get the idea of quefrency, I would have expected the lollipop to be at N/f. So maybe there's still something wrong with this code, or (more likely) my understanding.
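One way to probe the quefrency question is an impulse train, which contains every harmonic of its fundamental. The following is a sketch under stated assumptions, not a definitive pitch detector: the period P, the missing window, and the search range qMin..qMax are all choices made for illustration. Since P divides N, the spectral lines are exactly N/P bins apart, so the cepstral peak would be expected at a quefrency of P samples, i.e. at N/f when f = N/P cycles fit in the frame.

```matlab
% Impulse train: all harmonics present, period P samples.
N = 1024;            % frame length
P = 64;              % pitch period in samples (assumed; divides N)
x = zeros(N, 1);
x(1:P:end) = 1;      % one impulse every P samples

% Power cepstrum computed as in ceps above, but without a window.
c = abs(fft(log10(abs(fft(x)).^2 + eps))).^2;

% Search only a plausible range of quefrencies, skipping bin 0,
% which holds N times the mean of the log spectrum.
qMin = 20; qMax = 100;                 % search range in samples (assumed)
[~, idx] = max(c(qMin+1 : qMax+1));    % c(q+1) is quefrency q
q = qMin + idx - 1;                    % peak quefrency in samples
```

The index q is a lag in samples, so q/fs is the pitch period in seconds and fs/q the pitch frequency. The large 0th bin is the sum of the log spectrum, which is why it dominates unless the mean is removed or the bin is skipped. A square wave has only odd harmonics, so its spectral lines sit 2f bins apart rather than f; that would put its first peak near N/(2f) instead of N/f, which may be why the lollipop in cs is not where expected.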

mtrw
Yes, I have seen Wikipedia, but most of the material I have found uses the ifft instead of the fft. But yes, this isn't the main point. It makes perfect sense that the spectrum has to be periodic if I want the peak, so thanks for the explanation. I will try to build my demo with a square wave instead. Thanks!
CziX
@CziX - Glad to have helped! I'm still curious about the x-axis position, if you see the flaw in my understanding please let me know.
mtrw