+2  Q: 

graphing amplitude

I was wondering if someone could point me to a good tutorial or show me how to graph the amplitude from a byte array. The audio format I am using is: U LAW 8000.0 Hz, 8 bit, mono, 1 bytes/frame.

A: 

Read about the Fourier transform. But it's only part of what you need to do to achieve what you want.

Roman
Poor answer - it doesn't really tell the guy *anything* about graphing amplitude.
Paul R
@Paul R: when I needed to do something similar I had to read an article of about 50 pages just to understand the principle. It's not an easy problem.
Roman
I know what you are saying; I have done research on that, and on the DFT and the FFT. It is a fun problem.
John
It doesn't look like he is interested in frequency domain amplitude information, so the FFT is somewhat irrelevant.
Paul R
A: 

It sounds like you are interested in a short term smoothed RMS amplitude measurement. Usually to do this you take a rectified version of the input signal, and then apply a low pass filter to this, e.g.

x1 = abs(x); // x1 = rectified input signal
x2 = k * x2 + (1 - k) * x1; // simple single pole low pass recursive filter

x2 is the smoothed amplitude of the signal x. k is a factor between 0 and 1.0 which determines the time constant of the smoothing filter: the closer k is to 1.0, the more slowly x2 responds to changes in the input.

You will then have some kind of threshold value which you use to decide whether you are in silence (x2 < threshold) or speech (x2 >= threshold).

Paul R
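For anyone who wants to try the filter above on a whole buffer, here is a minimal sketch in Python. It assumes the samples have already been converted to linear PCM integers; the function names `smoothed_amplitude` and `classify` and the default `k=0.99` are illustrative choices, not part of the answer.

```python
def smoothed_amplitude(samples, k=0.99):
    """Return one smoothed amplitude value per input sample.

    samples: iterable of linear PCM values (assumed already decoded).
    k: smoothing factor, 0 < k < 1 (hypothetical default).
    """
    envelope = []
    x2 = 0.0  # filter state, starts at silence
    for x in samples:
        x1 = abs(x)                    # rectified input sample
        x2 = k * x2 + (1.0 - k) * x1   # single-pole recursive low-pass
        envelope.append(x2)
    return envelope


def classify(envelope, threshold):
    """Label each envelope value as "speech" or "silence"."""
    return ["speech" if a >= threshold else "silence" for a in envelope]
```

The list returned by `smoothed_amplitude` has one value per sample, so it can be passed directly to whatever plotting library you use to draw the amplitude against sample index (or against time, by dividing the index by 8000).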
Yes, this is what I am looking for, thank you. When you say "x2 is the amplitude of the signal x", what is 'x'? (Sorry, I have been working on this for far too long.) Also, is there a good way to calculate what value k should have, or are there commonly used values?
John
x is the input value at the current sample time (you can consider your stream of input data to be an array x[] if that helps). Typically k will be between 0.9 and 0.99 but you will want to experiment with this and the threshold etc to get the behaviour you want in terms of how quickly you switch between "silence" and "speech", how many false positives/negatives you want, etc.
Paul R
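As a rough guide when experimenting with k: for a filter of the form x2 = k * x2 + (1 - k) * x1, the response to a step decays as k^n, so the time constant is about -1 / ln(k) samples. The sketch below converts between k and a time constant in seconds; the function names are made up for illustration, and the 8000 Hz default is taken from the format in the question.

```python
import math


def time_constant_seconds(k, sample_rate=8000.0):
    """Approximate time constant of the single-pole filter for a given k."""
    return -1.0 / (sample_rate * math.log(k))


def k_for_time_constant(tau_seconds, sample_rate=8000.0):
    """Inverse of the above: the k that gives the requested time constant."""
    return math.exp(-1.0 / (sample_rate * tau_seconds))
```

At 8000 Hz, k = 0.99 corresponds to a time constant of roughly 12 ms, so if you want smoothing over a longer window you may need k closer to 1.0.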
Once again, thank you, this helps a lot. Do I need to do anything differently because of the encoding?
John
@john: you'll need to convert your u-law samples to linear before processing, but this is pretty trivial to do.
Paul R
When you say linear, do you mean PCM? Also, do you know where I can find a good tutorial on converting from u-law to PCM? I have looked, but I have not found anything explaining how to do this.
John
Yes, you just need to convert the 8 bit µ-law samples to 16 bit (linear) signed integers. If you look at the Wikipedia entry for µ-law: <http://en.wikipedia.org/wiki/Μ-law_algorithm> and scroll to the bottom, the last link takes you to example C code for µ-law coding/decoding: <http://hazelware.luggle.com/tutorials/mulawcompression.html>
Paul R
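For reference, the decoding step can be sketched in a few lines. This follows the commonly used G.711 µ-law expansion (invert the bits, split into sign, 3-bit exponent and 4-bit mantissa, then apply a bias of 0x84); it is not copied from the linked tutorial, and the function name `ulaw_to_linear` is just an illustrative choice.

```python
def ulaw_to_linear(u):
    """Decode one 8-bit mu-law byte to a signed 16-bit linear sample.

    u may be given as 0..255 or as a signed byte; it is masked to 8 bits.
    """
    u = ~u & 0xFF                 # mu-law bytes are stored bit-inverted
    sign = u & 0x80               # top bit: 1 means negative
    exponent = (u >> 4) & 0x07    # 3-bit segment number
    mantissa = u & 0x0F           # 4-bit step within the segment
    magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -magnitude if sign else magnitude


def decode_buffer(data):
    """Decode a bytes-like object of mu-law samples to a list of ints."""
    return [ulaw_to_linear(b) for b in data]
```

With 1 byte per frame and mono audio, each byte of the array is one sample, so `decode_buffer` gives the linear values to feed into the rectify-and-smooth step.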
You are a lifesaver. Thank you so much. That is exactly what I needed.
John