I have a program that plots the spectrum analysis (Amp/Freq) of a signal, which is pretty much the DFT converted to polar. However, this is not exactly the sort of graph that, say, Winamp (right at the top-left corner), or effectively any other audio software plots. I am not really sure what this sort of graph is called (if it has a distinct name at all), so I am not sure what to look for.

I am pretty sure that the frequency axis is base-two exponential; the amplitude axis puzzles me, though.

Any pointers?

A: 

Well, I'm not 100% sure what you mean, but surely it's just bucketing the data from an FFT?

If you want the data such that (for a 44 kHz file) you have frequency points at 22 kHz, 11 kHz, 5.5 kHz, etc., then you could use a wavelet decomposition, I guess ...

This thread may help ya a bit ...

http://stackoverflow.com/questions/1679974/converting-an-fft-to-a-spectogram

Same sort of information as a spectrogram I'd guess ...

Goz
A: 

To generate a power spectrum you need to do the following steps:

  • apply a window function to the time-domain data (e.g. a Hann window)
  • compute FFT
  • calculate log of FFT bin magnitudes for N/2 points of FFT (typically 10 * log10(re * re + im * im))

This gives log magnitude (i.e. dB) versus linear frequency.
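The three steps above can be sketched roughly as follows. This is only an illustration, not production code: the function names (`hann`, `dft`, `power_spectrum_db`) and the `floor` parameter that guards against log10(0) are my own assumptions, and the naive O(N^2) DFT stands in for whatever FFT library you actually use.

```python
import cmath
import math

def hann(n_samples):
    # Symmetric Hann window: 0.5 * (1 - cos(2*pi*n / (N - 1)))
    return [0.5 * (1.0 - math.cos(2.0 * math.pi * n / (n_samples - 1)))
            for n in range(n_samples)]

def dft(x):
    # Naive O(N^2) DFT, used here only to keep the sketch self-contained.
    n_samples = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / n_samples)
                for n in range(n_samples))
            for k in range(n_samples)]

def power_spectrum_db(samples, floor=1e-12):
    n_samples = len(samples)
    # Step 1: window the time-domain data
    windowed = [s * w for s, w in zip(samples, hann(n_samples))]
    # Step 2: transform, keeping the first N/2 bins
    bins = dft(windowed)[: n_samples // 2]
    # Step 3: 10 * log10(re*re + im*im); floor (an assumption) avoids log10(0)
    return [10.0 * math.log10(max(b.real * b.real + b.imag * b.imag, floor))
            for b in bins]

# Example: a sinusoid that sits exactly on bin 8 of a 64-point frame
N = 64
tone = [math.sin(2.0 * math.pi * 8 * n / N) for n in range(N)]
spectrum_db = power_spectrum_db(tone)
peak_bin = max(range(len(spectrum_db)), key=spectrum_db.__getitem__)
```

Bin k corresponds to the frequency k * sample_rate / N, so the x axis here is still linear in frequency.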

If you also want a log frequency scale then you will need to accumulate the magnitude from appropriate ranges of bins (and you will need a fairly large FFT to start with).
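A minimal sketch of that accumulation, assuming octave-wide bands and that linear power (re*re + im*im) is summed per band before converting to dB. The function name `octave_band_db`, the `lowest_hz` starting edge, and the `floor` guard are hypothetical choices for illustration; other band widths (e.g. third-octave) would work the same way.

```python
import math

def octave_band_db(power, sample_rate, fft_size, lowest_hz, floor=1e-12):
    """power[k] is the linear power of bin k, for k = 0 .. fft_size/2 - 1."""
    bin_hz = sample_rate / fft_size      # frequency spacing between bins
    nyquist = sample_rate / 2.0
    bands = []
    lo = lowest_hz
    while lo < nyquist:
        hi = min(lo * 2.0, nyquist)      # each band spans one octave
        # Accumulate the power of every bin whose frequency falls in [lo, hi)
        total = sum(p for k, p in enumerate(power) if lo <= k * bin_hz < hi)
        bands.append((lo, hi, 10.0 * math.log10(max(total, floor))))
        lo *= 2.0
    return bands

# Example: a flat spectrum from a 1024-point FFT at 44100 Hz
flat = [1.0] * 512
bands = octave_band_db(flat, 44100, 1024, lowest_hz=50.0)
```

With these numbers the 50-100 Hz band contains a single bin and the 100-200 Hz band only two, which is why a fairly large FFT is needed for the low end of a log frequency axis.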

Paul R
+1  A: 

Actually an interesting question. I know what you are saying; the frequency axis is certainly logarithmic. But what about the amplitude? In response to another poster, the amplitude can't simply be in units of dB alone, because dB has no concept of zero: an amplitude of zero has no finite dB value. This introduces the idea of quantization error, SNR, and dynamic range.

Assume that the received digitized (i.e., discrete time and discrete amplitude) time-domain signal, x[n], is equal to s[n] + e[n], where s[n] is the transmitted discrete-time signal (i.e., continuous amplitude) and e[n] is the quantization error. Suppose x[n] is represented with b bits, and for simplicity, takes values in [0,1). Then the maximum peak-to-peak amplitude of e[n] is one quantization level, i.e., 2^{-b}.

The dynamic range is then defined, in decibels, as 20 log10( (max peak-to-peak |s[n]|) / (max peak-to-peak |e[n]|) ) = 20 log10( 1/2^{-b} ) = 20b log10 2 ≈ 6.02b dB. For 16-bit audio, the dynamic range is about 96 dB. For 8-bit audio, it is about 48 dB.
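As a quick check of that arithmetic, here is a small sketch; the helper name `dynamic_range_db` is just something I made up for the example.

```python
import math

def dynamic_range_db(bits):
    # 20 * log10(1 / 2^-b) = 20 * b * log10(2), roughly 6.02 dB per bit
    return 20.0 * math.log10(2.0 ** bits)

dr16 = dynamic_range_db(16)   # 16-bit audio
dr8 = dynamic_range_db(8)     # 8-bit audio
print(round(dr16, 2), round(dr8, 2))
```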

So how might Winamp plot amplitude? My guesses:

  1. The minimum amplitude is assumed to be -6.02b dB, and the maximum amplitude is 0 dB. Visually, Winamp draws the window with these thresholds in mind.

  2. Another nonlinear map, such as log(1+X), is used. This function is always nonnegative, and when X is large, it approximates log(X).
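To make the two guesses concrete, here is a rough sketch of each as a map from a level to a bar height in [0, 1]. Both functions (`db_to_height`, `log1p_height`) and their parameters are hypothetical; I have no idea whether Winamp does either of these.

```python
import math

def db_to_height(level_db, bits=16):
    # Guess 1: clamp to [-6.02*b dB, 0 dB], then scale linearly to [0, 1]
    floor_db = -20.0 * math.log10(2.0 ** bits)
    clamped = min(0.0, max(floor_db, level_db))
    return (clamped - floor_db) / (0.0 - floor_db)

def log1p_height(x, x_max):
    # Guess 2: log(1 + X), normalized by its value at an assumed maximum x_max
    return math.log1p(x) / math.log1p(x_max)

full_scale = db_to_height(0.0)             # top of the display
silence = db_to_height(-200.0)             # below the floor, clamped to 0
mid_level = db_to_height(-48.0)            # about halfway up for 16-bit audio
log_zero = log1p_height(0.0, 100.0)        # zero input maps to zero height
```

The practical difference is at the bottom of the scale: the dB map needs an explicit floor, while log(1+X) reaches zero on its own.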

Any other experts out there who know? Let me know what you think. I'm interested, too, exactly how this is implemented.

Steve
A: 

What you need is a power spectrum graph. You have to compute the DFT of your signal's current window, then take the squared magnitude of each value.

psihodelia