So, I've been working on a little visualizer for sound files, just for fun. I basically wanted to imitate the "Scope" and "Ocean Mist" visualizers in Windows Media Player. Scope was easy enough, but I'm having problems with Ocean Mist. I'm pretty sure that it is some kind of frequency spectrum, but when I do an FFT on my waveform data, the result doesn't correspond to what Ocean Mist displays. The spectrum itself looks correct, so I don't think anything is wrong with the FFT. I'm assuming that the visualizer runs the spectrum through some kind of filter, but I have no idea what it might be. Any ideas?

EDIT2: I posted an edited version of my code here. By edited, I mean that I removed all the experimental comments everywhere, and left only the active code. I also added some descriptive comments. The visualizer now looks like this.

EDIT: Here are images. The first is my visualizer, and the second is Ocean Mist.

[Images: my visualizer; Ocean Mist]

+5  A: 

Here's some Octave code that shows what I think should happen. I hope the syntax is self-explanatory:

%# First generate some test data
%# make a time domain waveform of sin + low level noise
N = 1024;
x = sin(2*pi*200.5*((0:1:(N-1))')/N) + 0.01*randn(N,1);

%# Now do the processing the way the visualizer should
%# first apply Hann window = 0.5*(1 - cos(2*pi*n/N))
xw = x.*hann(N, 'periodic');
%# Calculate FFT.  Octave returns double sided spectrum
Sw = fft(xw);
%# Calculate the magnitude of the first half of the spectrum
Sw = abs(Sw(1:(1+N/2))); %# abs is sqrt(real^2 + imag^2)

%# For comparison, also calculate the unwindowed spectrum
Sx = fft(x); %# semicolon suppresses printing the full result
Sx = abs(Sx(1:(1+N/2)));

subplot(2,1,1);
plot([Sx Sw]); %# linear axes, blue is unwindowed version
subplot(2,1,2);
loglog([Sx Sw]); %# both axes logarithmic

which results in the following graph: top: regular spectral plot, bottom: loglog spectral plot (blue is unwindowed)

I'm letting Octave handle the scaling from linear to log x and y axes. Do you get something similar for a simple waveform like a sine wave?

OLD ANSWER

I'm not familiar with the visualizer you mention, but in general:

  • Spectra are often displayed using a log y-axis (or colormap for spectrograms).
  • Your FFT might be returning a double-sided spectrum, but you probably want to use only the first half (it looks like you're already doing this).
  • Applying a window function to your time data reduces spectral leakage, so the peaks stand out more cleanly from the surrounding bins (it looks like you're doing this too).
  • You might need to divide by the transform blocksize if you're concerned with absolute magnitudes (I guess not important in your case).
  • It looks like the Ocean Mist visualizer is using a log x-axis too. It might also be smoothing adjacent frequency bins in sets or something.
mtrw
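The last bullet's guess about smoothing adjacent bins could be sketched roughly as below. This is only an illustration in Python, assuming the smoothing is a plain average over logarithmically spaced groups of bins; nothing here is confirmed to be what Ocean Mist does. The names `log_band_edges`, `average_bands`, `n_bands` and `first_bin` are made up for the example.

```python
def log_band_edges(n_bins, n_bands, first_bin=1):
    """Bin indices that split [first_bin, n_bins) into log-spaced bands."""
    ratio = n_bins / first_bin
    edges = []
    for i in range(n_bands + 1):
        edge = int(round(first_bin * ratio ** (i / n_bands)))
        # Low bands can round to the same index; force at least one bin each.
        if edges and edge <= edges[-1]:
            edge = edges[-1] + 1
        edges.append(edge)
    return edges


def average_bands(magnitudes, n_bands, first_bin=1):
    """Average single-sided magnitudes over log-spaced groups of bins.

    Bin 0 (DC) is skipped by default because it has no place on a log axis.
    """
    edges = log_band_edges(len(magnitudes), n_bands, first_bin)
    bands = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        chunk = magnitudes[lo:hi]
        # Guard against an empty slice when more bands than bins are requested.
        bands.append(sum(chunk) / len(chunk) if chunk else 0.0)
    return bands
```

With a 1024-point transform there are 513 single-sided bins, so `average_bands(mags, 16)` would give 16 values, one per bar. Each bar covers more bins as frequency rises, which is what a log x-axis does to evenly spaced FFT bins.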
I assume you mean log y-axis there, or is there a distinction? How would I implement it?
Bevin
+1 for noting that both the x and y axes are logarithmic. The log-x aspect explains why the first narrow peak in the top plot is stretched to about 1/3 of the view in the lower plot. The log-y scaling explains why the variation between the peaks and the average values is compressed in the lower plot.
the_mandrill
@Bevin - Both axes are logarithmic. I usually use Octave (a Matlab clone) for graphing, so I have to confess I'm not that good at mapping data to pixels myself. If you have a plotting library, look for `loglog` plotting (see http://en.wikipedia.org/wiki/Logarithmic_scale#Log-log_plots). If you're doing it yourself, make the display height proportional to log(spectrum amplitude), as @Paul R suggested. Then make display width proportional to log(freq/FMin), where FMin is the lowest frequency you want to display. I suggest 20 Hz to start with, but a higher number might look better.
mtrw
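For anyone mapping to pixels by hand, the log-x rule in the comment above (width proportional to log(freq/FMin)) could look something like this in Python. It is a minimal sketch: the function names, the 800-pixel width and the 22050 Hz upper limit are assumptions for illustration, and only the 20 Hz starting point comes from the comment.

```python
import math


def bin_to_freq(k, n_fft, sample_rate):
    """Centre frequency in Hz of FFT bin k for an n_fft-point transform."""
    return k * sample_rate / n_fft


def freq_to_x(freq_hz, f_min=20.0, f_max=22050.0, width_px=800):
    """Map a frequency to a horizontal pixel index on a logarithmic axis."""
    # Clamp first: bin 0 is 0 Hz, and log(0) is undefined.
    f = min(max(freq_hz, f_min), f_max)
    # Position is proportional to log(f / f_min), normalised to [0, 1].
    t = math.log(f / f_min) / math.log(f_max / f_min)
    return int(round(t * (width_px - 1)))
```

On this axis equal frequency ratios take equal widths, so 100 Hz to 1 kHz spans as many pixels as 1 kHz to 10 kHz. That is why the low end of the spectrum looks stretched compared with a linear plot.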
@mtrw - Well, I (think I) implemented what you said, and it ended up like this: http://i41.tinypic.com/28jslj.jpg Not really what I expected. I might have screwed up though.
Bevin
@Bevin - that definitely doesn't look right. Give me a few minutes, I'll make some graphs of what I think should happen.
mtrw
Well, there's a clear difference between your graph and mine. Perhaps I can post my code and you can take a look? Not the FFT or anything, just the code that does the actual calculations and plotting.
Bevin
@Bevin, sure go ahead. I'm going to be off-line for a couple of hours, but if you don't mind the delay I'd be happy to take a look, or maybe someone else will spot the issue.
mtrw
Well, I posted it. The link is in the post at the top.
Bevin
+2  A: 

Normally for this kind of thing you want to convert your FFT output to a power spectrum, usually with a log (dB) amplitude scale, e.g. for a given output bin:

p = 10.0 * log10 (re * re + im * im);

Paul R
Do I have to normalize this "p"? Like, dividing it by n/2 afterward?
Bevin
It's a dB value - you can add or subtract a suitable dB offset to get it into whatever range you want. You can then convert this dB value to screen coordinates or pixel intensity or whatever is appropriate for your visualizer.
Paul R
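The dB-to-screen step described above might be sketched like this in Python. The 60 dB display range, the 200-pixel height and the function names are assumptions picked for the example; the right floor and ceiling depend on how the FFT output is scaled, so treat them as knobs to adjust.

```python
import math


def power_db(re, im, eps=1e-12):
    """Power of one FFT bin in dB; eps avoids log10(0) for silent bins."""
    return 10.0 * math.log10(re * re + im * im + eps)


def db_to_height(p_db, floor_db=-60.0, ceil_db=0.0, height_px=200):
    """Clamp a dB value to [floor_db, ceil_db] and scale it to a bar height."""
    clamped = min(max(p_db, floor_db), ceil_db)
    fraction = (clamped - floor_db) / (ceil_db - floor_db)
    return int(round(fraction * height_px))
```

Shifting `floor_db` and `ceil_db` together is the "add or subtract a suitable dB offset" part. Changing the gap between them changes how much dynamic range fits on screen.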
Well, I tried using your formula, and it came across as kind of noisy. Here, take a look: http://i39.tinypic.com/15eig3s.jpg
Bevin
In order to test your implementation you want to start with a simple signal with a known spectrum. Start with e.g. a single pure tone (sine wave) at say 1 kHz and see what that looks like - you should just get a single large peak. If not then you're doing something wrong with your FFT and/or plotting code.
Paul R
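A quick way to run the pure-tone check suggested above, written as a Python sketch. It uses a slow, naive DFT so that it needs no FFT library; in a real test you would call your own FFT instead. The function names are hypothetical, and the 8 kHz sample rate and 256-sample block in the note below are arbitrary choices that make 1 kHz land exactly on a bin.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Magnitudes of bins 0..N/2 from a naive O(N^2) DFT (test use only)."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(s * cmath.exp(-2j * math.pi * k * i / n)
                  for i, s in enumerate(samples))
        mags.append(abs(acc))
    return mags


def peak_bin_for_tone(tone_hz, sample_rate, n):
    """Generate a sine wave of n samples and return the index of its largest bin."""
    tone = [math.sin(2.0 * math.pi * tone_hz * i / sample_rate)
            for i in range(n)]
    mags = dft_magnitudes(tone)
    return max(range(len(mags)), key=mags.__getitem__)
```

The peak should sit at bin `tone_hz * n / sample_rate`, so `peak_bin_for_tone(1000.0, 8000.0, 256)` should return 32. If your own FFT and plotting code put the peak somewhere else, or show several peaks, the bug is in that code rather than in the audio.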
I get this: http://i40.tinypic.com/23jijr7.jpg Is that what you mean?
Bevin
@Bevin - @Paul R's suggestion for taking the log of the squared amplitude is right on. Looking at your second picture, it looks like you need to add a window. Multiply your time domain data by a function of the form 0.5*(1 - cos(2*pi*n/N)), where N is your transform blocksize. See http://en.wikipedia.org/wiki/Window_function for background.
mtrw
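The window formula from the comment above, 0.5*(1 - cos(2*pi*n/N)), could be applied to a block of samples as in this Python sketch. The function names are made up for illustration.

```python
import math


def hann_window(n_points):
    """Periodic Hann window: w[n] = 0.5 * (1 - cos(2*pi*n/N))."""
    return [0.5 * (1.0 - math.cos(2.0 * math.pi * n / n_points))
            for n in range(n_points)]


def apply_window(samples):
    """Multiply a block of time-domain samples by the Hann window."""
    window = hann_window(len(samples))
    return [s * w for s, w in zip(samples, window)]
```

Call `apply_window` on each block just before the FFT. The window tapers both ends of the block toward zero, which removes the jump at the block boundary that would otherwise smear energy across many bins.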
+1  A: 

It definitely looks like the Ocean Mist y-axis is logarithmic.

AShelly
So, how would I implement a Y-log scale? Use the log(absolute magnitude) as the y-value?
Bevin
+1  A: 

It seems to me that not only the y-axis but also the x-axis is logarithmic. The distance between peaks seems to shrink at higher frequencies.

Giuseppe Guerrini