First of all, I'm working on a small project to view the spectrum of some sounds.

I got this working with a microphone: [image: spectrogram of the microphone recording]

The image above is just me talking and shouting through a microphone for a few seconds. This looks good to me.

But when I read an MP3 file and make a spectrogram image of it, the result looks quite different. I tried Aphex Twin's Windowlicker, where you should normally see a face in the spectrogram, or at least some darker colors. But it doesn't look good: [image: spectrogram of the MP3 file]

Here is what I did with the microphone:

byte[] tempBuffer = new byte[10000];
ByteArrayOutputStream out = new ByteArrayOutputStream();
int counter = 20; // read 20 buffers from the line, then stop

// Microphone
while (counter != 0) {
    int count = line.read(tempBuffer, 0, tempBuffer.length);
    if (count > 0) {
        out.write(tempBuffer, 0, count);
    }
    counter--;
}
out.close();

// FFT code below ...
byte[] audio = out.toByteArray();
// ...

And this is how I do it with the MP3:

I used the same code for the transformation and visualization; only the audio capturing part is different. (I only adjusted the height in the drawing method to see if it made a difference, but it didn't.)

byte[] tempBuffer = new byte[10000];
ByteArrayOutputStream out = new ByteArrayOutputStream();

File mp3 = new File("Aphex Twin - Widowlicker.mp3");
FileInputStream input = new FileInputStream(mp3);
int len;
while ((len = input.read(tempBuffer)) > 0) {
    out.write(tempBuffer, 0, len);
}

out.close();
input.close();

// FFT code below ...
byte[] audio = out.toByteArray();
// ...

It would be nice if somebody could point out what I am doing wrong with the MP3 file.

These are my settings:

  • Sample rate: 44100
  • Bits per sample: 8
  • Channels: 1 (mono)
  • Signed: true
  • Big endian: true (I'm using AudioFormat in Java)
  • tempBuffer to read audio: 10000 bytes ( byte tempBuffer[] = new byte[10000]; )
  • For the FFT I split the audio into chunks of 4096 (must be a power of 2)
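
For reference, here is a minimal sketch of how the settings above map onto javax.sound.sampled.AudioFormat, and what a 4096-sample chunk means in time and frequency terms. The class name SpectrumSettings and its helper methods are made up for illustration; they are not part of the original project.

```java
import javax.sound.sampled.AudioFormat;

public class SpectrumSettings {
    static final float SAMPLE_RATE = 44100f; // samples per second
    static final int FFT_SIZE = 4096;        // samples per FFT chunk

    // The format from the list: 8 bits per sample, mono, signed, big endian.
    static AudioFormat format() {
        return new AudioFormat(SAMPLE_RATE, 8, 1, true, true);
    }

    // Width of one FFT bin in Hz: sample rate divided by chunk size.
    static double binWidthHz() {
        return (double) SAMPLE_RATE / FFT_SIZE;
    }

    // Duration of one FFT chunk in milliseconds.
    static double chunkMillis() {
        return 1000.0 * FFT_SIZE / SAMPLE_RATE;
    }

    public static void main(String[] args) {
        System.out.println(format());
        System.out.println("bin width in Hz: " + binWidthHz());
        System.out.println("chunk length in ms: " + chunkMillis());
    }
}
```

With these numbers each FFT bin is roughly 10.8 Hz wide and each chunk covers roughly 93 ms of audio. At 8-bit mono, one 10000-byte read is 10000 samples (about 227 ms), so 20 reads capture about 4.5 seconds.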

By the way: are these settings OK, or should I use 16 bits per sample or stereo? Is 10000 too much for the buffer, or is 4096 too small or too big?

Thanks in advance

+1  A: 

MP3 is a compressed audio format. You have to decode the data first before you can use it as an audio stream comparable to the data from your microphone. Raw MP3 data has close to maximum entropy and should look much like white noise, which it does in your spectrogram.
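
A minimal sketch of the decoding step, using only the standard javax.sound.sampled API. It assumes an MP3 decoder (for example an SPI plugin such as MP3 SPI) is on the classpath; without one, AudioSystem will reject the MP3 file. The class name PcmDecoder and the choice of 16-bit little-endian output are assumptions for illustration, not something the original code prescribes.

```java
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.UnsupportedAudioFileException;
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.IOException;

public class PcmDecoder {
    // Decodes an audio file to 16-bit signed little-endian PCM bytes,
    // keeping the sample rate and channel count of the source.
    static byte[] decodeToPcm(File file)
            throws IOException, UnsupportedAudioFileException {
        try (AudioInputStream encoded = AudioSystem.getAudioInputStream(file)) {
            AudioFormat src = encoded.getFormat();
            AudioFormat pcm = new AudioFormat(
                    AudioFormat.Encoding.PCM_SIGNED,
                    src.getSampleRate(),
                    16,                     // bits per sample
                    src.getChannels(),
                    src.getChannels() * 2,  // frame size in bytes
                    src.getSampleRate(),    // frame rate
                    false);                 // little endian
            // Ask the audio system for a converted (decoded) view of the stream.
            try (AudioInputStream decoded =
                         AudioSystem.getAudioInputStream(pcm, encoded)) {
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buffer = new byte[4096];
                int len;
                while ((len = decoded.read(buffer)) != -1) {
                    out.write(buffer, 0, len);
                }
                return out.toByteArray();
            }
        }
    }
}
```

The returned bytes are PCM samples and can go to the FFT in place of the raw file bytes. For a stereo file the samples are interleaved (left, right, left, right), so they would need to be split or averaged to mono first.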

Han
I keep getting a GC OutOfMemoryError. Is my buffer (10000) too big? (I did some research and most people use 10000.) I used the MP3 SPI library to decode the MP3; it works now, but I still end up with too much data.
juFo
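
The 10000-byte read buffer is probably not the cause: decoded 44.1 kHz 16-bit stereo PCM is about 10.6 MB per minute, and collecting a whole track (plus its FFT results) in memory adds up quickly. One possible way around it, sketched below, is to process the decoded stream one FFT chunk at a time instead of buffering everything in a ByteArrayOutputStream. The class ChunkedReader and its methods are hypothetical names, and the sketch assumes 16-bit signed little-endian mono PCM input.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.function.Consumer;

public class ChunkedReader {
    // Reads 16-bit signed little-endian mono PCM and passes it to the handler
    // one chunk at a time. A trailing partial chunk is dropped.
    // Returns the number of full chunks processed.
    static int forEachChunk(InputStream pcm, int chunkSize,
                            Consumer<double[]> handler) throws IOException {
        byte[] raw = new byte[chunkSize * 2]; // 2 bytes per sample
        int chunks = 0;
        while (readFully(pcm, raw) == raw.length) {
            double[] samples = new double[chunkSize];
            for (int i = 0; i < chunkSize; i++) {
                int lo = raw[2 * i] & 0xFF; // low byte, unsigned
                int hi = raw[2 * i + 1];    // high byte, keeps the sign
                samples[i] = ((hi << 8) | lo) / 32768.0; // scale to [-1, 1)
            }
            handler.accept(samples); // e.g. run the FFT and draw one column
            chunks++;
        }
        return chunks;
    }

    // Fills buf as far as possible; returns the number of bytes actually read.
    static int readFully(InputStream in, byte[] buf) throws IOException {
        int total = 0;
        while (total < buf.length) {
            int n = in.read(buf, total, buf.length - total);
            if (n < 0) {
                break; // end of stream
            }
            total += n;
        }
        return total;
    }
}
```

Used with a chunk size of 4096, the handler would compute the FFT of each chunk and draw one spectrogram column, so only one chunk of samples is held at a time.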