I'm working on a small project to visualize the spectrum of some sounds.
I got this working with a microphone:
The image above is just me talking and shouting through a microphone for a few seconds. This looks good to me.
But when I read an MP3 file and make a spectrogram image of it, the result looks quite different. I tried Aphex Twin - Windowlicker, where you should normally see a face in the spectrogram, or at least some darker colors. But it doesn't look right:
Here is what I did with the microphone:
byte tempBuffer[] = new byte[10000];
ByteArrayOutputStream out = new ByteArrayOutputStream();
counter = 20;
// Microphone: `line` is the already opened microphone line.
// Read up to 20 buffers of at most 10000 bytes each.
while (counter != 0) {
    int count = line.read(tempBuffer, 0, tempBuffer.length);
    if (count > 0) {
        out.write(tempBuffer, 0, count);
    }
    counter--;
}
out.close();
// FFT code below ...
byte audio[] = out.toByteArray();
// ...
And this is how I do it with the MP3:
I used the same code for the transformation and visualization; only the audio-capturing part is different. (I also adjusted the height in the drawing method to see if that made a difference, but it didn't.)
byte tempBuffer[] = new byte[10000];
ByteArrayOutputStream out = new ByteArrayOutputStream();
File mp3 = new File("Aphex Twin - Widowlicker.mp3");
FileInputStream input = new FileInputStream(mp3);
int len;
// Copy the bytes of the MP3 file, exactly as stored on disk, into `out`.
while ((len = input.read(tempBuffer)) > 0) {
    out.write(tempBuffer, 0, len);
}
out.close();
input.close();
// FFT code below ...
byte audio[] = out.toByteArray();
// ...
It would be nice if somebody could point out what I am doing wrong with the MP3 file.
These are my settings:
- Sample rate: 44100 Hz
- Bits per sample: 8
- Channels: 1 (mono)
- Signed: true
- Big endian: true (I'm using AudioFormat in Java)
- tempBuffer to read audio: 10000 bytes ( byte tempBuffer[] = new byte[10000]; )
- For the FFT I split the audio into chunks of 4096 samples (the size must be a power of 2)
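As a rough illustration, the settings above could be written down like this. It is only a sketch, not the actual FFT code from the project: the class and method names are made up, and both the scaling of each byte to the range [-1.0, 1.0) and the dropping of a trailing partial chunk are assumptions.

```java
import javax.sound.sampled.AudioFormat;

// Hypothetical helper; the names are not from the original code.
final class SpectrumSettingsSketch {
    static final int CHUNK_SIZE = 4096; // FFT chunk size from the list above

    // The listed settings expressed as a javax.sound.sampled.AudioFormat.
    static AudioFormat listedFormat() {
        float sampleRate = 44100f;
        int sampleSizeInBits = 8;
        int channels = 1;
        boolean signed = true;
        boolean bigEndian = true; // has no effect on 1-byte samples, kept to match the list
        return new AudioFormat(sampleRate, sampleSizeInBits, channels, signed, bigEndian);
    }

    // Splits signed 8-bit mono samples into complete chunks of chunkSize samples.
    // Assumption: each byte is scaled to [-1.0, 1.0) and a trailing partial chunk is dropped.
    static double[][] splitIntoChunks(byte[] audio, int chunkSize) {
        int chunkCount = audio.length / chunkSize;
        double[][] chunks = new double[chunkCount][chunkSize];
        for (int c = 0; c < chunkCount; c++) {
            for (int i = 0; i < chunkSize; i++) {
                chunks[c][i] = audio[c * chunkSize + i] / 128.0;
            }
        }
        return chunks;
    }

    public static void main(String[] args) {
        byte[] audio = new byte[10000]; // stand-in for out.toByteArray()
        double[][] chunks = splitIntoChunks(audio, CHUNK_SIZE);
        System.out.println(listedFormat());
        System.out.println(chunks.length + " complete chunks");
    }
}
```

This treats every byte as one signed 8-bit PCM sample, which is only valid when the byte array really holds audio in the format listed above.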
By the way: are these settings OK, or should I use 16 bits per sample or stereo? Is 10000 too much for the buffer, or is 4096 too small or too big?
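For context on the chunk-size part of the question, the arithmetic can be sketched as follows. The class and method names are hypothetical, and the numbers assume the settings listed above (44100 Hz, 8-bit mono, so one byte per frame). With 4096 samples per chunk, one FFT bin is roughly 10.8 Hz wide and one chunk covers roughly 93 ms; the 20 microphone reads of 10000 bytes hold at most about 4.5 seconds of audio.

```java
// Hypothetical helper for the arithmetic; not part of the original code.
final class ChunkSizeSketch {
    // Width of one FFT bin in Hz: sample rate divided by the FFT size.
    static double binWidthHz(double sampleRate, int fftSize) {
        return sampleRate / fftSize;
    }

    // Time span covered by one chunk, in milliseconds.
    static double chunkDurationMs(double sampleRate, int fftSize) {
        return 1000.0 * fftSize / sampleRate;
    }

    // Upper bound on the seconds captured by `reads` reads of `bufferBytes` bytes each.
    static double maxCapturedSeconds(int reads, int bufferBytes,
                                     double sampleRate, int bytesPerFrame) {
        return (double) reads * bufferBytes / (sampleRate * bytesPerFrame);
    }

    public static void main(String[] args) {
        double sampleRate = 44100.0;
        // Compare the chunk size from the question with its neighbours.
        for (int fftSize : new int[] {2048, 4096, 8192}) {
            System.out.printf("N=%d: bin width %.2f Hz, chunk length %.2f ms%n",
                    fftSize, binWidthHz(sampleRate, fftSize),
                    chunkDurationMs(sampleRate, fftSize));
        }
        System.out.printf("20 reads x 10000 bytes: at most %.2f s%n",
                maxCapturedSeconds(20, 10000, sampleRate, 1));
    }
}
```

A larger chunk gives narrower frequency bins but a coarser time axis, so the choice is a trade-off rather than a single correct value.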
Thanks in advance