Hi,

I'm working on an application that has to process audio files. When using MP3 files, I'm not sure how to handle the data (the data I'm interested in is the audio bytes, the ones that represent what we hear).

If I'm using a WAV file, I know I have a 44-byte header and then the data. When it comes to MP3, I've read that files are composed of frames, each frame containing a header and audio data. Is it possible to get all the audio data from an MP3 file?

I'm using Java (I've added MP3SPI, JLayer, and Tritonus) and I'm able to get the bytes from the file, but I'm not sure what these bytes represent or how to handle them.

+7  A: 

From the documentation for MP3SPI:

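import java.io.File;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;

File file = new File(filename);
// With MP3SPI on the classpath, AudioSystem can open the MP3 file directly.
AudioInputStream in = AudioSystem.getAudioInputStream(file);
AudioFormat baseFormat = in.getFormat();
AudioFormat decodedFormat = new AudioFormat(AudioFormat.Encoding.PCM_SIGNED,
                                            baseFormat.getSampleRate(),   // sample rate
                                            16,                           // bits per sample
                                            baseFormat.getChannels(),     // channels
                                            baseFormat.getChannels() * 2, // frame size in bytes
                                            baseFormat.getSampleRate(),   // frame rate
                                            false);                       // little-endian
// Request a stream that decodes the MP3 data into the PCM format above.
AudioInputStream din = AudioSystem.getAudioInputStream(decodedFormat, in);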

You then just read data from din - it will be the "raw" data as per decodedFormat. (See the docs for AudioFormat for more information.)

(Note that this sample code doesn't close the stream or anything like that - use appropriate try/finally blocks as normal.)
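
The read loop and stream cleanup mentioned above might look roughly like this. It is a minimal sketch and not part of the MP3SPI documentation: the class name PcmReader and the method readAll are hypothetical, and it assumes the decoded audio is small enough to hold in memory.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import javax.sound.sampled.AudioInputStream;

// Hypothetical helper, not part of MP3SPI or Java Sound.
final class PcmReader {
    /** Drains an already-decoded stream into one array of raw PCM bytes. */
    static byte[] readAll(AudioInputStream din) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // 4096 is a multiple of the usual frame sizes (2 or 4 bytes),
        // so each read returns a whole number of frames.
        byte[] buffer = new byte[4096];
        int numBytesRead;
        // try-with-resources closes the stream even if a read fails.
        try (AudioInputStream stream = din) {
            while ((numBytesRead = stream.read(buffer, 0, buffer.length)) != -1) {
                out.write(buffer, 0, numBytesRead);
            }
        }
        return out.toByteArray();
    }
}
```

With the snippet above, calling PcmReader.readAll(din) would return the decoded bytes in the layout described by decodedFormat.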

Jon Skeet
Hi Jon, thanks for your quick answer! In your proposal, is 'decodedFormat' a representation of the MP3 data decoded into another format? If I write "din.read()", am I getting the data bytes in the decoded format? Thanks.
dedalo
Yes. That decodedFormat says "I want you to decode as signed PCM data".
Jon Skeet
Hi. I followed your advice and it worked. To read the data I use: while ((numBytesRead = din.read(audioBytes)) != -1) {} This reads the bytes from 'din' and stores them in the array audioBytes. I've tried visualizing the data with: while ((numBytesRead = din.read(audioBytes)) != -1) { System.out.println("Bytes Decoded value " + audioBytes[0]); } I have a question about this data: every sample uses 16 bits, so that's 2 positions in the array audioBytes, right? How could I get the value of every sample? Does the decoded format (WAV) have the 44 header bytes? Thank you very much for your help!
dedalo
The decoded format here isn't actually wav - it's just the data part. Yes, you'll get one sample per two bytes (and two samples for the same time if it's stereo). Just fetch both bytes and convert each pair into a 16-bit value. Or if you want, you could change the 16 to 8 in the decodedFormat constructor call...
Jon Skeet
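The byte-pair conversion described in this comment can be sketched as follows. This is only an illustration, assuming 16-bit signed little-endian PCM as requested by decodedFormat; the names PcmSamples, toSamples and channel are hypothetical. Interleaved stereo PCM conventionally stores the left sample first, but that is worth confirming with a test file.

```java
// Hypothetical helper, not part of MP3SPI or Java Sound.
final class PcmSamples {
    /** Combines each little-endian byte pair into one signed 16-bit sample. */
    static short[] toSamples(byte[] pcm, int length) {
        short[] samples = new short[length / 2];
        for (int i = 0; i < samples.length; i++) {
            int low = pcm[2 * i] & 0xFF; // low byte, treated as unsigned
            int high = pcm[2 * i + 1];   // high byte keeps its sign
            samples[i] = (short) ((high << 8) | low);
        }
        return samples;
    }

    /** Extracts one channel (index 0 or 1 for stereo) from interleaved samples. */
    static short[] channel(short[] interleaved, int channels, int index) {
        short[] out = new short[interleaved.length / channels];
        for (int i = 0; i < out.length; i++) {
            out[i] = interleaved[i * channels + index];
        }
        return out;
    }
}
```

Passing numBytesRead as the length argument avoids converting stale bytes left in the buffer from an earlier, longer read.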
I'm a little confused. If the audio file is stereo, does this mean that in the byte array there are 2 bytes for the 1st sample (left channel) and another 2 bytes for the 1st sample (right channel)?
dedalo
Yes. (I'm not sure which order they're in, whether it's left then right or right then left, but that's the basic idea.)
Jon Skeet
OK, I've checked the order of the bytes for the case where the audio file is stereo. I think this is a problem. After getting the audio data sample array I need to start processing it, which involves applying a Hamming window to N samples and then calculating the FFT. I'll keep thinking about it. Thanks!
dedalo
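The windowing step mentioned in this comment could be sketched like this. It is an illustration only, assuming single-channel 16-bit samples and the common Hamming definition w[i] = 0.54 - 0.46 * cos(2*pi*i / (N - 1)); the class and method names are hypothetical, and the FFT itself is left out.

```java
// Hypothetical helper illustrating the windowing step before an FFT.
final class HammingWindow {
    /** Returns the n coefficients w[i] = 0.54 - 0.46 * cos(2*pi*i / (n - 1)). */
    static double[] coefficients(int n) {
        double[] w = new double[n];
        if (n == 1) {
            w[0] = 1.0; // avoid dividing by zero for a single-sample window
            return w;
        }
        for (int i = 0; i < n; i++) {
            w[i] = 0.54 - 0.46 * Math.cos(2.0 * Math.PI * i / (n - 1));
        }
        return w;
    }

    /** Scales n samples starting at offset to [-1, 1) and applies the window. */
    static double[] apply(short[] samples, int offset, int n) {
        double[] w = coefficients(n);
        double[] frame = new double[n];
        for (int i = 0; i < n; i++) {
            frame[i] = w[i] * (samples[offset + i] / 32768.0);
        }
        return frame;
    }
}
```

For stereo input, the window would be applied to one channel (or to a mono mix) rather than to the interleaved array.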
@dedalo: If you only *want* it to decode to mono, change the decodedFormat constructor call. I don't know whether it will pick one channel or other, or mix the two, but it's worth a try.
Jon Skeet
This is what I tried: AudioFormat decodedFormat = new AudioFormat(AudioFormat.Encoding.PCM_SIGNED, baseFormat.getSampleRate(), 16, 1, /* baseFormat.getChannels(), */ baseFormat.getChannels() * 2, baseFormat.getSampleRate(), false); But I got an exception I was not able to solve.
dedalo
You probably need to change the next argument as well (the frame size). If that doesn't work, please say what the exception was.
Jon Skeet
I tried changing the frame size argument but none of the values I used worked. It looks like it lets me modify the other arguments, but not the one related to stereo/mono. I think the exception is caused by: numBytesRead = din.read(audioBytes).
dedalo
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 1
    at javazoom.spi.mpeg.sampled.convert.DecodedMpegAudioInputStream$DMAISObuffer.append(Unknown Source)
    at javazoom.jl.decoder.Obuffer.appendSamples(Unknown Source)
    at javazoom.jl.decoder.SynthesisFilter.compute_pcm_samples(Unknown Source)
    at javazoom.jl.decoder.SynthesisFilter.calculate_pcm_samples(Unknown Source)
    at javazoom.jl.decoder.LayerIIIDecoder.decode(Unknown Source)
    at javazoom.jl.decoder.LayerIIIDecoder.decodeFrame(Unknown Source)
dedalo
    at javazoom.jl.decoder.Decoder.decodeFrame(Unknown Source)
    at javazoom.spi.mpeg.sampled.convert.DecodedMpegAudioInputStream.execute(Unknown Source)
    at org.tritonus.share.TCircularBuffer.read(TCircularBuffer.java:138)
    at org.tritonus.share.sampled.convert.TAsynchronousFilteredAudioInputStream.read(TAsynchronousFilteredAudioInputStream.java:189)
    at org.tritonus.share.sampled.convert.TAsynchronousFilteredAudioInputStream.read(TAsynchronousFilteredAudioInputStream.java:175)
dedalo
Okay, well you'd probably want to debug into that. You may need to handle the stereo to mono conversion yourself.
Jon Skeet
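One way to do the stereo-to-mono conversion by hand is to average the two channels of each frame. The sketch below assumes the bytes have already been converted to interleaved 16-bit samples; the class name Downmix is hypothetical, and averaging is just one possible mixing choice.

```java
// Hypothetical helper: averages the two samples of each stereo frame.
final class Downmix {
    /** Converts interleaved stereo samples to mono by averaging each pair. */
    static short[] stereoToMono(short[] interleaved) {
        short[] mono = new short[interleaved.length / 2];
        for (int i = 0; i < mono.length; i++) {
            // Sum in int so that two large samples cannot overflow a short.
            int sum = interleaved[2 * i] + interleaved[2 * i + 1];
            mono[i] = (short) (sum / 2);
        }
        return mono;
    }
}
```

Averaging keeps the result within the 16-bit range, at the cost of attenuating content that is present in only one channel.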
One last question: when creating the decodedFormat we're choosing little-endian (bigEndian = false). If I write 'true', will the data in decodedFormat be stored in big-endian format? If so, I won't need to manipulate the bytes in order to get a double value for each sample.
dedalo
At the end of the process I'll get a number of arrays containing the mel coefficients. Is it possible to use a k-means algorithm on them?
dedalo
@dedalo: In terms of big-endianness: I don't know offhand, but I'd expect so. As for k-means... no idea, as I don't know what it is.
Jon Skeet
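Whichever byte order is requested, each pair of bytes still has to be combined into one sample before it can become a double. A minimal sketch of that step using java.nio.ByteBuffer follows; the class name SampleDecoder and the scaling to the range [-1, 1) are assumptions made for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper: decodes 16-bit PCM bytes in either byte order.
final class SampleDecoder {
    /** Converts 16-bit signed PCM bytes to doubles in the range [-1, 1). */
    static double[] toDoubles(byte[] pcm, int length, boolean bigEndian) {
        // Ignore a trailing odd byte, which cannot form a full sample.
        ByteBuffer buf = ByteBuffer.wrap(pcm, 0, length - (length % 2))
                .order(bigEndian ? ByteOrder.BIG_ENDIAN : ByteOrder.LITTLE_ENDIAN);
        double[] out = new double[buf.remaining() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = buf.getShort() / 32768.0;
        }
        return out;
    }
}
```

The bigEndian argument should match the last argument passed to the AudioFormat constructor for decodedFormat.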
A: 

The data you want are the actual samples, whereas MP3 represents the audio differently. So, as everyone else has said, you need a library to decode the MP3 data into actual samples for your purpose.

sybreon
A: 

As mentioned in the other answers, you need a decoder to decode MP3 into regular audio samples.

One popular option would be JavaLayer (LGPL).

sleske