Hi, I'm working on an application that processes audio data. I'm using Java (I've added MP3SPI, JLayer, and Tritonus). I'm extracting the audio data from a .wav file into a byte array. The audio samples I'm working with are 16-bit stereo.

According to what I've read the format for one sample is:

AABBCCDD

where AABB represents the left channel and CCDD the right channel (2 bytes for each channel). I need to convert this sample into a double value. I've been reading about data formats: Java uses big endian, while .wav files use little endian. I'm a little bit confused. Could you please help me with the conversion process? Thank you all.

A: 

Little Endian means that the data is in the form BBAA and DDCC. You would just swap it around.

From the beginning of the frame:

int left = (bytes[i+1] << 8) + bytes[i];
int right = (bytes[i+3] << 8) + bytes[i+2];

where i is the byte index at which your sample starts.

CookieOfFortune
I'd personally put brackets around the shifts - I seem to remember the precedence can be a little odd...
Jon Skeet
Yeah, + and - have higher precedence than shifts... edited.
CookieOfFortune
This lacks proper masking of the low order bits.
erickson
+2  A: 

Warning: integers and bytes are signed. Maybe you need to mask the low bytes when packing them together:

for (int i = 0; i < length; i += 4) {
    double left = (double) ((bytes[i] & 0xff) | (bytes[i + 1] << 8));
    double right = (double) ((bytes[i + 2] & 0xff) | (bytes[i + 3] << 8));

    ... your code here ...

}
G B
Good point, but you need to do it on all of your byte[] accesses (unless they are shifted left 24 bits or more). They are promoted to ints, so 0x80 becomes 0xffffff80
erickson
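To illustrate the promotion issue discussed in these comments, here is a minimal sketch. The class and method names (SignExtensionDemo, unmasked, masked) are made up for the example; it only shows what happens when a low byte of 0x80 is combined with and without the mask.

```java
// Sketch: why the low byte needs "& 0xff" before it is combined.
final class SignExtensionDemo {
    // Hypothetical helper: combine two little-endian bytes WITHOUT masking.
    static int unmasked(byte low, byte high) {
        return (high << 8) + low;
    }

    // Hypothetical helper: mask only the low byte, keep the high byte's sign.
    static int masked(byte low, byte high) {
        return (high << 8) | (low & 0xff);
    }

    public static void main(String[] args) {
        byte low = (byte) 0x80;   // promoted to int this becomes 0xffffff80 (-128)
        byte high = (byte) 0x01;  // the intended 16-bit sample is 0x0180 = 384
        System.out.println(unmasked(low, high)); // prints 128 (wrong)
        System.out.println(masked(low, high));   // prints 384 (expected)
    }
}
```

Without the mask, the sign-extended low byte subtracts 128 from the result instead of adding 128 to it.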
That's exactly what I wanted: to preserve the sign of the high bytes. Actually, I didn't test it (and I don't know if it works, particularly with floating-point values), but a signed 16-bit value should not lose its sign when promoted to 32 bits or more.
G B
Hi, thanks for your answers. @erickson: I don't really understand your comment. @GB: by doing this I get in left the value of each sample in the left channel, and in right the ones for the right channel. So, at the end I'll have 2 values for each sample, no? Is it possible to get just one value for each sample?
dedalo
@dedalo: it depends on what you are trying to do with these samples. If you want a single 32-bit value you need to mask and shift all 4 byte values. Or do you want to average them?
G B
@GB: what I'm trying to do is calculate the Mel coefficients (MFCC) of different audio files. These coefficients are used in speech recognition. To calculate them, the first steps are to get the samples from the audio file, apply a Hamming window, and calculate the FFT.
dedalo
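For the windowing step mentioned above, a minimal sketch might look like the following. It assumes the samples have already been converted to a double[] frame and uses the common Hamming coefficients 0.54 and 0.46; the class name HammingWindow is made up for the example, and the FFT step is not covered.

```java
// Sketch: apply a Hamming window to one frame of already-decoded samples.
final class HammingWindow {
    // Returns a new array; the input frame is left untouched.
    static double[] apply(double[] frame) {
        int n = frame.length;
        double[] out = new double[n];
        for (int k = 0; k < n; k++) {
            // w[k] = 0.54 - 0.46 * cos(2*pi*k / (n - 1)); a single-sample frame is passed through.
            double w = (n == 1) ? 1.0 : 0.54 - 0.46 * Math.cos(2.0 * Math.PI * k / (n - 1));
            out[k] = frame[k] * w;
        }
        return out;
    }
}
```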
@erickson: Is there any way to see the actual bits for each position in the array? Maybe that will help me understand this negative/positive thing.
dedalo
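One possible way to look at the bits of each byte, sketched with a made-up helper class BitDump. It relies only on Integer.toBinaryString; setting bit 8 and dropping the first character is a trick to keep the leading zeros.

```java
// Sketch: print the 8 bits of each byte in an array.
final class BitDump {
    // Hypothetical helper: 8-character binary string for one byte.
    static String bits(byte b) {
        // Mask to 0..255, set bit 8 so the string always has 9 digits, then drop the first.
        return Integer.toBinaryString((b & 0xff) | 0x100).substring(1);
    }

    public static void main(String[] args) {
        byte[] bytes = {0x2E, (byte) 0xFB}; // little-endian bytes of -1234
        for (int i = 0; i < bytes.length; i++) {
            System.out.println(i + ": " + bits(bytes[i]));
        }
        // prints:
        // 0: 00101110
        // 1: 11111011
    }
}
```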
@dedalo: Short answer: I don't know, I assumed the samples are signed. Example: the little-endian representation of 1027 is "0x0304", which you need to convert to "0x00000403"; the representation of -1234 is "0x2EFB", which you need to convert to "0xFFFFFB2E". If you "cut" a negative number and convert it to "0x0000FB2E", casting to double will yield a wrong positive value. If you are dealing with signed values, you only want to mask the low-order byte.
G B
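The two values from this comment can be checked with a small sketch. The class and method names are invented for the example; decodeBothMasked shows the "cut" case where the high byte is masked as well.

```java
// Sketch: decode the example values 1027 and -1234 from little-endian bytes.
final class SignedSampleDemo {
    // Mask only the low byte; the high byte keeps its sign.
    static int decodeSigned(byte low, byte high) {
        return (high << 8) | (low & 0xff);
    }

    // Mask both bytes: the result always lands in 0..65535, so the sign is lost.
    static int decodeBothMasked(byte low, byte high) {
        return ((high & 0xff) << 8) | (low & 0xff);
    }

    public static void main(String[] args) {
        System.out.println(decodeSigned((byte) 0x03, (byte) 0x04));     // prints 1027
        System.out.println(decodeSigned((byte) 0x2E, (byte) 0xFB));     // prints -1234
        System.out.println(decodeBothMasked((byte) 0x2E, (byte) 0xFB)); // prints 64302
    }
}
```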
I'm still confused. The samples I'm working with are 16 bits and signed, little endian. If I have bytes AABB in the samples array (bytes 0 and 1) I think what I need to do is to exchange the order of the bytes (Java is big endian) and then get the value of the sample (BBAA) in a double type variable. If this is right, is the sign of BB (high-order byte) the one we want to protect?
dedalo
Yes. Your samples are automatically cast to 32-bit int values before they are converted to double. You want to preserve their sign, therefore you only mask the low-byte, and shift the high-byte 8 bits to the left (that is: multiply it by 256, without changing its sign). The problem is: there is no 16-bit value, only an "automatic" 8-to-32 bit cast. AABB becomes XXXXBBAA (where X is either 0 or F)
G B
If I wanted to get one average value for each sample, using both channels, how could I do it? Would I have the same problem?
dedalo
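A minimal sketch of averaging both channels into one value per frame, assuming 16-bit signed little-endian stereo data laid out as in the question (4 bytes per frame). The class name MonoMixer is made up; the same masking rule applies, since each channel is decoded first and only then averaged.

```java
// Sketch: mix 16-bit little-endian stereo PCM down to one double per frame.
final class MonoMixer {
    static double[] toMono(byte[] bytes) {
        int frames = bytes.length / 4; // 2 bytes left + 2 bytes right; trailing bytes are ignored
        double[] mono = new double[frames];
        for (int f = 0; f < frames; f++) {
            int i = f * 4;
            // Mask the low byte, keep the sign of the high byte.
            int left = (bytes[i] & 0xff) | (bytes[i + 1] << 8);
            int right = (bytes[i + 2] & 0xff) | (bytes[i + 3] << 8);
            mono[f] = (left + right) / 2.0;
        }
        return mono;
    }
}
```

For example, a frame with left = 1027 and right = -1234 averages to -103.5.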
A: 

I would personally look for a library that does the endian swapping for you. Each audio file format makes its own assumptions about endianness, and getting this right is tricky for all the bit depths and data types that wave files support:

  • 8bit - uint8
  • 16bit - int16
  • 24bit - int32
  • 32bit - int32 as float
  • 32bit - float
  • 64bit - double

If you want to support most common types of wave files you'll need endian conversions for all of these datatypes.
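As a rough illustration of how the conversion differs by bit depth, here is a hand-written sketch for two of the integer cases. It assumes 8-bit samples are unsigned (as the list says) and that 24-bit samples are signed little-endian; the class and method names are made up and do not come from any library.

```java
// Sketch: manual decoding for 8-bit and 24-bit PCM samples.
final class PcmDecode {
    // 8-bit samples are unsigned; recentre them around zero (-128..127).
    static int decode8(byte b) {
        return (b & 0xff) - 128;
    }

    // 24-bit little-endian signed: only the most significant byte keeps its sign.
    static int decode24(byte b0, byte b1, byte b2) {
        return (b2 << 16) | ((b1 & 0xff) << 8) | (b0 & 0xff);
    }
}
```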

I would look at ByteSwapper, which will give you byteswapping for most of the types listed above.

It's too bad Java doesn't have an endianness field in its file I/O classes. Being able to simply open a file whose endianness is big or little would be a much easier solution to this issue.

Nick Haddad
A: 

When you use a ByteBuffer (java.nio.ByteBuffer), you can use its order method:

public final ByteBuffer order(ByteOrder bo)

Modifies this buffer's byte order.

Parameters:
    bo - The new byte order, either BIG_ENDIAN or LITTLE_ENDIAN
Returns:
    This buffer

After this you can read the values mentioned above with:

getChar(), getShort(), getInt(), getFloat(), getDouble()
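Applied to the 16-bit stereo data from the question, this could be sketched as follows. The class and method names are made up for the example; only ByteBuffer.wrap, order, and getShort come from java.nio.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Sketch: decode 16-bit little-endian stereo PCM with ByteBuffer.
final class ByteBufferDecode {
    // Returns channels[0] = left samples, channels[1] = right samples.
    static double[][] decodeStereo16(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN);
        int frames = bytes.length / 4; // 4 bytes per stereo frame
        double[][] channels = new double[2][frames];
        for (int f = 0; f < frames; f++) {
            channels[0][f] = buf.getShort(); // getShort() returns a signed 16-bit value
            channels[1][f] = buf.getShort();
        }
        return channels;
    }
}
```

Because getShort() already returns a signed short, no manual masking or shifting is needed here.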

What a great language is Java ;-)

Roland Beuker