ansaurus

Question

Answer 1

+10 A:

Maybe the first two bytes are the byte order mark (http://en.wikipedia.org/wiki/Byte_Order_Mark). It specifies the order of the bytes in each 16-bit word used in the encoding.

Alexander 2008-10-23 08:50:31

Answer 2

+3 A:

Try printing out the bytes in hex to see where the extra 2 bytes are added - are they at the start or end?

I expect you'll find a byte order mark (BOM) at the start (0xFEFF, which appears as the bytes 0xFE 0xFF in big-endian order). It lets anyone consuming (receiving) the byte array recognise whether the encoding is little-endian or big-endian.
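
A minimal sketch of that hex dump, assuming Java 7 or later for `StandardCharsets` and using the test string "0123456789" from elsewhere in this thread. The class name `BomHexDump` and the helper `toHex` are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

class BomHexDump {
    // Formats each byte as two uppercase hex digits, separated by spaces.
    // The "& 0xFF" mask is needed because Java bytes are signed.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Plain "UTF-16" lets the encoder choose the byte order and prepend a BOM.
        byte[] bytes = "0123456789".getBytes(StandardCharsets.UTF_16);
        System.out.println(bytes.length + " bytes: " + toHex(bytes));
    }
}
```

If the dump starts with FE FF, the two extra bytes are a big-endian BOM at the start of the array.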

Bevan 2008-10-23 08:52:42

Answer 3

+15 A:

Alexander's answer explains why it's there, but not how to get rid of it. You simply need to specify the endianness you want in the encoding name:

String source = "0123456789";
// An explicit byte order means no BOM is written, so 10 chars give exactly 20 bytes
byte[] byteArray = source.getBytes("UTF-16LE"); // Or UTF-16BE
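
To see the difference side by side, here is a small sketch that encodes the same string three ways. It assumes Java 7 or later and uses the `StandardCharsets` constants rather than charset names; that overload of `getBytes` does not throw the checked `UnsupportedEncodingException` that `getBytes(String)` declares. The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

class Utf16Variants {
    // "UTF-16": the encoder prepends a byte order mark.
    static byte[] withBom(String s) {
        return s.getBytes(StandardCharsets.UTF_16);
    }

    // "UTF-16LE": low byte first, no byte order mark.
    static byte[] littleEndian(String s) {
        return s.getBytes(StandardCharsets.UTF_16LE);
    }

    // "UTF-16BE": high byte first, no byte order mark.
    static byte[] bigEndian(String s) {
        return s.getBytes(StandardCharsets.UTF_16BE);
    }

    public static void main(String[] args) {
        String source = "0123456789";
        System.out.println("UTF-16:   " + withBom(source).length + " bytes");
        System.out.println("UTF-16LE: " + littleEndian(source).length + " bytes");
        System.out.println("UTF-16BE: " + bigEndian(source).length + " bytes");
    }
}
```

For the 10-character string this should report 22, 20 and 20 bytes; the two extra bytes in the first case are the BOM.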

Jon Skeet 2008-10-23 08:53:15

Answer 4

+1 A:

UTF-16 output has a byte order mark at the beginning that tells the reader which byte order the stream is encoded in. As the other users have pointed out, for the string "0123456789" the
1st byte is 0xFE
2nd byte is 0xFF
the remaining bytes are
0, 48, 0, 49, 0, 50, 0, 51, 0, 52, 0, 53, 0, 54, 0, 55, 0, 56, 0, 57
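
One thing that can be confusing when reproducing this listing: Java's `byte` is signed, so printing the array directly shows the BOM as -2 and -1 rather than 254 (0xFE) and 255 (0xFF). A rough sketch, assuming Java 7 or later; `BomDecimalDump` and `toUnsigned` are illustrative names, not part of any library.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class BomDecimalDump {
    // Converts signed Java bytes (-128..127) to unsigned values (0..255).
    static int[] toUnsigned(byte[] bytes) {
        int[] out = new int[bytes.length];
        for (int i = 0; i < bytes.length; i++) {
            out[i] = bytes[i] & 0xFF;
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] bytes = "0123456789".getBytes(StandardCharsets.UTF_16);
        // Signed view: the two BOM bytes print as -2 and -1.
        System.out.println(Arrays.toString(bytes));
        // Unsigned view: the same two bytes print as 254 and 255.
        System.out.println(Arrays.toString(toUnsigned(bytes)));
    }
}
```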

anjanb 2008-10-23 08:59:48


convert string to byte[] in java
