ansaurus

Question

Why does the string "¿" get translated to "Â¿" when calling .getBytes()?

Answer 1

+1 A:

Sounds like the system console isn't in UTF-8
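One way to check whether the console is the culprit is to look at the raw bytes instead of the printed characters, and to print through a stream whose encoding is set explicitly. This is only a sketch, assuming Java 7 or later; the class name ConsoleCheck and the helper toHex are made up for illustration.

```java
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ConsoleCheck {
    // Hypothetical helper: renders bytes as hex so the console encoding cannot distort them.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        // The inverted question mark, written as an escape so the source-file encoding does not matter.
        String s = "\u00BF";

        // What the JVM uses when no charset is given explicitly.
        System.out.println("default charset: " + Charset.defaultCharset());

        // The UTF-8 encoding of this character is the two bytes C2 BF.
        System.out.println("UTF-8 bytes: " + toHex(s.getBytes(StandardCharsets.UTF_8)));

        // Print through a stream that encodes as UTF-8 regardless of the platform default.
        PrintStream utf8Out = new PrintStream(System.out, true, "UTF-8");
        utf8Out.println(s);
    }
}
```

If the hex output is C2 BF but the last line still shows two characters, the bytes are fine and the terminal is probably decoding them with a single-byte charset.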

Greg 2008-10-06 20:36:05

Answer 2

+2 A:

You need to specify the charset in the String constructor (see the API docs [can't link directly to the constructor, since the anchor seems to mess up SO's markup system]).
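To illustrate what the charset argument changes, here is a small sketch (assuming Java 7 or later for StandardCharsets; the class and method names are hypothetical). It decodes the same two bytes twice, once as UTF-8 and once as ISO-8859-1, and the second decoding is one plausible way to end up with "Â¿".

```java
import java.nio.charset.StandardCharsets;

public class DecodeDemo {
    // Decode with the charset that was used to produce the bytes.
    static String decodeUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Decode with a mismatched single-byte charset: every byte becomes one character.
    static String decodeLatin1(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // UTF-8 encoding of the inverted question mark (U+00BF) is the two bytes C2 BF.
        byte[] utf8Bytes = "\u00BF".getBytes(StandardCharsets.UTF_8);

        String right = decodeUtf8(utf8Bytes);   // one character, U+00BF
        String wrong = decodeLatin1(utf8Bytes); // two characters, U+00C2 followed by U+00BF

        System.out.println(right.length()); // 1
        System.out.println(wrong.length()); // 2
        System.out.println(wrong.equals("\u00C2\u00BF")); // true
    }
}
```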

Michael Myers 2008-10-06 20:41:07

Answer 3

+5 A:

You don't have to use utf-16 to solve this:

new String("¿".getBytes("utf-8"), "utf-8");

works just fine; as long as the encoding given to the getBytes method is the same as the encoding you pass to the String constructor, you should be fine!

p3t0r 2008-10-06 20:55:52

That's exactly what I said, plus I linked to the API docs. ;)

Michael Myers 2008-10-06 21:05:06

But p3t0r provided a code example. ;)

Erik Forbes 2008-10-06 21:39:24

mmyers, don't worry, I've upvoted you :)

Johannes Schaub - litb 2008-11-13 16:31:31

Answer 4

+1 A:

Try:

System.out.println(new String("¿".getBytes("UTF-8"), "UTF-8"));

You need to specify the encoding both when converting the string to bytes and when converting the bytes back to a string.

John Meagher 2008-10-06 20:56:48

Answer 5

+2 A:

See my reply to your other thread. But this:

new String("¿".getBytes("utf-8"), "utf-8");

...is junk code, completely pointless. It's just an expensive way to create a duplicate of the original string.
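A quick way to see this, as a sketch with hypothetical names (RoundTrip, roundTrip) and assuming Java 7 or later: encode to UTF-8, decode from UTF-8, and compare with the original.

```java
import java.nio.charset.StandardCharsets;

public class RoundTrip {
    // Encode to UTF-8 and decode straight back with the same charset.
    static String roundTrip(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u00BF"; // the inverted question mark
        String copy = roundTrip(original);

        // Same contents as the original, so the round trip changed nothing.
        System.out.println(copy.equals(original)); // true
    }
}
```

The round trip only matters when the bytes actually leave the program (a file, a socket, a console) and come back; that is where the two charsets can disagree.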

Alan Moore 2008-10-06 23:57:29

Unless the point is to understand how to make the conversion symmetrical, I guess.

McDowell 2008-10-07 10:26:02
