When writing the string "¿" out using

System.out.println(new String("¿".getBytes("UTF-8")));

Â¿ is written instead of just ¿.

WHY? And how do we fix it?

+1  A: 

Sounds like the system console isn't in UTF-8
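
If the console really does expect UTF-8 while the JVM's default encoding is something else, one possible workaround is to wrap System.out in a PrintStream with an explicit encoding. This is only a sketch: the class name Utf8Console and the helper utf8 are made up for illustration, and whether the character then displays correctly still depends on the terminal.

```java
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

public class Utf8Console {
    // Hypothetical helper: wraps any stream in a PrintStream that encodes text as UTF-8.
    public static PrintStream utf8(OutputStream out) {
        try {
            return new PrintStream(out, true, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        PrintStream out = utf8(System.out);
        // "\u00BF" is the escape for the inverted question mark; using it keeps
        // this example independent of the source file's own encoding.
        out.println("\u00BF");
    }
}
```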

Greg
+2  A: 

You need to specify the charset in the String constructor (see the API docs [can't link directly to the constructor, since the anchor seems to mess up SO's markup system]).
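
A small sketch of why the charset argument matters: the same UTF-8 bytes decode to different strings depending on the charset passed to the constructor. The class and method names here are illustrative only, and ISO-8859-1 stands in for whatever the platform default happens to be on the affected machine.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class DecodeDemo {
    // Encodes `text` as UTF-8, then decodes the bytes with the given charset.
    public static String encodeUtf8ThenDecode(String text, Charset decodeWith) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, decodeWith);
    }

    public static void main(String[] args) {
        String original = "\u00BF"; // inverted question mark

        // Matching charsets: the original string comes back.
        String same = encodeUtf8ThenDecode(original, StandardCharsets.UTF_8);
        System.out.println(same.equals(original)); // true

        // Mismatch: the two UTF-8 bytes 0xC2 0xBF each become one character,
        // giving "\u00C2\u00BF" (A-circumflex followed by the question mark).
        String garbled = encodeUtf8ThenDecode(original, StandardCharsets.ISO_8859_1);
        System.out.println(garbled.equals("\u00C2\u00BF")); // true
    }
}
```

With the one-argument constructor new String(bytes), the decoding charset is the platform default, which is why the result can differ from machine to machine.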

Michael Myers
+5  A: 

You don't have to use utf-16 to solve this:

new String("¿".getBytes("utf-8"), "utf-8");

works just fine. As long as the encoding given to the getBytes method is the same as the encoding you pass to the String constructor, you should be fine!

p3t0r
That's exactly what I said, plus I linked to the API docs. ;)
Michael Myers
But p3t0r provided a code example. ;)
Erik Forbes
mmyers, don't worry, I've upvoted you :)
Johannes Schaub - litb
+1  A: 

Try:

System.out.println(new String("¿".getBytes("UTF-8"), "UTF-8"));

You need to specify the encoding both when converting the string to bytes and when converting the bytes back to a string.

John Meagher
+2  A: 

See my reply to your other thread. But this:

new String("¿".getBytes("utf-8"), "utf-8");

...is junk code, completely pointless. It's just an expensive way to create a duplicate of the original string.
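
To illustrate the point, here is a minimal sketch (class and method names are hypothetical) showing that a symmetric encode/decode produces a string equal to the original but held in a new object, so nothing about the text itself has changed:

```java
import java.nio.charset.StandardCharsets;

public class RoundTripDemo {
    // Encodes to UTF-8 and decodes straight back with the same charset.
    public static String roundTrip(String text) {
        return new String(text.getBytes(StandardCharsets.UTF_8), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u00BF"; // inverted question mark
        String copy = roundTrip(original);

        System.out.println(copy.equals(original)); // true: same characters
        System.out.println(copy == original);      // false: a separate object
    }
}
```

If the output is still garbled after such a round trip, the mismatch is presumably somewhere else, for example in how the stream that prints the string encodes it.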

Alan Moore
Unless the point is to understand how to make the conversion symmetrical, I guess.
McDowell