I'm trying to write Unicode characters (♠) using System.out, but a question mark gets printed instead. I'm using IntelliJ on Windows and printing within the IDE.

+2  A: 

System.out uses the default encoding of the underlying platform, which is typically either a single-byte encoding close to ISO-8859-1 or UTF-8. I get the former: windows-1252 (code page 1252) under XP, which is similar to ISO-8859-1 but not identical.
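To see which encoding applies on a given machine, a minimal sketch along these lines can help. The class name `ShowDefaultEncoding` and the helper `defaultCharsetName` are made up for illustration. On recent JDKs the JVM default charset and the encoding used for the console may differ, so treat the output as a hint rather than the final word.

```java
import java.nio.charset.Charset;

public class ShowDefaultEncoding {
    // Hypothetical helper: returns the name of the JVM's default charset.
    static String defaultCharsetName() {
        return Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        // The charset the JVM falls back to when none is specified.
        System.out.println("Default charset: " + defaultCharsetName());
        // The system property the default was traditionally derived from.
        System.out.println("file.encoding:   " + System.getProperty("file.encoding"));
    }
}
```

If this reports a single-byte encoding such as windows-1252, ♠ cannot be represented in it and will come out as a question mark.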

Is this in a CMD.EXE window, or inside an IDE?

Thorbjørn Ravn Andersen
Inside IntelliJ, for starters.
Xenorose
+1  A: 

Is the file encoding configured correctly? Check that "Settings | File Encodings" uses UTF-8. Printing ♠ works for me when the IDE encoding and all files are set to UTF-8. You may need to recompile after changing the encoding.
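One way to check whether the source-file encoding is the culprit is to compare a literal against its Unicode escape, since the compiler resolves the escape regardless of the file encoding. This is only a sketch; `SourceEncodingCheck` and `describe` are illustrative names.

```java
public class SourceEncodingCheck {
    // Hypothetical helper: lists the code points of a string, e.g. "U+2660".
    static String describe(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> sb.append(String.format("U+%04X ", cp)));
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // The escape \u2660 always compiles to the spade character.
        // Replace it with a literal spade typed in your editor: if the
        // output is no longer U+2660, the file was compiled with the
        // wrong source encoding.
        System.out.println(describe("\u2660"));
    }
}
```

If the literal and the escape both report U+2660 but the console still shows a question mark, the problem lies in the output encoding or the font rather than in the source file.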

Esko Luontola
A: 

Do you have the appropriate fonts installed on your machine? The question mark appears when you don't have fonts for the characters you're outputting.

Also, are you outputting with System.out.println("")? If so, just installing the fonts should work.

If you are writing characters to the System.out stream yourself from within your program, that's different. You have to wrap it in an OutputStreamWriter, which is a character stream that encodes characters to bytes. You can't write characters directly to a byte-oriented stream such as OutputStream.

Look up the API and class reference for OutputStreamWriter and related classes such as PrintWriter. You pass the name of the charset to the OutputStreamWriter constructor. For example,

PrintWriter pw = new PrintWriter(new OutputStreamWriter(System.out, "UTF-8"), true);
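An alternative is to wrap System.out in a PrintStream with an explicit charset. The sketch below assumes that whatever displays the output (the IDE console or terminal) decodes UTF-8; if it does not, the three bytes will show up as garbage instead of ♠. `Utf8PrintStreamDemo` and `encodeAsUtf8` are illustrative names.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

public class Utf8PrintStreamDemo {
    // Hypothetical helper: pushes text through a UTF-8 PrintStream and
    // returns the bytes that would reach the underlying stream.
    static byte[] encodeAsUtf8(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try {
            PrintStream out = new PrintStream(buffer, true, "UTF-8");
            out.print(text);
            out.flush();
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        // Same idea applied to the real console stream.
        PrintStream utf8Out = new PrintStream(System.out, true, "UTF-8");
        utf8Out.println("\u2660"); // spade, written as an escape to avoid source-encoding issues
    }
}
```

The helper makes it easy to confirm what is actually emitted: the spade becomes the three bytes E2 99 A0 in UTF-8.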

rhimbo
Encoding as UTF-8 is only useful if the receiver knows to decode the data as UTF-8. Following this advice as written will probably just result in more confusion.
McDowell
You always have to know the encoding with which to read data containing Unicode characters. The original question said "unicode characters". To me that means it could contain arbitrary characters from any language. If you want to write such data, it must be in an encoding that supports all possible characters in the input set. How else do you propose one do this?
rhimbo
A: 

If you ultimately want to print a wide range of Unicode characters on a standard command line on Windows, there is a bit of work involved. The default raster font does not support the characters, and applications usually need to call the Unicode console API to render them. Java does not do this: it first encodes the characters to the native character set (a lossy process) and then emits them using an ANSI call. You can read this blog post if you want the gory details.
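The lossy step can be reproduced without a console. In this sketch ISO-8859-1 stands in for a legacy single-byte native character set (an assumption for portability; the actual native charset on Windows will usually be a code page such as 1252). `LossyEncodingDemo`, `roundTrip`, and `canEncode` are illustrative names.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class LossyEncodingDemo {
    // Encodes text to bytes in the given charset, then decodes it again.
    // Characters the charset cannot represent are replaced during encoding.
    static String roundTrip(String text, Charset charset) {
        byte[] encoded = text.getBytes(charset);
        return new String(encoded, charset);
    }

    // Reports whether every character of text is representable in charset.
    static boolean canEncode(String text, Charset charset) {
        return charset.newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        String spade = "\u2660";
        // ISO-8859-1 is a stand-in for the native single-byte charset.
        System.out.println(canEncode(spade, StandardCharsets.ISO_8859_1)); // false
        System.out.println(roundTrip(spade, StandardCharsets.ISO_8859_1)); // ?
        System.out.println(canEncode(spade, StandardCharsets.UTF_8));      // true
    }
}
```

The question mark here is produced by the encoder itself, before any font or console is involved, which matches the symptom in the question.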

McDowell