I'm currently working on a Java project that is emitting the following warning when I compile:

/src/com/myco/apps/AppDBCore.java:439: warning: unmappable character for encoding UTF8
    [javac]      String copyright = "� 2003-2008 My Company. All rights reserved.";

I'm not sure how SO will render the character before the date, but it should be a copyright symbol, and is displayed in the warning as a question mark in a diamond.

It's worth noting that the character appears in the output artifact correctly, but the warnings are a nuisance and the file containing this class may one day be touched by a text editor that saves the encoding incorrectly...

How can I inject this character into the "copyright" string so that the compiler is happy, and the symbol is preserved in the file without potential re-encoding issues?

+10  A: 

Use the "\uxxxx" escape format.

According to Wikipedia, the copyright symbol is Unicode U+00A9, so your line should read:

String copyright = "\u00a9 2003-2008 My Company. All rights reserved.";
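
As a quick check of the escape approach, here is a minimal sketch (the class name CopyrightEscape is made up for illustration) showing that the escaped literal compiles to the single character U+00A9 while the source file itself stays pure ASCII:

```java
// Hypothetical demo class; only the string literal comes from the question.
class CopyrightEscape {
    // The Unicode escape keeps this source file 7-bit ASCII, so the
    // compiler's -encoding setting no longer matters for this line.
    static final String COPYRIGHT =
            "\u00a9 2003-2008 My Company. All rights reserved.";

    public static void main(String[] args) {
        // The six-character escape becomes one char with code point 0xA9.
        System.out.println(Integer.toHexString(COPYRIGHT.codePointAt(0))); // a9
        System.out.println(COPYRIGHT.charAt(1) == ' ');                    // true
    }
}
```

Whether the symbol then prints correctly on a console depends on the console's own encoding, which is a separate matter from compiling the source.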
Jon Skeet
Be careful with \uNNNN escapes... they are processed before lexical analysis. For example, if you put the comment /* c:\unit */ in your code, it will no longer compile, because "nit" isn't a valid hex number.
Peter Štibraný
Absolutely. (This is better handled in C#, where Unicode escaping is only applied in certain contexts - but then there's the dangerous \x escape sequence as well, which is awful.)
Jon Skeet
Excellent, the escape format worked great. Thanks Jon!
seanhodges
This sounds more like a band-aid than a cure. The real problem appears to be that you're telling javac to expect source files in UTF-8 when they're really in a single-byte encoding like ISO-8859-1 or windows-1252.
Alan Moore
@Alan M: In my experience, it's a lot easier to make sure you won't have a problem by keeping source files in ASCII than it is to make sure you use the right encoding *everywhere* your source might be compiled (Ant, Eclipse, IDEA etc).
Jon Skeet
The old Seven-bit Solution, sure. But character-encoding problems crop up in many other contexts, too. Every developer has to have a pretty good grasp of the issues involved.
Alan Moore
@Alan: Yes, every developer should know about encodings etc. That doesn't mean it's a good idea to cause problems where you don't have to. I prefer portable code where you don't *have* to choose the encoding (as almost everything is ASCII-friendly).
Jon Skeet
+7  A: 

Try with: javac -encoding ISO-8859-1 file_name.java
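
To see why the encoding flag matters, here is a rough sketch (EncodingMismatch and decode are illustrative names, and this imitates a charset decoder rather than javac's actual internals) of the same byte being read under two encodings. The byte 0xA9 is the copyright sign in ISO-8859-1, but on its own it is not valid UTF-8, so a UTF-8 decoder substitutes the replacement character U+FFFD, the "question mark in a diamond":

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of any build tool.
class EncodingMismatch {
    // Decodes raw bytes with the given charset; malformed input is
    // replaced with U+FFFD, as the String constructor documents.
    static String decode(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    public static void main(String[] args) {
        // Assumed file content: the symbol saved as the single byte 0xA9.
        byte[] raw = { (byte) 0xA9, ' ', '2', '0', '0', '3' };

        String asLatin1 = decode(raw, StandardCharsets.ISO_8859_1);
        String asUtf8 = decode(raw, StandardCharsets.UTF_8);

        System.out.println(Integer.toHexString(asLatin1.charAt(0))); // a9
        System.out.println(Integer.toHexString(asUtf8.charAt(0)));   // fffd
    }
}
```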

Fernando Nah
I like this solution. I added "-encoding UTF-8" as a compilerarg in my ant build.xml and I still get "warning: unmappable character for encoding ASCII". If I change it to "-encoding jjjj" it won't compile, complaining "error: unsupported encoding: jjjj", so I know it is recognizing UTF-8, but it still seems to be treating .java files as ASCII. Sigh.
dfrankow
I tried the "encoding" parameter of the ant javac task, same problem. It recognizes the parameter, but then ignores it somehow.
dfrankow
A: 

If you use Eclipse, you can set the text file encoding for the project. (Eclipse can store the UTF-8 code for you even when you type the character directly: you see the normal character while editing, but the file behind it contains the UTF-8 code.)

  1. Select the project
  2. Right-click and select Properties
  3. Select Resource (at the top of the menu in the dialog opened in step 2)
  4. In the Resource panel, under Text File Encoding, select Other and choose the encoding you want

P.S.: This works if you have static values in your code. For example: String test = "İİİİİııııııççççç";

javaloper
A: 

I had the same problem, where the character index reported in the Java error message was incorrect. I narrowed it down to the double quote character just before the reported position being hex 094 (cancel instead of quote, but displayed as a quote) instead of hex 022. As soon as I swapped in the hex 022 variant, all was fine.
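
When the reported position is off like this, one way to hunt for the offending character is to list every non-ASCII character on the suspect line. A minimal sketch, assuming the line is already loaded into a string (NonAsciiFinder and nonAsciiIndexes are made-up names, and the sample line is invented):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for locating characters outside 7-bit ASCII.
class NonAsciiFinder {
    // Returns the zero-based index of every char above 0x7F in the line.
    static List<Integer> nonAsciiIndexes(String line) {
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < line.length(); i++) {
            if (line.charAt(i) > 0x7F) {
                hits.add(i);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Invented sample: the closing quote is U+0094 instead of U+0022.
        String line = "String s = \"abc" + (char) 0x94 + ";";
        for (int i : nonAsciiIndexes(line)) {
            System.out.printf("index %d: U+%04X%n", i, (int) line.charAt(i));
        }
    }
}
```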

Kelvin Goodson