I'm new to Java, so please bear with me if this is an easy problem. I have a JUnit test in which a hardcoded Japanese word is assigned directly to a String variable. Right after the assignment, the debugger shows the variable as "??", which suggests the encoding is wrong somewhere.

import java.util.Locale;
import junit.framework.TestCase;

public class TestTest extends TestCase {
  public void testLocal() {
    Locale.setDefault(Locale.JAPAN); // same problem with or without this line
    String test = "会社";
    // after this line, the debugger shows that the variable "test" contains "??"
    assertEquals("会社", test);
  }
}

Because this is a test case, I believe it isolates the problem from any UI environment. Please help me with this; I have spent two days on it without finding a solution. Thank you in advance.

+1  A: 

If you've got the same exact string twice, it shouldn't really matter what encoding is being used... but I would suggest using the \uxxxx escape format to make it clear which Unicode characters are actually being used. That way it's basically encoding-independent.
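To illustrate the escape suggestion, here is a minimal sketch. The class and method names are made up for the example. The word 会社 consists of the characters U+4F1A and U+793E, so it can be written with escapes and the source file stays pure ASCII. Dumping the code points of a literal is also a quick way to see what the compiler actually read.

```java
// Sketch: the Japanese word written with Unicode escapes (names are hypothetical).
class UnicodeEscapeDemo {
    // U+4F1A and U+793E are the two characters of the word in the question.
    static final String KAISHA = "\u4f1a\u793e";

    // Render each UTF-16 code unit as U+XXXX, separated by spaces.
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(codePoints(KAISHA)); // prints U+4F1A U+793E
    }
}
```

If the same dump applied to the original literal shows U+003F (a plain question mark), the characters were probably lost before the program ran, most likely when the source file was read.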

If you really want to use string literals with Japanese in your code, check that all your build tools (etc) agree on the file encoding you're using. This will vary between IDE, Ant etc. (It's the -encoding flag for javac, for example.)
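The effect of a tool disagreeing about the file encoding can be simulated in plain Java. The sketch below assumes the source was saved as UTF-8 and then read as ISO-8859-1; the actual pair of encodings on your machine may differ, and the class name is invented for the example.

```java
import java.nio.charset.StandardCharsets;

// Sketch only: simulates a tool reading UTF-8 source bytes with the wrong charset.
class SourceEncodingMismatchDemo {
    // Decode the UTF-8 bytes as if they were ISO-8859-1 (one char per byte).
    static String misread(String original) {
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String word = "\u4f1a\u793e"; // escapes keep this file ASCII-only
        String garbled = misread(word);
        System.out.println(word.length());    // prints 2
        System.out.println(garbled.length()); // prints 6, three bytes per character
    }
}
```

Compiling with the flag mentioned above, for example javac -encoding UTF-8 TestTest.java, tells the compiler which charset to use instead of letting it fall back to the platform default.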

Jon Skeet
A: 

A little update on my earlier comment: I was able to reproduce your results, with the question marks. I did exactly as you did except I changed my shell default LANG settings.

The reason you might be getting question marks is that your environment locale does not match your intended locale. Try doing this first in your shell (Bash):

export LANG="ja_JP.UTF-8"

or on Windows:

set LANG=ja_JP.UTF-8

If that doesn't work, you can try this from your command prompt: chcp 65001 (it switches the console code page to UTF-8), then run your Java program. Sorry to throw out all these suggestions. Hope it works!
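To check what the JVM actually picked up from the environment, a small diagnostic like the following may help. It is only a sketch with an invented class name; it reports the JVM defaults and does not change anything.

```java
import java.nio.charset.Charset;
import java.util.Locale;

// Sketch: report the locale and charset defaults the JVM started with.
class EnvironmentReport {
    static String describe() {
        String nl = System.lineSeparator();
        return "default locale  = " + Locale.getDefault() + nl
             + "default charset = " + Charset.defaultCharset() + nl
             + "file.encoding   = " + System.getProperty("file.encoding");
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Running it before and after changing LANG or the code page shows whether the change reached the JVM at all.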

Hayato
Thank you for your answer. I tried setting LANG (I am on Windows, so I hope it works the same way), but I still have the same problem.
Eugene Ramirez
Edited above answer ^
Hayato
A: 

If your debug output depends on System.out, it is possible that the output is being converted to the default encoding of your platform.

I always run with -Dfile.encoding=UTF8 when wanting to support international character sets (which is nearly always!)

i.e. run as: java -Dfile.encoding=UTF8 MyApp

(NOTE: If you are not running from a CLI, there may be other ways to set these properties.)
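The conversion described above can be reproduced without a console. The sketch below uses US-ASCII as a stand-in for a platform encoding that cannot represent Japanese; that choice and the class name are assumptions for illustration only.

```java
import java.io.PrintStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Sketch: what happens when text is encoded with a charset that cannot represent it.
class OutputEncodingDemo {
    // Encode to bytes and decode again with the same charset, like a console round trip.
    static String roundTrip(String s, Charset cs) {
        return new String(s.getBytes(cs), cs);
    }

    public static void main(String[] args) throws Exception {
        String word = "\u4f1a\u793e"; // escapes keep this file ASCII-only

        // Unmappable characters are replaced with '?' during encoding.
        System.out.println(roundTrip(word, StandardCharsets.US_ASCII)); // prints ??

        // Writing through an explicitly UTF-8 stream avoids the platform default;
        // the result is readable only if the console itself displays UTF-8.
        PrintStream utf8Out = new PrintStream(System.out, true, "UTF-8");
        utf8Out.println(word);
    }
}
```

Wrapping System.out this way is an alternative to the system property when you cannot control how the JVM is launched.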

Glenn