The code should be compiled with the correct encoding:
javac -encoding UTF-8 Foo.java
There is most likely an encoding mismatch somewhere between how the source file was saved and how it is being read. Take this class:
public class Foo {
  char [] a = {'à', 'á', 'â', 'ä' };
}
The above code saved as UTF-8 should become the hex dump:
70 75 62 6C 69 63 20 63 6C 61 73 73 20 46 6F 6F public class Foo
20 7B 0D 0A 20 20 63 68 61 72 20 5B 5D 20 61 20 {__ char [] a
3D 20 7B 27 C3 A0 27 2C 20 27 C3 A1 27 2C 20 27 = {'__', '__', '
C3 A2 27 2C 20 27 C3 A4 27 20 7D 3B 20 20 0D 0A __', '__' }; __
7D 0D 0A 0D 0A }____
The UTF-8 encoding of code point U+00E0 (à) is the two-byte sequence C3 A0, so each accented letter should show up as two bytes in the dump.
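If you want to confirm those byte values from Java itself, something like the following should do it. This is only a sketch: the class name Utf8Bytes and the helper toHex are names I made up for illustration, and the string is written as an escape so the result does not depend on how this file is saved. It also simulates one common mismatch (UTF-8 bytes decoded as ISO-8859-1), which is an assumption about what might be going wrong, not a diagnosis.

```java
import java.nio.charset.StandardCharsets;

class Utf8Bytes {
    // Formats each byte as two uppercase hex digits, separated by spaces.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Small letter a with grave accent, written as an escape.
        String aGrave = "\u00E0";

        System.out.println(toHex(aGrave.getBytes(StandardCharsets.UTF_8)));      // C3 A0
        System.out.println(toHex(aGrave.getBytes(StandardCharsets.ISO_8859_1))); // E0

        // Simulated mismatch: UTF-8 bytes decoded as ISO-8859-1 give two chars.
        String misread = new String(aGrave.getBytes(StandardCharsets.UTF_8),
                                    StandardCharsets.ISO_8859_1);
        System.out.println(misread.length()); // 2
    }
}
```

If a single accented letter turns into two chars at runtime, a mismatch of this kind is a likely suspect.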
There is an outside chance that à will be represented by the combining sequence U+0061 U+0300 (the letter a followed by a combining grave accent). This is the NFD form, though I've never come across a text editor that used it as a default for text entry. As Thorbjørn Ravn Andersen points out, it is often better to use \uXXXX escape sequences, since they are less ambiguous.
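To see the difference between the two forms, a rough sketch using java.text.Normalizer from the standard library follows. The class name NormalizationForms and the helper codeUnits are my own illustrative names, not anything from the question.

```java
import java.text.Normalizer;

class NormalizationForms {
    // Returns the UTF-16 code units of s as space-separated 4-digit hex values.
    static String codeUnits(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%04x", (int) c));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Precomposed form, written as an escape so the file encoding is irrelevant.
        String composed = "\u00E0";
        // Canonical decomposition: base letter plus combining accent.
        String decomposed = Normalizer.normalize(composed, Normalizer.Form.NFD);

        System.out.println(codeUnits(composed));         // 00e0
        System.out.println(codeUnits(decomposed));       // 0061 0300
        System.out.println(composed.equals(decomposed)); // false

        // Normalizing back to NFC makes the two strings comparable again.
        String recomposed = Normalizer.normalize(decomposed, Normalizer.Form.NFC);
        System.out.println(recomposed.equals(composed)); // true
    }
}
```

The two strings render identically but are not equal, which is why normalizing before comparison can matter if the decomposed form does turn up.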
You also need to check the encoding used by your input device (file/console/etc.).
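For file input, one way to check is to print the platform default charset and then read the data with an explicit charset instead of relying on that default. The sketch below assumes the input is a file; InputEncodingCheck and firstLine are hypothetical names, and the temporary file only stands in for whatever you are really reading.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class InputEncodingCheck {
    // Reads the first line of a file, decoding it with the given charset.
    static String firstLine(Path file, Charset charset) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(file, charset)) {
            return reader.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        // What Java falls back to when no charset is given explicitly.
        System.out.println("Default charset: " + Charset.defaultCharset());

        // Stand-in input: a temp file holding one accented letter as UTF-8.
        Path file = Files.createTempFile("encoding-check", ".txt");
        try {
            Files.write(file, "\u00E0".getBytes(StandardCharsets.UTF_8));

            // Decoded with the right charset: one char.
            System.out.println(firstLine(file, StandardCharsets.UTF_8).length());      // 1
            // Decoded with the wrong charset: two chars.
            System.out.println(firstLine(file, StandardCharsets.ISO_8859_1).length()); // 2
        } finally {
            Files.delete(file);
        }
    }
}
```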
As a last resort, you can dump your chars as hex:
System.out.format("%04x", (int) c);
and try manually decoding them with a character inspector to find out what they are.
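If you would rather not look the values up by hand, Character.getName (available since Java 7) can stand in for a basic character inspector. A minimal sketch, where CharInspector, describe and the sample string are all made up for illustration:

```java
import java.util.ArrayList;
import java.util.List;

class CharInspector {
    // Describes each code point in s as its hex value followed by its Unicode name.
    static List<String> describe(String s) {
        List<String> out = new ArrayList<>();
        s.codePoints().forEach(cp ->
            out.add(String.format("U+%04X %s", cp, Character.getName(cp))));
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical suspect value: what the UTF-8 bytes C3 A0 look like
        // after being decoded as ISO-8859-1.
        String suspect = "\u00C3\u00A0";
        for (String line : describe(suspect)) {
            System.out.println(line);
        }
        // Prints:
        // U+00C3 LATIN CAPITAL LETTER A WITH TILDE
        // U+00A0 NO-BREAK SPACE
    }
}
```

Seeing a capital A with tilde followed by an unexpected second character where you expected one accented letter is a typical sign of UTF-8 data being read as a single-byte encoding.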