I need to use a few Cyrillic characters in a Java file, and for Eclipse to allow this I need to change the encoding of that file (I am currently using UTF-8).

Are there any possible problems that this could cause?

+1  A: 

If the Eclipse setting ever gets lost, or the program is built outside Eclipse, the Cyrillic characters could get corrupted without anyone noticing until the program runs the operations that depend on them. This may or may not be an acceptable risk.

Assuming that this is about the program described in this question, a more robust alternative would be to put the Cyrillic characters in an external file instead of directly in the source code, and to read that file with UTF-8 specified explicitly.

Michael Borgwardt
Thanks so much! That's exactly what it is about. Could you elaborate a little more on parsing using UTF-8. Are there any key methods I should use?
Emanuil
@Emanuil: simply use an InputStreamReader and specify the encoding when you read the file. Or use a file format like XML where the encoding is specified by the file itself (requires the proper header and using a proper XML parser that operates on the file directly).
Michael Borgwardt
Thanks again! You've been very helpful.
Emanuil
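For readers following the comments above, here is a minimal sketch of the InputStreamReader suggestion. The class name `Utf8FileReader`, the method `readLines`, and the idea of passing the file path as the first command-line argument are assumptions made for illustration; they do not come from the thread. The point is only that the charset is passed explicitly, so the result does not depend on the platform default or on any Eclipse setting.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper class, named here only for the example.
public class Utf8FileReader {

    /**
     * Decodes the stream as UTF-8 and returns its lines.
     * The caller owns the stream and is responsible for closing it.
     */
    public static List<String> readLines(InputStream in) {
        // The charset is given explicitly, so the platform default is never used.
        BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8));
        return reader.lines().collect(Collectors.toList());
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the path to the external UTF-8 file is the first argument.
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            for (String line : readLines(in)) {
                System.out.println(line);
            }
        }
    }
}
```

Note that printing Cyrillic text to the console can still look wrong if the console itself uses a different encoding; the strings in memory are decoded correctly either way.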
+1  A: 

If it is just a few characters, you can use the \uxxxx notation:

    char[][] translate = { 
        {'\u0430', 'a'},
        {'\u0431', 'b'},
        {'\u0432', 'v'},
        {'\u0433', 'g'},
        // ...
    };  

Also have a look at the native2ascii tool that comes with the JDK; it converts natively encoded text to ASCII, using \uxxxx Unicode escapes for every non-ASCII character.
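As a rough illustration of how a table like the one above might be used, here is a small sketch. The class name `Transliterator` and both of its methods are hypothetical, and only the four pairs shown in the answer are included; the rest of the table is left out because it was elided in the original. Since the source contains only ASCII, it compiles the same way whatever encoding the compiler assumes.

```java
// Hypothetical example class; not part of the original answer.
public class Transliterator {

    // Each row maps one Cyrillic letter, written as a Unicode escape,
    // to its Latin replacement. Only the four pairs from the answer are listed.
    private static final char[][] TRANSLATE = {
        {'\u0430', 'a'},
        {'\u0431', 'b'},
        {'\u0432', 'v'},
        {'\u0433', 'g'},
    };

    /** Replaces every character found in the table; all others pass through unchanged. */
    public static String transliterate(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            out.append(map(input.charAt(i)));
        }
        return out.toString();
    }

    // Linear search is fine for a handful of entries; a Map would suit a larger table.
    private static char map(char c) {
        for (char[] pair : TRANSLATE) {
            if (pair[0] == c) {
                return pair[1];
            }
        }
        return c;
    }

    public static void main(String[] args) {
        // The argument spells the Cyrillic letters be, a, be, a.
        System.out.println(transliterate("\u0431\u0430\u0431\u0430")); // prints "baba"
    }
}
```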

Please note: English is not my first nor my second language, any help would be appreciated

Carlos Heuberger