Is there a way to change the encoding used by the String(byte[]) constructor?

In my own code I use String(byte[],String) to specify the encoding but I am using an external library that I cannot change.

String src = "with accents: é à";
byte[] bytes = src.getBytes("UTF-8");
System.out.println("UTF-8 decoded: "+new String(bytes,"UTF-8"));
System.out.println("Default decoded: "+new String(bytes));

The output is:

UTF-8 decoded: with accents: é à
Default decoded: with accents: Ã© Ã

I have tried changing the system property file.encoding but it does not work.

+1  A: 

Quoted from the documentation of Charset.defaultCharset():

The default charset is determined during virtual-machine startup and typically depends upon the locale and charset of the underlying operating system.

In most OSes you can set the charset using an environment variable.
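To see which charset the JVM picked up at startup, and how it affects String(byte[]), something like the following sketch may help. The class and method names (DefaultCharsetDemo, decodeWithDefault, decodeWith) are made up for illustration; only Charset.defaultCharset(), StandardCharsets and the String constructors come from the JDK.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of any library.
public class DefaultCharsetDemo {

    // Decodes the way new String(byte[]) does: with the JVM default charset.
    public static String decodeWithDefault(byte[] bytes) {
        return new String(bytes);
    }

    // Decodes with an explicitly chosen charset, independent of the default.
    public static String decodeWith(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // \u00e9 is e-acute and \u00e0 is a-grave, written as escapes so the
        // source file's own encoding does not matter.
        byte[] bytes = "with accents: \u00e9 \u00e0".getBytes(StandardCharsets.UTF_8);

        System.out.println("Default charset: " + Charset.defaultCharset());
        System.out.println("Default decoded: " + decodeWithDefault(bytes));
        System.out.println("Latin-1 decoded: " + decodeWith(bytes, StandardCharsets.ISO_8859_1));
        System.out.println("UTF-8 decoded:   " + decodeWith(bytes, StandardCharsets.UTF_8));
    }
}
```

If the "Default decoded" line matches the "Latin-1 decoded" line rather than the UTF-8 one, the default charset is probably a single-byte Western encoding, and it has to be changed before the JVM starts.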

jrudolph
Not really the answer I hoped for (I would have liked to be able to do it dynamically). Giving a sample of how to change the encoding for major OSes would be great. Thanks
Michel
+3  A: 

You need to change the locale before launching the JVM; see:

Java, bug ID 4163515

Some places seem to imply you can do this by setting the file.encoding system property when launching the JVM, such as:

java -Dfile.encoding=UTF-8 ...

...but I haven't tried this myself. The safest way is to set an environment variable in the operating system.
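A small check like the one below could be used to verify whether the flag had any effect on a given JVM. It is only a sketch: EncodingCheck and report() are hypothetical names, and the comment about caching reflects the behaviour described in the question, so treat it as an assumption to confirm on your own JVM version.

```java
import java.nio.charset.Charset;

// Hypothetical helper class, not part of any library.
public class EncodingCheck {

    // Summarizes the file.encoding property and the charset the JVM actually uses.
    public static String report() {
        return "file.encoding=" + System.getProperty("file.encoding")
                + ", defaultCharset=" + Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        System.out.println("At startup:              " + report());

        // Assumption: the default charset is fixed during JVM startup, so
        // changing the property here is not expected to change what
        // new String(byte[]) uses. Compare the two printed lines to confirm.
        System.setProperty("file.encoding", "ISO-8859-1");
        System.out.println("After System.setProperty: " + report());
    }
}
```

Running it once as `java EncodingCheck` and once as `java -Dfile.encoding=UTF-8 EncodingCheck` shows whether the command-line flag changes the reported default charset.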

Mat Mannion