ansaurus

Question

Java Internationalization

Answer 1

+6 A:

Because that method:

Encodes this String into a sequence of bytes using the platform's default charset

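If your default charset is, for example, US-ASCII, you won't get the bytes that represent that Chinese character.

US-ASCII cannot represent the character at all, so the encoder typically substitutes a replacement byte (usually '?') and the original character is lost.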

Try using getBytes(String charsetName) instead:

public byte[] getBytes(String charsetName)

and pass the correct charsetName, for example "UTF-8".
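To make the difference concrete, here is a minimal sketch of a round trip with an explicit charset. The class and method names (GetBytesDemo, roundTrip) are made up for illustration, and it assumes Java 6 or later so that the Charset overloads of getBytes and the String constructor are available; the String charsetName overload from the answer should behave the same way for a supported name, but it throws a checked UnsupportedEncodingException.

```java
import java.nio.charset.Charset;

public class GetBytesDemo {
    // Hypothetical helper: encode and decode with the same named charset.
    public static String roundTrip(String s, String charsetName) {
        Charset cs = Charset.forName(charsetName);
        byte[] bytes = s.getBytes(cs);   // explicit charset, not the platform default
        return new String(bytes, cs);    // decode with the same charset
    }

    public static void main(String[] args) {
        String s = "\u4e2d"; // the Chinese character U+4E2D
        // UTF-8 needs three bytes for this character.
        System.out.println(s.getBytes(Charset.forName("UTF-8")).length); // 3
        // UTF-8 can represent it, so the round trip is lossless.
        System.out.println(roundTrip(s, "UTF-8").equals(s));             // true
        // US-ASCII cannot represent it, so the round trip loses it.
        System.out.println(roundTrip(s, "US-ASCII").equals(s));          // false
    }
}
```

The point is that both directions name the charset, so the result no longer depends on how the JVM happens to be configured.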

OscarRyz 2009-10-19 23:09:18

Answer 2

+2 A:

The getBytes() method uses the default encoding. According to the docs:

The CharsetEncoder class should be used when more control over the encoding process is required.
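As a rough sketch of what that extra control can look like: a CharsetEncoder can be told to report characters it cannot encode instead of silently replacing them. The class and method names below (StrictEncodeDemo, encodeStrict) are hypothetical, and returning null on failure is just one simple way to signal the problem.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;

public class StrictEncodeDemo {
    // Hypothetical helper: returns the encoded bytes, or null if the
    // charset cannot represent every character in the string.
    public static byte[] encodeStrict(String s, String charsetName) {
        CharsetEncoder encoder = Charset.forName(charsetName).newEncoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            ByteBuffer buf = encoder.encode(CharBuffer.wrap(s));
            byte[] out = new byte[buf.remaining()];
            buf.get(out); // copy only the bytes that were actually written
            return out;
        } catch (CharacterCodingException e) {
            return null;  // the text does not fit in this charset
        }
    }

    public static void main(String[] args) {
        String s = "\u4e2d";
        System.out.println(encodeStrict(s, "UTF-8").length);      // 3
        System.out.println(encodeStrict(s, "US-ASCII") == null);  // true
    }
}
```

With this approach a charset mismatch shows up as an explicit failure rather than as corrupted text later on.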

Vincent Ramdhanie 2009-10-19 23:09:30

Answer 3

+1 A:

String t = new String(s.getBytes()); encodes and decodes the string with the platform's default charset, which may be ASCII. Use the following constructor to create the string, passing UTF-8 as the charsetName:

String(byte[] bytes, int offset, int length, String charsetName)
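A small sketch of that constructor in use, assuming the bytes were produced as UTF-8. The names (DecodeDemo, decodeSlice) are invented for the example, and it uses the Charset variant of the constructor (Java 6+) to avoid the checked exception of the charsetName variant.

```java
import java.nio.charset.Charset;

public class DecodeDemo {
    private static final Charset UTF8 = Charset.forName("UTF-8");

    // Hypothetical helper: decode part of a byte array as UTF-8.
    public static String decodeSlice(byte[] bytes, int offset, int length) {
        return new String(bytes, offset, length, UTF8);
    }

    public static void main(String[] args) {
        // Two Chinese characters, three UTF-8 bytes each.
        byte[] bytes = "\u4e2d\u6587".getBytes(UTF8);
        System.out.println(bytes.length);                             // 6
        // Decoding the first three bytes yields the first character.
        System.out.println(decodeSlice(bytes, 0, 3).equals("\u4e2d")); // true
        // Decoding with the wrong charset turns each byte into its own char.
        System.out.println(new String(bytes, Charset.forName("ISO-8859-1")).length()); // 6
    }
}
```

Note that offset and length count bytes, not characters, so a slice has to start and end on character boundaries.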

jatanp 2009-10-19 23:10:03

Answer 4

+1 A:

Actually, I figured this out, sorry for the post. I was using the default Java charset instead of explicitly specifying UTF-8. It works now.

Jon 2009-10-19 23:14:18
