Hello everyone,

What guidelines should I follow to avoid encoding issues when reading files or when converting strings to bytes, bytes to streams, streams to readers, and so on? Any important notes or tutorials would also help.

Best Regards, Keshav

A: 
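  • String<->byte[] conversion is not always lossless: characters the chosen encoding cannot represent are replaced, and bytes that are not valid in that encoding do not survive decoding. So avoid it if possible. (see this question)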

  • When reading a stream (the same goes for writing), specify the desired encoding explicitly using

    new InputStreamReader(inputStream, charset)
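
To illustrate both points above, here is a minimal sketch. The class name `ExplicitCharsetDemo` and the helpers `readAll` and `roundTrips` are hypothetical names made up for this example; only the JDK calls (`InputStreamReader`, `String.getBytes(Charset)`, `new String(byte[], Charset)`) are real API.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ExplicitCharsetDemo {

    // Hypothetical helper: decodes the whole stream using the given charset,
    // never the platform default.
    static String readAll(InputStream in, Charset charset) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(in, charset)) {
            char[] buf = new char[1024];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        }
        return sb.toString();
    }

    // Hypothetical helper: true if the text survives String -> byte[] -> String
    // in the given charset.
    static boolean roundTrips(String text, Charset charset) {
        byte[] bytes = text.getBytes(charset);
        return text.equals(new String(bytes, charset));
    }

    public static void main(String[] args) throws IOException {
        String text = "na\u00efve \u20ac"; // i with diaeresis, euro sign

        // UTF-8 can represent every character, ISO-8859-1 has no euro sign.
        System.out.println(roundTrips(text, StandardCharsets.UTF_8));      // true
        System.out.println(roundTrips(text, StandardCharsets.ISO_8859_1)); // false

        // Decoding with the charset the bytes were written in restores the text.
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        String decoded = readAll(new ByteArrayInputStream(utf8), StandardCharsets.UTF_8);
        System.out.println(text.equals(decoded)); // true
    }
}
```

Decoding the same UTF-8 bytes with `StandardCharsets.ISO_8859_1` would not fail; it would quietly produce different characters, which is why the charset has to be chosen deliberately on both sides.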
    
Bozho
A: 

My advice is to use java.io.Reader / java.io.Writer if possible, and to set the character encoding explicitly whenever you create an InputStreamReader / OutputStreamWriter.
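
A rough sketch of that advice, assuming UTF-8 is the encoding you want for the file. The class `ReaderWriterDemo` and its methods `writeText` and `readText` are hypothetical names for illustration; the temp file is only there to make the example self-contained.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

class ReaderWriterDemo {

    // Writes text through a Writer whose encoding is stated explicitly.
    static void writeText(File file, String text) throws IOException {
        try (Writer writer = new BufferedWriter(
                new OutputStreamWriter(new FileOutputStream(file), StandardCharsets.UTF_8))) {
            writer.write(text);
        }
    }

    // Reads text through a Reader using the same explicit encoding.
    static String readText(File file) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new BufferedReader(
                new InputStreamReader(new FileInputStream(file), StandardCharsets.UTF_8))) {
            char[] buf = new char[1024];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        File tmp = File.createTempFile("encoding-demo", ".txt");
        try {
            String text = "Gr\u00fc\u00dfe, \u4e16\u754c"; // non-ASCII on purpose
            writeText(tmp, text);
            System.out.println(text.equals(readText(tmp))); // true
        } finally {
            tmp.delete();
        }
    }
}
```

Because the rest of the program only ever sees `Reader`, `Writer`, and `String`, the encoding decision lives in exactly two places: where the stream is wrapped for writing and where it is wrapped for reading.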

Thomas Mueller
+2  A: 
  • Avoid converting between bytes and strings needlessly
  • Be aware that wherever you convert between bytes and strings, there is an encoding involved, implicitly or explicitly
  • Be very careful to avoid the API calls that use the platform default encoding (first and foremost: FileReader/FileWriter) unless you are handling user-supplied data without an explicitly declared encoding
  • If the file format / network protocol does have an explicit encoding declaration, make sure you use it correctly
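
As a sketch of the last point: suppose a protocol or file header tells you the charset by name. The class `DeclaredEncodingDemo` and the method `decodeStrict` are hypothetical; how the declared name is obtained (HTTP header, XML prolog, and so on) is outside this example and simply assumed to be a string you already have.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;

class DeclaredEncodingDemo {

    // Decodes the payload with the charset the data itself declares.
    // Malformed input raises CharacterCodingException instead of being
    // silently replaced, so a wrong declaration is noticed early.
    static String decodeStrict(byte[] payload, String declaredCharset)
            throws CharacterCodingException {
        // Throws an unchecked exception if the name is illegal or unsupported.
        Charset charset = Charset.forName(declaredCharset);
        return charset.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)      // made explicit for clarity
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(payload))
                .toString();
    }

    public static void main(String[] args) {
        byte[] latin1 = { (byte) 0xE9 }; // e with acute accent in ISO-8859-1
        try {
            System.out.println(decodeStrict(latin1, "ISO-8859-1").equals("\u00e9")); // true
            decodeStrict(latin1, "UTF-8"); // a lone 0xE9 is not valid UTF-8
        } catch (CharacterCodingException e) {
            System.out.println("declared charset does not match the bytes");
        }
    }
}
```

Whether to fail hard like this or to fall back to replacement characters is a design choice; strict decoding is shown here only because it makes a mismatched declaration visible.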
Michael Borgwardt