Is there an easy way to avoid dealing with text encoding problems?

+1  A: 

The obvious names for these classes are ReaderInputStream and WriterOutputStream. Unfortunately, these are not included in the Java library. However, Google is your friend.

I'm not sure that it is going to get around all text encoding problems, which are nightmarish.

There is an RFE, but it was closed as "will not fix".

Tom Hawtin - tackline
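For small inputs, one workaround that needs no third-party class is to read the whole Reader into memory and encode it in a single step. The following is only a minimal sketch under that assumption; ReaderToStreamSketch and toInputStream are hypothetical names, not JDK classes, and the approach is not suitable for very large streams.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.Reader;
import java.io.StringReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper (not part of the JDK): buffers the entire Reader in memory.
final class ReaderToStreamSketch {

    static InputStream toInputStream(Reader reader, Charset charset) throws IOException {
        StringBuilder text = new StringBuilder();
        char[] buffer = new char[1024];
        int charsRead;
        // Drain the Reader completely before encoding anything.
        while ((charsRead = reader.read(buffer)) != -1) {
            text.append(buffer, 0, charsRead);
        }
        // Encode the complete text at once so multi-char sequences stay intact.
        return new ByteArrayInputStream(text.toString().getBytes(charset));
    }

    public static void main(String[] args) throws IOException {
        InputStream in = toInputStream(new StringReader("caf\u00e9"), StandardCharsets.UTF_8);
        // 'c', 'a', 'f' take one byte each in UTF-8; '\u00e9' takes two.
        System.out.println(in.available()); // prints 5
    }
}
```

The caller still has to choose the charset explicitly, so this does not avoid the encoding question; it only keeps the conversion in one place.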
+7  A: 

You can't really avoid dealing with the text encoding issues, but there are existing solutions:

Reader to InputStream: http://www.koders.com/java/fid0A51E45C950B2B8BD9365C19F2626DE35EC09090.aspx

Writer to OutputStream: http://www.koders.com/java/fid5A2897DDE860FCC1D9D9E0EA5A2834CC62A87E85.aspx?s=md5

You just need to pick the encoding of your choice.

Peter
FYI: the ReaderInputStream code has a bug in the way it reads bytes (it will not work for all encodings). Proof: http://illegalargumentexception.blogspot.com/2009/05/java-rough-guide-to-character-encoding.html#javaencoding_stringclass There is an open bug: https://issues.apache.org/bugzilla/show_bug.cgi?id=40455
McDowell
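As a rough illustration of how a hand-rolled Reader-to-bytes conversion can go wrong, the sketch below compares encoding a string in one call with encoding it one char at a time. This is only one possible failure mode and is not a claim about the exact defect in the linked code; PerCharEncodingDemo and its methods are hypothetical names.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class: shows why encoding char by char is unsafe.
final class PerCharEncodingDemo {

    // Encodes the whole string in a single call.
    static byte[] encodeWhole(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Naive approach: encodes each Java char separately and concatenates the results.
    static byte[] encodePerChar(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (char c : s.toCharArray()) {
            byte[] encoded = String.valueOf(c).getBytes(StandardCharsets.UTF_8);
            out.write(encoded, 0, encoded.length);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        String ascii = "abc";
        // One code point outside the BMP, stored as two Java chars (a surrogate pair).
        String supplementary = "\uD83D\uDE00";

        System.out.println(Arrays.equals(encodeWhole(ascii), encodePerChar(ascii)));
        System.out.println(Arrays.equals(encodeWhole(supplementary), encodePerChar(supplementary)));
    }
}
```

The two approaches agree for plain ASCII but not for the surrogate pair, because a lone surrogate cannot be encoded on its own. A converter that encodes in small independent pieces has to carry that state across calls.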
+1  A: 

Are you trying to write the contents of a Reader to an OutputStream? If so, you'll have an easier time wrapping the OutputStream in an OutputStreamWriter and writing the chars from the Reader to the Writer, instead of trying to convert the Reader to an InputStream:

// 'data' is the source Reader; 'urlConnection' is the connection to write to
final Writer writer = new BufferedWriter(new OutputStreamWriter(urlConnection.getOutputStream(), "UTF-8"));
int charsRead;
char[] cbuf = new char[1024];
// copy chars until the Reader is exhausted
while ((charsRead = data.read(cbuf)) != -1) {
    writer.write(cbuf, 0, charsRead);
}
writer.flush();
// don't forget to close the writer in a finally {} block
Sam Barnum
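The same loop can be packaged as a small self-contained method. This is a sketch, assuming Java 7 or later so that try-with-resources can replace the finally block; ReaderCopySketch and copy are hypothetical names, and an in-memory ByteArrayOutputStream stands in for the connection's output stream.

```java
import java.io.BufferedWriter;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.StringReader;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: copies all chars from a Reader to an OutputStream.
final class ReaderCopySketch {

    static void copy(Reader source, OutputStream target, Charset charset) throws IOException {
        // Closing the writer flushes it and also closes 'target'.
        try (Writer writer = new BufferedWriter(new OutputStreamWriter(target, charset))) {
            char[] buffer = new char[1024];
            int charsRead;
            while ((charsRead = source.read(buffer)) != -1) {
                writer.write(buffer, 0, charsRead);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for urlConnection.getOutputStream().
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        copy(new StringReader("na\u00efve"), sink, StandardCharsets.UTF_8);
        // Four one-byte chars plus one two-byte char in UTF-8.
        System.out.println(sink.size()); // prints 6
    }
}
```

Note that this version closes the target stream when it finishes; if the caller needs to keep the stream open, flush the writer instead of closing it.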
A: 

Also note that, if you're starting off with a String, you can skip creating a StringReader and create an InputStream in one step using org.apache.commons.io.IOUtils from Commons IO like so:

InputStream myInputStream = IOUtils.toInputStream(reportContents, "UTF-8");

Of course you still need to think about the text encoding, but at least the conversion is happening in one step.

+5  A: 

If you are starting off with a String you can also do the following:

new ByteArrayInputStream(inputString.getBytes("UTF-8"))
Ritesh Tendulkar
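On Java 7 and later, the same one-liner can use StandardCharsets.UTF_8 instead of the charset name, which avoids the checked UnsupportedEncodingException that getBytes("UTF-8") declares. A minimal sketch, with StringToStreamSketch and toInputStream as hypothetical names:

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: wraps a String's UTF-8 bytes in an InputStream.
final class StringToStreamSketch {

    static InputStream toInputStream(String text) {
        // The Charset overload of getBytes throws no checked exception.
        return new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        InputStream in = toInputStream("gr\u00fc\u00dfe");
        // Decode with the same charset that was used to encode.
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            System.out.println(reader.readLine());
        }
    }
}
```

Whichever side reads the stream has to decode with the same charset, otherwise the non-ASCII characters will come out garbled.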