views:

1037

answers:

6

Hi,

I've been trying to redirect the System.out PrintStream to a JTextPane. This works fine, except for the encoding of locale-specific characters such as accented letters. I found a lot of documentation about it (see, for example, the mindprod encoding page), but I'm still fighting with it. Similar questions have been posted on Stack Overflow, but as far as I can see none of them addressed the encoding.

First solution:

String sUtf = new String(s.getBytes("cp1252"),"UTF-8");

The second solution should use java.nio, but I don't understand how to use Charset.

Charset defaultCharset = Charset.defaultCharset() ;
byte[] b = s.getBytes();
Charset cs = Charset.forName("UTF-8");
ByteBuffer bb = ByteBuffer.wrap( b );
CharBuffer cb = cs.decode( bb );
String stringUtf = cb.toString();
myTextPane.text = stringUtf

Neither solution works. Any ideas?

Thanks in advance, jgran

A: 

A String in Java does not have an encoding. Strings are backed by a character array, and the characters are always UTF-16 code units for as long as they are handled as String and char values.

The encoding only comes into question when you export or import strings/chars to or from an external representation (or location). The transfer must take place using a sequence of bytes to represent the string.

I think the first solution is close, but also totally confused. First you ask Java to translate the char values to their cp1252-encoded equivalents (the 'word' for the similarly shaped character in the cp1252 'language'). Then you create a string from this byte sequence, stating that this sequence of cp1252 codes is in fact a sequence of UTF-8 codes and should be translated from UTF-8 to the standard in-memory representation (UTF-16).

A string is never UTF-8 or cp1252 or anything like that; it is always characters. Only byte sequences are UTF-8 or cp1252. If you want to translate char values to a UTF-8 byte sequence, you could use:

byte[] utfs = myString.getBytes("UTF-8");
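To illustrate the point about byte sequences, here is a minimal sketch of what the "first solution" from the question does. The class name EncodingDemo and the helper reinterpret are hypothetical names chosen for this example, and it assumes the JVM provides the windows-1252 charset (mainstream JDKs do). The text is written with \u escapes so the result does not depend on the source file's encoding.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingDemo {
    // Encodes the text with one charset and decodes the bytes with another,
    // which is what new String(s.getBytes("cp1252"), "UTF-8") does.
    static String reinterpret(String text, Charset encodeWith, Charset decodeWith) {
        byte[] bytes = text.getBytes(encodeWith);
        return new String(bytes, decodeWith);
    }

    public static void main(String[] args) {
        String text = "caf\u00e9"; // "cafe" with an acute accent on the e
        Charset cp1252 = Charset.forName("windows-1252");

        // Same characters, different byte sequences.
        System.out.println(text.getBytes(StandardCharsets.UTF_8).length); // prints 5
        System.out.println(text.getBytes(cp1252).length);                 // prints 4

        // Matching charsets round-trip cleanly.
        String same = reinterpret(text, StandardCharsets.UTF_8, StandardCharsets.UTF_8);
        System.out.println(same.equals(text)); // prints true

        // Mismatched charsets do not: the lone byte 0xE9 is not valid UTF-8,
        // so the decoder substitutes a replacement character.
        String broken = reinterpret(text, cp1252, StandardCharsets.UTF_8);
        System.out.println(broken.equals(text)); // prints false
    }
}
```

The String itself never changes encoding here; only the choice of charset at the two byte boundaries decides whether the text survives.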

Actually, I think the problem lies elsewhere, probably inside the PrintStream and how it prints its input. You should try to avoid converting strings and chars to and from bytes, because that is always a major source of confusion and trouble. Perhaps you must override all of its methods in order to capture the character data before conversion.

eirikma
A: 

You should create the PrintStream with the right encoding: http://tinyurl.com/ybooutp
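As a rough sketch of what that means, the PrintStream(OutputStream, boolean, String) constructor takes the name of the charset used to turn characters into bytes. The class PrintStreamEncodingDemo and the helper printWith below are hypothetical names for illustration; the example only shows which bytes reach the underlying stream.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

class PrintStreamEncodingDemo {
    // Prints text through a PrintStream created with the given charset name
    // and returns the raw bytes that reached the underlying stream.
    static byte[] printWith(String text, String charsetName) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try {
            PrintStream out = new PrintStream(sink, true, charsetName);
            out.print(text);
            out.flush();
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        String text = "\u00e9"; // e with an acute accent
        System.out.println(printWith(text, "UTF-8").length);      // prints 2 (0xC3 0xA9)
        System.out.println(printWith(text, "ISO-8859-1").length); // prints 1 (0xE9)
    }
}
```

Whatever sits behind the PrintStream has to decode those bytes with the same charset, otherwise accented characters come out garbled.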

Could you please post more code showing what you are trying to do?

marcospereira
A: 

Thank you for your help.

Here's a small Groovy snippet which sums up the situation:

import groovy.swing.SwingBuilder
import javax.swing.*
import javax.swing.text.*

//** originally inspired from http://unserializableone.blogspot.com/2009/01/redirecting-systemout-and-systemerr-to.html
class JTextPaneOutputStream extends OutputStream {
    JTextPane tp;
    Document doc;

    public JTextPaneOutputStream(JTextPane t) {
        super();
        tp = t;
        doc = tp.getDocument();
    }

    public void write(int i) {
        //*** Encoding Problem Here ??
        String s = Character.toString((char)i);
        //String s = String.valueOf((char) i)
        doc.insertString(doc.length, s, null);
        tp.setCaretPosition(doc.length -1)
    }
}
def originalOutput = System.out
def swing = new SwingBuilder()
swing.frame(title: "hi", size: [200, 200], visible: true, pack:true){
    panel(){
        textPane(id: "idTextPaneStdOut", preferredSize: [200,200])
        button(text: "click", actionPerformed: { 
            tpos = new JTextPaneOutputStream(swing.idTextPaneStdOut)
            newOut = new PrintStream(tpos, true, "UTF-8")
            System.setOut(newOut)
            println "café résumé voilà"
        })}}

So how should I encode String s so that characters with diacritics are displayed correctly? Thanks again.

Best regards

jgran
A: 

As you rightly assume, the problem is most likely in:

String s = Character.toString((char)i);

Since you encode with UTF-8, a character may be encoded as more than one byte, so adding each byte you read as a separate character won't work.
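A small sketch of that effect, under the assumption that each byte is masked to its low eight bits (as the write(int) contract describes) before being cast to char. The class ByteAsCharDemo and the method castEachByte are hypothetical names for this example.

```java
import java.nio.charset.StandardCharsets;

class ByteAsCharDemo {
    // Mimics a write(int) that turns every byte into one char,
    // ignoring the encoding the bytes were produced with.
    static String castEachByte(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append((char) (b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "\u00e9"; // e with an acute accent, one char
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8); // two bytes: 0xC3 0xA9

        // One char per byte: two unrelated characters instead of one.
        System.out.println(castEachByte(utf8).length()); // prints 2

        // Decoding the bytes as a unit restores the original character.
        System.out.println(new String(utf8, StandardCharsets.UTF_8).equals(text)); // prints true
    }
}
```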

To make it work, you can try writing all the bytes into a ByteBuffer and using a CharsetDecoder (Charset.forName("UTF-8").newDecoder(), with "UTF-8" to match the PrintStream) to convert them into characters, which you then add to the panel.

I haven't tried it to make sure it works, but I think it is worth a try.
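The idea described above could be sketched roughly as follows. This is only one possible shape, not a tested Swing integration: the class name DecodingOutputStream, the 1024-byte buffer size, and the Consumer<String> sink are all assumptions made for this example. Bytes are collected in a ByteBuffer and decoded on flush; an incomplete multi-byte sequence stays in the buffer until the remaining bytes arrive.

```java
import java.io.OutputStream;
import java.io.PrintStream;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.function.Consumer;

class DecodingOutputStream extends OutputStream {
    private final CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
            .onMalformedInput(CodingErrorAction.REPLACE)
            .onUnmappableCharacter(CodingErrorAction.REPLACE);
    private final ByteBuffer pending = ByteBuffer.allocate(1024);
    private final CharBuffer decoded = CharBuffer.allocate(1024);
    private final Consumer<String> sink; // hypothetical receiver of decoded text

    DecodingOutputStream(Consumer<String> sink) {
        this.sink = sink;
    }

    @Override
    public void write(int b) {
        if (!pending.hasRemaining()) {
            drain(); // make room; at most a partial sequence is left behind
        }
        pending.put((byte) b);
    }

    @Override
    public void flush() {
        drain();
    }

    // Decodes every complete character collected so far and hands it to the sink.
    private void drain() {
        pending.flip();
        // endOfInput = false: a trailing incomplete sequence is left in 'pending'.
        decoder.decode(pending, decoded, false);
        pending.compact();
        decoded.flip();
        if (decoded.hasRemaining()) {
            sink.accept(decoded.toString());
        }
        decoded.clear();
    }

    public static void main(String[] args) throws Exception {
        StringBuilder captured = new StringBuilder();
        PrintStream out = new PrintStream(
                new DecodingOutputStream(s -> captured.append(s)), true, "UTF-8");
        out.print("caf\u00e9");
        out.flush();
        System.out.println(captured.toString().equals("caf\u00e9")); // prints true
    }
}
```

In a Swing application the sink would presumably append to the text pane's document, ideally on the event dispatch thread (for example via SwingUtilities.invokeLater), but that part is left out here.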

Carsten
A: 

Hi,

Thanks for the reply.

But I still cannot get it to work! I'm really lost with the java.nio and java.nio.charset packages. Where's the problem (which might be obvious)? Should I decode from "Cp1252" and encode into "UTF-8", or am I completely lost?

jgran
A: 

Try this code:

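import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.io.PrintStream;
import java.io.Reader;

public class MyOutputStream extends OutputStream {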

private PipedOutputStream out = new PipedOutputStream();
private Reader reader;

public MyOutputStream() throws IOException {
    PipedInputStream in = new PipedInputStream(out);
    reader = new InputStreamReader(in, "UTF-8");
}

public void write(int i) throws IOException {
    out.write(i);
}

public void write(byte[] bytes, int i, int i1) throws IOException {
    out.write(bytes, i, i1);
}

public void flush() throws IOException {
    if (reader.ready()) {
        char[] chars = new char[1024];
        int n = reader.read(chars);

        // this is your text
        String txt = new String(chars, 0, n);

        // write to System.err in this example
        System.err.print(txt);
    }
}

public static void main(String[] args) throws IOException {

    PrintStream out = new PrintStream(new MyOutputStream(), true, "UTF-8");

    System.setOut(out);

    System.out.println("café résumé voilà");

}

}

Vilmantas Baranauskas
Thanks a lot for your solution which could be quickly adapted to my own problem. Regards.
jgran