I have a requirement to read an RTF file containing Thai characters and write it to a text file. I tried TIS-620, MS874, and ISO-8859-11, but the Thai characters do not display properly when I open the resulting output file in Notepad or TextPad. It works fine in WordPad, though. Please guide me.

Thanks and Regards, Ramya.

Code that solved the problem (posted in comment, adding here to make it readable!):

import java.io.DataInputStream;
import java.io.FileInputStream;
import javax.swing.text.DefaultStyledDocument;
import javax.swing.text.Document;
import javax.swing.text.rtf.RTFEditorKit;

FileInputStream fin = new FileInputStream(fileName);
DataInputStream din = new DataInputStream(fin);
// Create a default blank styled document
DefaultStyledDocument styledDoc = new DefaultStyledDocument();
// Create an RTF editor kit
RTFEditorKit rtfKit = new RTFEditorKit();
// Populate the blank styled document with the RTF contents
rtfKit.read(din, styledDoc, 0);
// Get the document that owns the root element
Document doc = styledDoc.getDefaultRootElement().getDocument();
// Print the contents of the RTF document as plain text
System.out.println(doc.getText(0, doc.getLength()));
A: 

From a little Googling, I don't think Notepad handles every character encoding. Could you try re-encoding the text as UTF-8 (or some other Unicode encoding), since Notepad does handle that correctly? You'll want to write a BOM at the start of the file.
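
To make the UTF-8 suggestion concrete, here is a minimal sketch of writing a string as UTF-8 with a leading BOM. The class name Utf8BomWriter, the method writeWithBom, and the file name thai-output.txt are made up for illustration; the text passed in is assumed to be whatever was extracted from the RTF document.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of the original question's code.
class Utf8BomWriter {

    /** Writes text to path as UTF-8, preceded by the byte order mark (bytes EF BB BF). */
    static void writeWithBom(Path path, String text) throws IOException {
        try (OutputStream out = Files.newOutputStream(path);
             Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_8)) {
            // U+FEFF is encoded by the UTF-8 writer as the three BOM bytes.
            writer.write('\uFEFF');
            writer.write(text);
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumed output location; replace with the real target file.
        Path target = Paths.get("thai-output.txt");
        // Thai sample text written with Unicode escapes so the source file's own encoding does not matter.
        writeWithBom(target, "\u0E2A\u0E27\u0E31\u0E2A\u0E14\u0E35");
    }
}
```

The point of the sketch is that the encoding is chosen explicitly on the writer rather than left to the platform default.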

I also stumbled across a tool for converting files in Thai into various other encodings.

Finally, is there a requirement that the files can be opened in Notepad? It's not as if Notepad is the last word in text editing.

Dominic Rodger
FileInputStream fin = new FileInputStream(fileName);
DataInputStream din = new DataInputStream(fin);
//creating a default blank styled document
DefaultStyledDocument styledDoc = new DefaultStyledDocument();
//Creating a RTF Editor kit
RTFEditorKit rtfKit = new RTFEditorKit();
//Populating the contents in the blank styled document
rtfKit.read(din,styledDoc,0);
// Getting the root document
Document doc = styledDoc.getDefaultRootElement().getDocument();
//Printing out the contents of the RTF document as plain text
System.out.println(doc.getText(0,doc.getLength()));
How did that solve the problem? That doesn't do anything with encodings of the file output stream at all!
Dominic Rodger
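
For illustration of the point about the output stream: the text returned by doc.getText is an ordinary Java String, and the bytes that end up in a file or console are decided only when that String is written. The sketch below wraps an output stream in a PrintStream with an explicit charset instead of relying on the platform default used by System.out. The class name OutputEncodingDemo and the method printAs are invented for this example.

```java
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

// Hypothetical demo class, not part of the code posted in the thread.
class OutputEncodingDemo {

    /** Prints text to sink, encoding it with the named charset rather than the platform default. */
    static void printAs(OutputStream sink, String text, String charsetName)
            throws UnsupportedEncodingException {
        PrintStream ps = new PrintStream(sink, true, charsetName);
        ps.print(text);
        ps.flush();
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        // Thai sample text; in the thread's code this would be doc.getText(0, doc.getLength()).
        String text = "\u0E2A\u0E27\u0E31\u0E2A\u0E14\u0E35";
        // Send the text to standard output as UTF-8 bytes.
        printAs(System.out, text, "UTF-8");
    }
}
```

Whether the console or editor then shows the Thai text correctly still depends on it interpreting those bytes as UTF-8.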