When I read an entire XML file into a JEditorPane, everything works fine except for the BOM character: I get a BOM character (shown as a dot) at the start of the file. If I remove the dot and save the file, it is saved as ANSI; Notepad++ shows the encoding of the same file as "ANSI as UTF-8". If I don't remove the dot, the XML parser fails to parse the document. Can you help me with this? Thanks.

+1  A: 

Continue to use UTF-8, but without a BOM. Try EditPlus: go to the menu Document->File Encoding->Change File Encoding, then choose UTF-8.

Artiya4u
+1  A: 

If your XML file contains only ASCII characters, it is valid ASCII/ANSI as well as valid UTF-8, so don't worry about Notepad++ recognizing the file as ANSI.

While you can use a BOM with UTF-8, it is discouraged: many Unix programs do not expect it and will break, so you really shouldn't do it.

klaus
A: 

Using the -D option of the java command, set the system property file.encoding, as suggested in this answer.

java -Dfile.encoding=utf-8
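
If you would rather not depend on the JVM-wide default, another option is to name the charset at the point where the file is read. This is only a minimal sketch: the class name `ExplicitCharsetRead` and the method `readAsUtf8` are made up for illustration. Note that, as far as I know, Java's UTF-8 decoder does not skip a BOM, so a leading BOM will still come back as the character U+FEFF either way.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of any library.
public class ExplicitCharsetRead {

    /** Decodes the whole file as UTF-8, independent of the file.encoding property. */
    public static String readAsUtf8(Path file) throws IOException {
        byte[] bytes = Files.readAllBytes(file);
        // The charset is passed explicitly, so the platform default is never consulted.
        // A leading BOM (bytes EF BB BF) is NOT removed here; it decodes to U+FEFF.
        return new String(bytes, StandardCharsets.UTF_8);
    }
}
```
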
trashgod
A: 

Problem:

UTF-8 does not need a BOM, so many programs don't expect it and fail to parse or handle it. As far as I know, only some Microsoft programs insert it, to detect the UTF-8 encoding faster.

Solution:

  • Remove the BOM; nobody needs it.
  • Don't use buggy editors with non-standard encoding. (=> my opinion)
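
If you cannot control how the file was saved, the first point can also be done in code before the text reaches the JEditorPane or the XML parser. This is a rough sketch, assuming the file really is UTF-8; the class `BomStripper` and its methods are hypothetical names, not an existing API.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of any library.
public class BomStripper {

    // The UTF-8 BOM bytes EF BB BF decode to this single character.
    private static final char BOM = '\uFEFF';

    /** Returns the text without a leading BOM character, if one is present. */
    public static String stripBom(String text) {
        if (!text.isEmpty() && text.charAt(0) == BOM) {
            return text.substring(1);
        }
        return text;
    }

    /** Reads a file as UTF-8 (an assumption about the file) and drops a leading BOM. */
    public static String readUtf8WithoutBom(Path file) throws IOException {
        byte[] bytes = Files.readAllBytes(file);
        return stripBom(new String(bytes, StandardCharsets.UTF_8));
    }
}
```

You could then pass the result to the editor, for example `editorPane.setText(BomStripper.readUtf8WithoutBom(Paths.get("document.xml")))`, where `editorPane` and `document.xml` are placeholders for your own pane and file.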
josefx