Respected All,

I have to read XML files from a server and display the data from all of them. Some of the data contains the character '�', which causes a SAXException during parsing. I have tried converting the input to UTF-8, but the application still crashes as soon as that character is found in a file. I am using SAXParser to parse the XML files.

If you have any solution to this problem, please help me.
Thank you.

A: 

Hi Vikram, this exception looks like an encoding issue. If the character already shows up as ? in the XML itself, the error is in the XML rather than in your application, so check that the characters are encoded properly when the XML is generated. If you are creating the XML with a PHP page, for instance, you can use the htmlentities() function to encode them (bear in mind that XML predefines only five named entities, so HTML named entities such as &eacute; will be rejected unless they are declared; numeric character references are safer). Can you post an extract of your XML, or a link to it, so we can see where the issue is?

Sephy
Hi Sephy. I used your answer to my question 'how to read xml file in UTF-8 format' in my code and it works: I now get a UTF-8 string. One small problem remains: the character '�' is displayed on screen in place of the ' character. I can't use replace all, because when I copy and paste this character into Eclipse I get an error like 'some characters cannot be mapped using Cp1252'. What should I use to remove it? Thank you, Vikram
Vikram
The Cp1252 error is something to fix in Eclipse. First, go to Window > Preferences > General > Workspace. There you should find the text file encoding setting. You are probably on Cp1252; change it to UTF-8. I don't think this will solve your issue, but it will remove the error in Eclipse.
Sephy
A: 

I believe that by default the SAXParser will detect the encoding used. If that doesn't work, you can always specify the encoding manually by using the overloaded parse method that takes an InputSource and calling setEncoding on that InputSource first.
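As a rough illustration of specifying the encoding by hand, here is a minimal sketch using the standard JAXP classes. The class name `ExplicitEncodingParse`, the `TextCollector` handler, and the choice of windows-1252 (Cp1252) are assumptions made for the example, not something taken from the question; the parser bundled with Android may behave differently from the desktop JDK one.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.Charset;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class ExplicitEncodingParse {

    /** Hypothetical handler that simply collects all character data. */
    static class TextCollector extends DefaultHandler {
        final StringBuilder text = new StringBuilder();

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }
    }

    /** Parses the byte stream, telling the parser which encoding to assume. */
    static String parseWithEncoding(InputStream in, String encoding) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        InputSource source = new InputSource(in);
        source.setEncoding(encoding); // the manual override
        TextCollector handler = new TextCollector();
        parser.parse(source, handler); // overload taking an InputSource
        return handler.text.toString();
    }

    public static void main(String[] args) throws Exception {
        // Sample document with a curly apostrophe, encoded as windows-1252
        // and carrying no XML declaration (assumed input for the demo).
        byte[] xml = "<a>it\u2019s</a>".getBytes(Charset.forName("windows-1252"));
        String text = parseWithEncoding(new ByteArrayInputStream(xml), "windows-1252");
        System.out.println(text);
    }
}
```

The override only helps when the bytes really are in the encoding you name; if the guess is wrong, the parse may still succeed but produce the wrong characters.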

If you do not know the encoding, you can wrap your parsing code in a try/catch block and, on a SAXException, re-parse with an explicitly specified encoding. You can repeat this step for a short list of encodings you always want to try.
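The retry idea could look something like the sketch below. It assumes the whole file has already been read into a byte array so it can be parsed more than once; the class name `EncodingFallbackParse`, the `TextCollector` handler, and the candidate list are hypothetical. Depending on the parser, malformed bytes may surface as an IOException subclass rather than a SAXException, so the sketch catches both.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.Charset;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class EncodingFallbackParse {

    /** Hypothetical handler that simply collects all character data. */
    static class TextCollector extends DefaultHandler {
        final StringBuilder text = new StringBuilder();

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }
    }

    /** Tries each candidate encoding in order; returns the text of the first successful parse. */
    static String parseTrying(byte[] xmlBytes, String... encodings)
            throws ParserConfigurationException, SAXException {
        Exception lastFailure = null;
        for (String encoding : encodings) {
            try {
                // A fresh stream and handler per attempt, so a failed parse leaves nothing behind.
                InputSource source = new InputSource(new ByteArrayInputStream(xmlBytes));
                source.setEncoding(encoding);
                TextCollector handler = new TextCollector();
                SAXParserFactory.newInstance().newSAXParser().parse(source, handler);
                return handler.text.toString();
            } catch (SAXException | IOException e) {
                lastFailure = e; // remember the failure and move on to the next candidate
            }
        }
        throw new SAXException("No candidate encoding produced a successful parse", lastFailure);
    }

    public static void main(String[] args) throws Exception {
        // Assumed sample: windows-1252 bytes that are not valid UTF-8.
        byte[] xml = "<a>it\u2019s</a>".getBytes(Charset.forName("windows-1252"));
        System.out.println(parseTrying(xml, "UTF-8", "windows-1252"));
    }
}
```

Order matters here: strict encodings such as UTF-8 should come first, because single-byte encodings such as windows-1252 rarely reject any input and would otherwise hide a wrong guess.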

If that fails, or if the XML mixes several encodings in one file, you'll be out of luck.

JRL