Hi all,

We are using NSXMLParser in Objective-C to parse our XML documents, which are all UTF-8 encoded. One document contains the string "Nestlé" (as in ...<title>Nestlé Novelties</title>...). The parser quits and reports an error with error code 9 because of the accented letter "é" at the end of the word "Nestlé". We also tried opening the same document directly in IE, Chrome, and Safari, and they reported a similar encoding error.

We use UTF-8 for all incoming XML documents, which means that every one of them has "<?xml version="1.0" encoding="UTF-8" ?>" at the top of the document.

Is this an encoding problem? If so, how do we solve this? What encoding should we use for all of our XML documents? Thanks in advance!

Barclay

+4  A: 

Have you checked the file with a hex editor to verify that the "é" is indeed encoded as UTF-8, i.e. the two bytes 0xC3 0xA9?
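
If no hex editor is at hand, the same check can be scripted. The following is only a rough sketch in Python, not anything tied to NSXMLParser: the helper name first_invalid_utf8_offset and the sample strings are made up for illustration, and the Latin-1 sample simply assumes that the file was saved in a single-byte encoding, which is a common cause of this symptom but not confirmed by the question.

```python
def first_invalid_utf8_offset(data: bytes):
    """Return the byte offset of the first invalid UTF-8 sequence, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        return err.start
    return None


# Hypothetical samples: the same text saved as UTF-8 and as Latin-1.
good = "<title>Nestlé Novelties</title>".encode("utf-8")
bad = "<title>Nestlé Novelties</title>".encode("latin-1")

print(good.hex(" "))                    # the "é" shows up as "c3 a9"
print(bad.hex(" "))                     # here the "é" is the single byte "e9"
print(first_invalid_utf8_offset(good))  # None: every byte sequence is valid UTF-8
print(first_invalid_utf8_offset(bad))   # offset of the stray 0xE9 byte
```

If the second case matches your file, the document is not really UTF-8 despite its declaration, and either the file needs to be re-saved as UTF-8 or the encoding in the XML declaration needs to match the real one.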

Anders Lindahl
A: 

In HTML, I would use Nestl&eacute;. Does that work for your application?

Nosredna
Wouldn't work in XML -- only HTML (and XHTML) define the '&eacute;' entity.
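
A quick way to see the difference, sketched here with Python's standard xml.etree.ElementTree rather than NSXMLParser (the helper try_parse is a made-up name for this illustration): the named entity &eacute; is rejected because plain XML does not define it, while a numeric character reference for the same character is accepted.

```python
import xml.etree.ElementTree as ET


def try_parse(xml_text):
    """Return the root element's text, or None if the parser rejects the input."""
    try:
        return ET.fromstring(xml_text).text
    except ET.ParseError:
        return None


named = "<title>Nestl&eacute; Novelties</title>"    # HTML-only named entity
decimal = "<title>Nestl&#233; Novelties</title>"    # decimal character reference
hexadecimal = "<title>Nestl&#xE9; Novelties</title>"  # hexadecimal character reference

print(try_parse(named))        # None: the entity is undefined in plain XML
print(try_parse(decimal))      # parsed text containing "é"
print(try_parse(hexadecimal))  # parsed text containing "é"
```

Numeric references are plain ASCII, so they also sidestep the question of how the file itself is encoded.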
Jim Dovey
A: 

Something I saw just now in an example XML file was that a string containing user-defined input (which happened to include é characters) was wrapped in a CDATA section inside its containing tag. This makes the parser treat the contents as literal character data rather than markup, although the bytes inside must still be valid in the document's declared encoding.
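
To illustrate what CDATA does and does not do, here is a small sketch using Python's standard xml.etree.ElementTree as a stand-in parser (not NSXMLParser). The names HEADER and title_text and both sample documents are invented for the example. The first sample has markup-like characters and a correctly encoded "é" inside CDATA; the second has a Latin-1 "é" byte inside CDATA in a document declared as UTF-8.

```python
import xml.etree.ElementTree as ET

HEADER = b'<?xml version="1.0" encoding="UTF-8"?>'


def title_text(body: bytes):
    """Parse HEADER + body and return the root text, or None on a parse error."""
    try:
        return ET.fromstring(HEADER + body).text
    except ET.ParseError:
        return None


# "<", ">" and "&" are taken literally inside CDATA; "é" is valid UTF-8 here.
markup_like = b"<title><![CDATA[Nestl\xc3\xa9 <Novelties> & Co]]></title>"

# Same idea, but the "é" is the Latin-1 byte 0xE9, which is not valid UTF-8.
wrong_bytes = b"<title><![CDATA[Nestl\xe9 Novelties]]></title>"

print(title_text(markup_like))  # text comes through with the brackets intact
print(title_text(wrong_bytes))  # None: CDATA does not bypass the encoding check
```

So CDATA is useful for text that contains characters which would otherwise be read as markup, but with this parser at least it does not rescue bytes that disagree with the declared encoding.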

Jim Dovey