Hi,

I use NSXMLParser to parse XML documents from a server. They are encoded as UTF-8. My problem is that NSXMLParser breaks at umlauts (ä, ö, ü) and starts a new element.

For example:

Lösen -- NSXMLParser ---> L + ösen

How do I get NSXMLParser to read words with umlauts completely, like every other word?

Regards

+1  A: 

I ran into the same issue with Spanish characters. Check out this post:
http://www.iphonesdkarticles.com/2008/11/parsing-xml-files.html
especially the method
- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string

I'm sure that once your foundCharacters: handling works well together with didEndElement, you'll be fine. :D

James Hall
+6  A: 

Sorry, but based on your comment on the original question (foundCharacters receiving the text in two calls), the parser is behaving perfectly well. See the "Discussion" section of the documentation for the parser:foundCharacters: method, quoted below:

The parser object may send the delegate several parser:foundCharacters: messages to report the characters of an element. Because string may be only part of the total character content for the current element, you should append it to the current accumulation of characters until the element changes.

As you can see, the parser is free to pass your delegate the characters in as many chunks as it sees fit.
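
In case it helps, here is a minimal sketch of the accumulate-then-commit pattern that the documentation describes. The class name TextCollector, the properties currentText and values, and the element name "word" are made up for illustration, and the code assumes ARC; adapt it to your own delegate.

```objc
#import <Foundation/Foundation.h>

// Hypothetical delegate: collects the text of each <word> element,
// even when the parser reports that text in several pieces.
@interface TextCollector : NSObject <NSXMLParserDelegate>
@property (nonatomic, strong) NSMutableString *currentText;        // text of the element being read
@property (nonatomic, strong) NSMutableArray<NSString *> *values;  // completed element texts
@end

@implementation TextCollector

- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName
    attributes:(NSDictionary<NSString *, NSString *> *)attributeDict {
    // Start a fresh buffer whenever an element opens.
    self.currentText = [NSMutableString string];
}

- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string {
    // May be called more than once per element: append, never overwrite.
    [self.currentText appendString:string];
}

- (void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName {
    // Only here is the element's text known to be complete.
    if ([elementName isEqualToString:@"word"] && self.currentText != nil) {
        [self.values addObject:[self.currentText copy]];
    }
    self.currentText = nil;
}

@end

int main(void) {
    @autoreleasepool {
        // Assumed sample document; "word" is a placeholder element name.
        NSData *xml = [@"<words><word>Lösen</word></words>"
                       dataUsingEncoding:NSUTF8StringEncoding];

        TextCollector *collector = [[TextCollector alloc] init];
        collector.values = [NSMutableArray array];

        NSXMLParser *parser = [[NSXMLParser alloc] initWithData:xml];
        parser.delegate = collector;
        [parser parse];

        // Each entry is the full text of one <word>, however it was chunked.
        NSLog(@"%@", collector.values);
    }
    return 0;
}
```

The point of the design is that nothing is stored in foundCharacters: itself; the string is only committed in didEndElement, so it does not matter whether "Lösen" arrives in one call or as "L" followed by "ösen".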

imaginaryboy
+2  A: 

foundCharacters: is not delimited by tags; you need to concatenate the characters passed in until the next call to didEndElement.

Roger Nolan