Hey,

How would I go about removing the "&" symbol from a string? It's making my XML parser fail.

I have tried

[currentParsedCharacterData setString: [currentParsedCharacterData stringByReplacingOccurrencesOfString:@"&" withString:@"and"]];

But it seems to have no effect.

A: 

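For a quick fix, look at stringByReplacingOccurrencesOfString:withString: on NSString:

NSString *str = @"a & b";
// The method returns a new string; str itself is not modified.
NSString *cleaned = [str stringByReplacingOccurrencesOfString:@"&" withString:@"and"]; // better to replace with &amp;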

However, you should also deal with other special characters, e.g. < and >.
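As a rough sketch of that idea, the helper below escapes &, < and > in a piece of plain text. The name EscapeXMLText is made up for illustration, and it assumes the input is text content only, not a string that already contains markup or entities (those would be escaped a second time).

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: escapes the characters that are unsafe in XML text content.
// Assumes `text` is non-nil plain text with no existing markup or entities.
static NSString *EscapeXMLText(NSString *text) {
    NSMutableString *escaped = [NSMutableString stringWithString:text];

    // Ampersand goes first, so the entities added below are not escaped again.
    [escaped replaceOccurrencesOfString:@"&" withString:@"&amp;"
                                options:NSLiteralSearch
                                  range:NSMakeRange(0, escaped.length)];
    [escaped replaceOccurrencesOfString:@"<" withString:@"&lt;"
                                options:NSLiteralSearch
                                  range:NSMakeRange(0, escaped.length)];
    [escaped replaceOccurrencesOfString:@">" withString:@"&gt;"
                                options:NSLiteralSearch
                                  range:NSMakeRange(0, escaped.length)];
    return escaped;
}
```

Calling EscapeXMLText(@"a & b < c") should give back a &amp; b &lt; c.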

Mark
+2  A: 

What this really boils down to is that you want to handle invalid XML gracefully. The XML parser is correctly telling you that this XML is invalid, and so it fails to parse. Assuming you have no control over the XML content, I would suggest pre-processing it for common errors like this one. The output would be a sanitized XML document that has a better chance of parsing successfully.

Sanitizing the document may be as simple as a search and replace. The problem with a blanket replace of every & is that there are valid uses of &, for example &amp; or &copy;. You would end up mangling the XML by creating something like this: andcopy;

You could search for an ampersand followed by a space, but that won't catch a string whose last character is an ampersand (an edge case that is easy to handle separately). What you are really searching for are occurrences of & that are never followed by a ;, or where whitespace appears before the next ;. A semicolon on its own is fine.
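One way to express that rule is a regular expression with a negative lookahead. The sketch below is only an illustration: the function name EscapeBareAmpersands and the pattern are my own, and it assumes NSRegularExpression is available (iOS 4.0 / OS X 10.7 or later). It treats anything shaped like &name;, &#123; or &#x1F; as a valid reference and does not check whether the named entity is actually defined for the document.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: replaces every & that does not start something shaped
// like an entity or character reference with &amp;.
static NSString *EscapeBareAmpersands(NSString *xml) {
    // "&" NOT followed by: name; or #digits; or #xhex;
    NSString *pattern = @"&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#x[0-9A-Fa-f]+);)";
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern
                                                  options:0
                                                    error:&error];
    if (regex == nil) {
        return xml; // pattern failed to compile; leave the input untouched
    }
    return [regex stringByReplacingMatchesInString:xml
                                           options:0
                                             range:NSMakeRange(0, xml.length)
                                      withTemplate:@"&amp;"];
}
```

With this, a & b becomes a &amp; b and a trailing & is also escaped, while &copy; and &amp; are left alone.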

If you need more power to detect this and other errors, I would suggest using NSScanner or regular-expression matching to find them during your sanitization step. XML files are also often large, so be careful about handling them as in-memory strings, since that can easily lead to application crashes. Breaking the input up into manageable chunks is something NSScanner does very well.
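For comparison, here is a rough NSScanner version of the same ampersand check. The function name is hypothetical, and the sketch still works on a single in-memory string; reading a large file in chunks is not shown. It is also looser than a strict entity grammar, since any run of letters, digits or # followed by ; is accepted as a reference.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: walks the string with NSScanner and escapes each &
// that is not followed by a run of name characters and a closing ";".
static NSString *EscapeBareAmpersandsWithScanner(NSString *xml) {
    NSScanner *scanner = [NSScanner scannerWithString:xml];
    [scanner setCharactersToBeSkipped:nil]; // whitespace must not be skipped

    NSCharacterSet *nameChars = [NSCharacterSet characterSetWithCharactersInString:
        @"#abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"];
    NSMutableString *output = [NSMutableString string];

    while (![scanner isAtEnd]) {
        NSString *text = nil;
        // Copy everything up to the next ampersand unchanged.
        if ([scanner scanUpToString:@"&" intoString:&text]) {
            [output appendString:text];
        }
        if (![scanner scanString:@"&" intoString:NULL]) {
            break; // reached the end without another ampersand
        }
        NSString *name = nil;
        BOOL hasName = [scanner scanCharactersFromSet:nameChars intoString:&name];
        BOOL closed = hasName && [scanner scanString:@";" intoString:NULL];
        if (closed) {
            [output appendFormat:@"&%@;", name]; // looks like a reference; keep it
        } else {
            [output appendString:@"&amp;"];      // bare ampersand; escape it
            if (hasName) {
                [output appendString:name];
            }
        }
    }
    return output;
}
```

The same loop is a natural place to add checks for other common errors as the scanner moves through the input.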

slf