Hi,

I was comparing a string coming from XML with another string, and the results showed that they are not equal. But in NSLog() both looked the same (e.g. Valore Books).

Then I checked the source of the XML and found that the actual string is "Valore&#160;Books", where &#160; is in fact a space. The problem is that when I compare it with @"Valore Books", it says the two are not the same.

What should I do?

+1  A: 

Note: I'm replacing my original answer with one that's actually correct for this problem. Sorry for the initial misunderstanding.

The following line will unescape the HTML entities in your string.

NSString *A = @"Valore&#160;Books";
// "Create" in the function name means the caller owns B and must release it when done.
NSString *B = (NSString *)CFXMLCreateStringByUnescapingEntities(NULL, (CFStringRef)A, NULL);

I couldn't find an equivalent higher-level function, but the performance of this one should be excellent. If I read the docs correctly, you can pass a CFDictionaryRef as the third argument to specify extra conversions, but it seems to do a good job of handling the standard ones on its own.

Docs are here.

Note that it's probably a good idea to handle the entity decoding wherever you pull those strings into your program, not every time you compare them.

There is also a second part of this you need to consider. &#160; isn't just a space; it's a non-breaking space, which the above code converts to \312 rather than a standard space. Those are in fact separate characters, so a string comparison between them will fail.

Maybe it'd be easiest to replace &#160; with &#32; using

- (NSString *)stringByReplacingOccurrencesOfString:(NSString *)target withString:(NSString *)replacement

and then running the result through the unescape.
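The replace-then-unescape idea could be sketched roughly as follows. This is only an illustration, assuming a Mac build with ARC enabled (hence the `__bridge` cast and `CFBridgingRelease`); the helper name `UnescapeTreatingNBSPAsSpace` is made up for this example and is not part of any framework.

```objc
#import <Foundation/Foundation.h>
#import <CoreFoundation/CoreFoundation.h>

// Hypothetical helper (name invented for this sketch, ARC assumed).
// Step 1: swap the non-breaking-space entity for the plain-space entity.
// Step 2: unescape every entity in the result.
// Only the decimal spelling "&#160;" is matched; other spellings such as
// "&#xA0;" would need their own replacement.
static NSString *UnescapeTreatingNBSPAsSpace(NSString *raw) {
    NSString *swapped = [raw stringByReplacingOccurrencesOfString:@"&#160;"
                                                       withString:@"&#32;"];
    CFStringRef unescaped = CFXMLCreateStringByUnescapingEntities(
        kCFAllocatorDefault, (__bridge CFStringRef)swapped, NULL);
    // The function follows the Create rule, so hand ownership over to ARC.
    return CFBridgingRelease(unescaped);
}
```

With that in place, the comparison from the question would be written as `[UnescapeTreatingNBSPAsSpace(xmlString) isEqualToString:@"Valore Books"]`, where `xmlString` stands for whatever string came out of the XML.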

It also just occurred to me that CFXMLCreateStringByUnescapingEntities won't be available on the iPhone. Here is a link to an example that shows how to do similar conversions on the iPhone.

Bryan McLemore
`&#160;` is not a percent encoding but an HTML entity. So that method won't help here.
Ole Begemann
Fair enough, although Google searches seem to indicate it would; I haven't tested it. I'll give it a shot when I get into my office, though.
Bryan McLemore
A: 

&#160; is a non-breaking space (Unicode value U+00A0). The regular space (in @"Valore Books") has Unicode value U+0020. So it is not the same character, and the two strings are not equal.
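If the entity has already been decoded by the time you get the string, one possible way to make the comparison succeed is to map U+00A0 to U+0020 before comparing. A minimal sketch using only Foundation; the helper name `ReplaceNBSPWithSpace` is hypothetical and chosen just for this example.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns a copy of s in which every non-breaking
// space (U+00A0) has been replaced by a regular space (U+0020).
static NSString *ReplaceNBSPWithSpace(NSString *s) {
    return [s stringByReplacingOccurrencesOfString:@"\u00A0" withString:@" "];
}

// Demonstrates the mismatch and the fix on the string from the question.
static void DemoCompare(void) {
    NSString *fromXML = @"Valore\u00A0Books";   // looks identical when logged
    BOOL before = [fromXML isEqualToString:@"Valore Books"];                       // NO
    BOOL after  = [ReplaceNBSPWithSpace(fromXML) isEqualToString:@"Valore Books"]; // YES
    NSLog(@"before: %d, after: %d", before, after);
}
```

Whether flattening non-breaking spaces is acceptable depends on the data; if the distinction matters elsewhere, normalize only the copy used for the comparison.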

Mihai Nita