(Sorry if a newb question...I've done quite a bit of research, honestly...)
I'm writing some Ruby on Rails code to parse RSS/Atom feeds. My code is throwing up on a pesky '£' symbol.
I've been trying the approach of normalizing the description and title fields of the feeds before doing anything else:
descr = self.description.mb_chars.normalize(:kc)
However, when it hits the string with the '£', I'm guessing that mb_chars hits a problem and returns a regular Ruby String object. I get the error:
undefined method `normalize' for #<String:0x5ef8490>
So what is the best process to defensively prep these strings for insertion into the database? (I need to do a bunch of string processing on them as well)
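For illustration, here is a minimal sketch of what "defensively prepping" a string could look like in plain Ruby. It assumes a newer Ruby (2.2+) than the one producing the error above, where `String#scrub` and `String#unicode_normalize` are available. `safe_normalize` is a hypothetical helper name, not something from my code or from Rails. It swaps invalid bytes for "?" rather than guessing their meaning, then applies NFKC normalization (the same form as `:kc`).

```ruby
# Hypothetical helper, not part of the original code.
# Assumes Ruby 2.2+ (String#scrub and String#unicode_normalize).
def safe_normalize(text)
  # Work on a copy tagged as UTF-8 so the caller's string is untouched.
  str = text.to_s.dup.force_encoding(Encoding::UTF_8)
  # Replace any byte sequence that is not valid UTF-8 with "?".
  str = str.scrub("?") unless str.valid_encoding?
  # NFKC: compatibility decomposition followed by canonical composition.
  str.unicode_normalize(:nfkc)
end

descr = safe_normalize("\uFB01ne print")  # the "fi" ligature becomes plain "fi"
```

This never raises on bad input, but it discards whatever the invalid bytes were meant to be.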
My problem is compounded in that I don't know the format of the feed I'm processing. For instance, I've had some luck with the following line:
descr = Iconv.new('UTF-8//IGNORE', 'UTF-8').iconv descr
However, when it encounters the '£' it simply truncates everything after that point.
When I display the '£' symbol with String#inspect, it displays as '\243'. Failing a method to 'correctly' deal with this symbol, I'd be happy enough to replace it with another value (like 'GBP'). So help with that code would be appreciated as well.
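For reference, '\243' is octal for byte 0xA3, which is '£' in ISO-8859-1 (and Windows-1252) but is not valid on its own in UTF-8. That hints the string may really be Latin-1. The sketch below assumes Ruby 1.9+ (`String#force_encoding` and `String#encode`) and assumes that anything that is not valid UTF-8 is ISO-8859-1, which may not hold for every feed. `to_utf8` and `pound_to_gbp` are hypothetical helper names.

```ruby
# Hypothetical helpers, not part of the original code.
# Assumption: input that is not valid UTF-8 is ISO-8859-1.
def to_utf8(raw)
  str = raw.to_s.dup.force_encoding(Encoding::UTF_8)
  # Already valid UTF-8: return it unchanged.
  return str if str.valid_encoding?
  # Otherwise reinterpret the bytes as Latin-1 and transcode to UTF-8.
  str.force_encoding(Encoding::ISO_8859_1).encode(Encoding::UTF_8)
end

# Fallback: swap the pound sign for plain ASCII text.
def pound_to_gbp(str)
  to_utf8(str).gsub("\u00A3", "GBP")
end

descr = to_utf8("Transfer fee: \xA35m".b)
puts descr                # Transfer fee: £5m
puts pound_to_gbp(descr)  # Transfer fee: GBP5m
```

Checking `valid_encoding?` first means genuine UTF-8 feeds pass through untouched, and only the odd ones get reinterpreted.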
The feed in question is http://www.dailymail.co.uk/sport/football/index.rss