I'm linkifying @mentions in status messages returned by Twitter's API.

One of the tweets contains a Unicode character. Parsing the JSON (with either the json gem's JSON.parse or ActiveSupport::JSON.decode) returns a string that displays correctly, but the start and end indices that the entity gives for the @mention don't line up with the parsed string.

How can I transform the Unicode string in Ruby so that character indices behave as expected (i.e., the Unicode character is treated as a single character)?

The text of the tweet is:

Thanks! RT @Apigee Have an API? Consider adding a method for simulating errors\u2014an excellent idea from @andrewacove: http://bit.ly/aupTLp ^MG

A: 

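If you are using Ruby on Rails, you can use string.mb_chars.length, which counts multibyte characters rather than bytes. The proxy returned by mb_chars can also be indexed and sliced by character position, so the entity indices should line up with it. See: http://api.rubyonrails.org/classes/ActiveSupport/CoreExtensions/String/Multibyte.html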
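As a rough illustration of why the indices drift, here is a minimal sketch in plain Ruby. It assumes Ruby 1.9 or later, where a UTF-8 String is already indexed by character, which is the same behaviour mb_chars provides on older Rubies. The helper name extract_mention and the constant names are hypothetical, and the indices are computed from the text itself rather than taken from a real API response.

```ruby
# Sketch only: assumes Ruby 1.9+ with a UTF-8 string.
# extract_mention, TWEET, MENTION and MENTION_INDICES are hypothetical names.

TWEET = "Thanks! RT @Apigee Have an API? Consider adding a method for " \
        "simulating errors\u2014an excellent idea from @andrewacove: " \
        "http://bit.ly/aupTLp ^MG"

MENTION = "@andrewacove"

# Return the substring covered by a [start, stop] pair of character
# indices, where stop is exclusive (the shape of an entity's "indices").
def extract_mention(text, indices)
  start, stop = indices
  text[start...stop]
end

# String#index returns a character offset, so this stands in for the
# indices an entity would carry.
start = TWEET.index(MENTION)
MENTION_INDICES = [start, start + MENTION.length]

puts TWEET.length    # number of characters
puts TWEET.bytesize  # number of bytes; larger, because the dash is multibyte
puts extract_mention(TWEET, MENTION_INDICES)

# Treating the same offsets as byte offsets lands on the wrong text.
puts TWEET.byteslice(MENTION_INDICES[0], MENTION.length)
```

The character-based slice returns the mention, while the byte-based slice starts too early because the dash before it occupies more than one byte in UTF-8.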

John Drummond