How can I split a string by Unicode range in Ruby? I want to separate the characters below \u1000 from the characters at or above \u1000 with a comma. For example, I want to turn this string...

I love ျမန္မာ

into this...

I love, ျမန္မာ

You may not be able to see the Unicode characters in my example. They are in the range \u1000 and above.

Thanks.

+2  A: 

It depends on which Ruby version you are using; here is a solution for 1.9. I imagine 1.8 could get ugly.

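This falls down on elegance, but seems to work. Note that \u escapes are hexadecimal, so the range below \u1000 ends at \u0FFF, not \u0999. The comma lands directly before the first high character, so a space that precedes that character stays in front of the comma.

"I love ျမန္မာ".gsub(/([\u0000-\u0FFF])([\u1000-\uFFFF])/, '\1,\2')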

If this method is suitable, you'll have to supply the other case (the high-to-low transition) yourself.
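
For what it's worth, here is a minimal sketch that covers both directions in one pass and also puts the comma before the space, as in the question's expected output. It is not the gsub approach: it swaps in String#scan to cut the text into runs and then joins them. The method name split_at_u1000 is made up for illustration, and the whitespace handling is an assumption on my part: whitespace between two runs of high characters is not treated as a boundary, and whitespace around each boundary is collapsed into ", ".

```ruby
# encoding: utf-8

# Hypothetical helper, not part of the original answer.
# Assumptions: the boundary is fixed at U+1000, whitespace between two
# high runs is not a boundary, and whitespace at each boundary (and at
# the ends of the string) is dropped.
def split_at_u1000(text)
  # First alternative: a run of characters at or above U+1000, which may
  # contain whitespace between high characters.
  # Second alternative: a run of characters below U+1000.
  runs = text.scan(/[^\u0000-\u0FFF]+(?:\s+[^\u0000-\u0FFF]+)*|[\u0000-\u0FFF]+/)

  # Trim each run, discard whitespace-only runs, and join with ", ".
  runs.map(&:strip).reject(&:empty?).join(", ")
end

puts split_at_u1000("I love ျမန္မာ")  # prints "I love, ျမန္မာ"
```

Because the runs are matched in whichever order they occur, the same call handles a string that starts with high characters and then drops back below \u1000.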

Justin Love