If I add this to the beginning of my script:
$KCODE = 'UTF8'
require 'jcode'
then I can iterate over every character of a word that contains Unicode characters. For example, given a word containing umlauts, I iterate over its characters like this:
word.each_char do |c|
  # do something with c
end
If c is a Unicode character and I print its size, I get 2 (it is made up of 2 bytes). How can I get c's code point? Is there a formula I could use, or is there something in the standard library that can do this?
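To make the size observation concrete, here is a minimal sketch of what I am seeing. It assumes a UTF-8 source file and uses "über" as a made-up example word. It calls String#bytesize rather than size so that it reports bytes regardless of Ruby version (on newer Rubies size counts characters, and the $KCODE / jcode setup is not needed there).

```ruby
# -*- coding: utf-8 -*-
# Hypothetical example word; any string with non-ASCII characters would do.
word = "über"

# Pair each character with the number of bytes it occupies in UTF-8.
sizes = word.each_char.map { |c| [c, c.bytesize] }

# Print one line per character so the multi-byte ones stand out.
sizes.each { |c, n| puts "#{c}: #{n} byte(s)" }
```

The umlaut is the only character reported with more than one byte, which is the "size 2" I mention above.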