I want to match every individual word in a given UTF-8 encoded string and then spellcheck each word. My code works as long as the text is English-only, but if it contains, say, German characters, words are split in two at those characters. How can I match whole words that contain both Latin and non-Latin characters?
What I do now is:
text.gsub(/[\w']+/) { |word| "replacement" }
But for text containing "oooäuuu", this produces "replacementäreplacement", i.e. German characters are not treated as part of a word.
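To make the problem reproducible, here is a minimal sketch that contrasts the pattern above with one possible Unicode-aware alternative. It assumes Ruby 1.9 or later and a UTF-8 encoded string, where `\w` only covers ASCII word characters while `\p{L}` and `\p{N}` match any Unicode letter or number. The sample text, the `<...>` markers, and the variable names `ascii_result` and `unicode_result` are illustrative only and not part of my real code.

```ruby
# encoding: utf-8

# Illustrative sample text (assumption: UTF-8 encoded).
text = "oooäuuu isn't split"

# Current pattern: \w is ASCII-only here, so the word breaks at "ä".
ascii_result = text.gsub(/[\w']+/) { |word| "<#{word}>" }

# Possible alternative: \p{L} matches any Unicode letter and \p{N} any
# Unicode number, so "ä" stays inside the word. The underscore and the
# apostrophe are listed explicitly to mirror what [\w'] accepted.
unicode_result = text.gsub(/[\p{L}\p{N}_']+/) { |word| "<#{word}>" }

puts ascii_result    # <ooo>ä<uuu> <isn't> <split>
puts unicode_result  # <oooäuuu> <isn't> <split>
```

The POSIX bracket class `[[:alpha:]]` is also Unicode-aware in Ruby 1.9+ and might serve in place of `\p{L}`; whether either fits depends on what the spellchecker should count as a word.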