This depends heavily on the language (and regex engine) you're using.
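In Perl, \w matches word characters from any language or alphabet, as long as the string has been decoded as Unicode text (for example with use utf8 for literals in the source, or an :encoding(UTF-8) layer on input). Something like /\b(\w+)\b/ would then match Spanish words as well as English or Russian words.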
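To make that concrete, here is a small sketch of the same pattern in Python 3 rather than Perl. That is my substitution, chosen because Python's re module also treats \w and \b as Unicode-aware for str patterns by default. The sample sentences and the names WORD, samples and words are made up for illustration.

```python
import re

# Same pattern as above. In Python 3, \w and \b are Unicode-aware
# for str patterns unless re.ASCII is given.
WORD = re.compile(r"\b(\w+)\b")

# Hypothetical sample text, one short phrase per language.
samples = {
    "English": "hello world",
    "Spanish": "mañana será otro día",
    "Russian": "привет мир",
}

# Collect every captured word for each sample.
words = {lang: WORD.findall(text) for lang, text in samples.items()}

for lang, found in words.items():
    print(lang, found)
```

Each phrase should come back split into whole words, accented and Cyrillic letters included.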
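In languages using PCRE, \w (and therefore \b) does NOT match non-ASCII characters by default. Unless you can turn on PCRE's Unicode property mode (PCRE_UCP, or (*UCP) at the start of the pattern), you will need to build your own set. I suggest something like [\wáéíóúñ] (matches all word characters, plus the accented characters you want; add ü and the uppercase forms if you need them). The PCRE library has to be built with Unicode support, and the pattern has to be compiled in UTF-8 mode, before this will even work.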
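Here is a rough sketch of the difference the custom set makes. It uses Python's re module with the re.ASCII flag to imitate an engine whose \w is ASCII-only. That is an approximation on my part, not PCRE itself, and the sample sentence and variable names are made up.

```python
import re

# Hypothetical sample sentence with accented letters.
text = "El niño comió rápido"

# Unicode-aware \w (Python 3 default for str patterns).
unicode_words = re.findall(r"\w+", text)

# ASCII-only \w: every accented letter ends the current match.
ascii_words = re.findall(r"\w+", text, flags=re.ASCII)

# ASCII-only \w plus an explicit set of accented characters.
# \b is left out on purpose: in ASCII mode it is ASCII-based too,
# so it would not see a boundary before a word starting with "é".
WORD = r"[\wáéíóúüñÁÉÍÓÚÜÑ]+"
custom_words = re.findall(WORD, text, flags=re.ASCII)

print(unicode_words)
print(ascii_words)
print(custom_words)
```

The ASCII-only run breaks "niño" into "ni" and "o", while the custom set keeps each word in one piece. The set only covers the characters you list, so it needs extending for other languages.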
If you're using something else, good luck. Some regex engines don't even support Unicode.