In Python I could have converted it to Unicode and done a '(?u)^[\w ]+$' regex search, but PHP doesn't seem to understand international \w, or does it?
Although I haven't tested it myself, http://us3.php.net/manual/en/reference.pcre.pattern.syntax.php suggests that '/^[\p{L} ]+$/u' would work: \p{L} matches any Unicode letter, and the /u modifier makes the pattern and subject be treated as UTF-8. Apparently you can also write this without the curly brackets: '/^[\pL ]+$/u'.
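For what it's worth, here is a minimal sketch of that suggestion. The function name isLettersAndSpaces is made up for illustration, and the sketch assumes the source file and the input are UTF-8 encoded and that PCRE was built with Unicode support.

```php
<?php
// Hypothetical helper (name is not from the thread): returns true when $text
// consists only of Unicode letters and spaces.
function isLettersAndSpaces(string $text): bool
{
    // \p{L} matches any Unicode letter. The /u modifier tells PCRE to treat
    // the pattern and the subject as UTF-8. preg_match() returns 1 on a match,
    // 0 on no match, and false on error (for example, invalid UTF-8 input).
    return preg_match('/^[\p{L} ]+$/u', $text) === 1;
}

var_dump(isLettersAndSpaces('Привет мир'));     // bool(true)
var_dump(isLettersAndSpaces('naïve café'));     // bool(true)
var_dump(isLettersAndSpaces('hello_world 42')); // bool(false)
```

Note that \p{L} covers letters only, so digits and underscores (which \w would accept) are rejected here.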
AFAIK PHP isn't aware of UTF-8, meaning that PHP itself can only process it bytewise.
PHP's core string functions treat every byte as one character, as if everything were Latin-1, but there are extensions that might be useful to you, like mbstring.
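A small sketch of the bytewise behaviour described above, assuming UTF-8 data. The variable names are only illustrative, and the mbstring call is guarded because that extension is not always installed.

```php
<?php
// "café" written as explicit UTF-8 bytes, so the result does not depend on
// how this source file is saved. The "é" occupies two bytes (0xC3 0xA9).
$word = "caf\xC3\xA9";

// strlen() counts bytes, not characters.
$byteLength = strlen($word);
var_dump($byteLength); // int(5)

// mb_strlen() counts characters in the given encoding, but it only exists
// when the mbstring extension is loaded.
$charLength = null;
if (function_exists('mb_strlen')) {
    $charLength = mb_strlen($word, 'UTF-8');
    var_dump($charLength); // int(4)
}
```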
Getting Unicode working properly everywhere in the code base is one of the "big" features planned for PHP 6.
Until then, the word is that you are advised NOT to use Unicode in PHP, because of the numerous security problems that can arise from it.
A lot of the code just isn't Unicode-aware and therefore isn't safe, so exploits can get through it in really unpleasant ways.