We have this code:
$value = preg_replace("/[^\w]/", '', $value);
where $value is in UTF-8. After this transformation, the first byte of each multibyte character is stripped. How can I make \w cover UTF-8 characters completely?
Sorry, I am not very experienced with PHP.
Try this function instead: http://php.net/manual/en/function.mb-ereg-replace.php
You could try with the /u modifier:
This modifier turns on additional functionality of PCRE that is incompatible with Perl. Pattern strings are treated as UTF-8. This modifier is available from PHP 4.1.0 or greater on Unix and from PHP 4.2.3 on win32. UTF-8 validity of the pattern is checked since PHP 4.3.5.
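As a minimal sketch of that suggestion, the pattern from the question can be given the u modifier. The helper name strip_non_word_chars and the sample string are made up for illustration, and the example assumes a reasonably recent PHP build where /u also makes \w match Unicode letters and digits.

```php
<?php
// Hypothetical helper: removes everything that is not a word character.
// With the u modifier, pattern and subject are both treated as UTF-8,
// so a multibyte character is matched as a whole, not byte by byte.
function strip_non_word_chars(string $value): ?string
{
    return preg_replace('/[^\w]/u', '', $value);
}

// Sample input (assumed to be valid UTF-8, like the source file itself).
$value = "Héllo, wörld!";

// Punctuation and spaces are removed; the accented letters stay intact.
echo strip_non_word_chars($value), "\n";
```

If \w turns out not to cover the letters you need on your installation, an explicit class such as [^\p{L}\p{N}_] with the same u modifier is another option.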
If that won't do, try mb_ereg_replace (Replace regular expression with multibyte support) instead.
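A rough sketch of the mb_ereg_replace route, assuming the mbstring extension is loaded. The helper name mb_strip_non_word_chars and the sample string are hypothetical. Note that mb_ereg_replace takes the pattern without delimiters, unlike preg_replace.

```php
<?php
// Tell the multibyte regex functions that patterns and subjects are UTF-8.
mb_regex_encoding('UTF-8');

// Hypothetical helper: same idea as the preg_replace call in the question,
// but using the multibyte-aware regex engine from mbstring.
function mb_strip_non_word_chars(string $value)
{
    // No /.../ delimiters here: the first argument is the bare pattern.
    return mb_ereg_replace('[^\w]', '', $value);
}

// Sample input (assumed to be valid UTF-8).
$value = "Héllo, мир!";

echo mb_strip_non_word_chars($value), "\n";
```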
There is this nasty u modifier for PCRE patterns in PHP. It states that the regex is encoded in UTF-8, but I found that it treats the input as UTF-8, too.
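One way to see that the input is treated as UTF-8 as well is to pass bytes that are not valid UTF-8. The sketch below uses a hypothetical helper named strip_with_u; with the u modifier, preg_replace returns null for such input and preg_last_error() reports PREG_BAD_UTF8_ERROR.

```php
<?php
// Hypothetical helper wrapping the pattern from the question plus /u.
function strip_with_u(string $bytes): ?string
{
    return preg_replace('/[^\w]/u', '', $bytes);
}

$valid   = "caf\xC3\xA9!"; // "café!" encoded as UTF-8
$invalid = "caf\xE9!";     // Latin-1 "é": not a valid UTF-8 sequence

// Valid UTF-8: only the "!" is removed.
var_dump(strip_with_u($valid));

// Invalid UTF-8: the subject is rejected and the call returns null.
var_dump(strip_with_u($invalid));
var_dump(preg_last_error() === PREG_BAD_UTF8_ERROR);
```

So if the input may arrive in another encoding, it is worth checking for a null result (or validating with mb_check_encoding first) before using the return value.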