Chinese, Japanese, and Korean characters are double-byte or Unicode.
views: 44
answers: 2
+1
A:
The mb_ereg functions implement multibyte-aware regexes. However, you really need to specify more exactly what you mean by "special char".
Edit: I think what you need is the Unicode character classes supported in PCRE, but I'm not sure whether these are supported by the mb_ereg functions or whether the preg functions work with multibyte strings.
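As far as I know, the preg functions do handle multibyte text once the pattern carries the u modifier, which makes PCRE treat both the pattern and the subject as UTF-8. A minimal sketch, assuming UTF-8 input and a PCRE build with Unicode property support; the sample string is made up for illustration:

```php
<?php
// Assumption: this file and $s are UTF-8 encoded.
$s = '日本語 abc 한국어!';

// With /u, \p{L} matches one whole letter (code point), not one byte.
$count = preg_match_all('/\p{L}/u', $s, $matches);

echo $count, "\n";                      // 9 letters: 3 kanji, 3 Latin, 3 hangul
echo implode('|', $matches[0]), "\n";   // each matched letter, intact
```

Without the u modifier the same pattern would be applied byte by byte, so multibyte characters could be split or mismatched.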
Michael Borgwardt
2010-07-14 12:09:14
Everything that is not a letter or a digit counts as "special".
lovespring
2010-07-14 12:13:45
+2
A:
The following regex should work; \p{P} matches punctuation and \p{S} matches symbols.
$s = preg_replace('/\p{P}|\p{S}/u', '', $s);
I couldn't test it because my PCRE build doesn't support \p, \X, etc. I got the error:
PHP Warning: preg_replace(): Compilation failed: support for \P, \p, and \X has not been compiled at offset 1 in test.php on line 3
If you get this error, this page describes a fix
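The missing-support error above can also be detected up front rather than at the call site. A small sketch, where pcre_has_unicode_properties is a hypothetical helper name; it relies only on preg_match() returning false when the pattern fails to compile:

```php
<?php
// Hypothetical helper: reports whether this PHP's PCRE accepts \p{...}
// together with the /u modifier.
function pcre_has_unicode_properties()
{
    // @ suppresses the "Compilation failed" warning on builds without support.
    // preg_match() returns 1 on a match and false on error.
    return @preg_match('/\p{L}/u', 'a') === 1;
}

var_dump(pcre_has_unicode_properties());
```

If this prints bool(false), the \p{P} and \p{S} classes will not compile on that installation and a fallback is needed.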
Amarghosh
2010-07-14 12:24:04