Chinese, Japanese, and Korean characters are double-byte or Unicode.

+1  A: 

The mb_ereg functions implement multibyte-aware regexes. However, you really need to specify more exactly what you mean by "special char".

Edit: I think what you need is the Unicode character classes supported in PCRE, but I'm not sure whether these are supported by the mb_ereg functions or whether the preg functions work with multibyte strings.

Michael Borgwardt
Anything that is not a letter or a digit counts as "special".
lovespring
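
Taking the definition from the comment above (anything that is not a letter or a digit is "special"), a minimal sketch with the preg functions might look like the following. The helper name strip_special is hypothetical, and the sketch assumes the input is valid UTF-8 and that PCRE was built with Unicode property support.

```php
<?php
// Hypothetical helper: strip everything that is not a Unicode letter or digit.
// Assumes $s is valid UTF-8 and PCRE has Unicode property support.
function strip_special(string $s): string
{
    // \p{L} matches any letter (CJK ideographs, kana, and hangul included),
    // \p{N} matches any number. The u modifier makes PCRE treat both the
    // pattern and the subject as UTF-8.
    $result = preg_replace('/[^\p{L}\p{N}]+/u', '', $s);

    // preg_replace returns null on failure (for example invalid UTF-8),
    // so fall back to the unmodified input in that case.
    return $result ?? $s;
}

echo strip_special("日本語、テスト! 123 한국어?"), "\n";
// prints: 日本語テスト123한국어
```

Note that this pattern also removes whitespace and combining marks (\p{M}); add those to the negated class if they should be kept.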
+2  A: 

The following regex should work; \p{P} matches punctuation and \p{S} matches symbols.

$s = preg_replace('/[\p{P}\p{S}]/u', '', $s);

I couldn't test it because my PCRE build doesn't support \p, \X, etc. I got the error:

PHP Warning: preg_replace(): Compilation failed: support for \P, \p, and \X has not been compiled at offset 1 in test.php on line 3

If you get this error, this page describes a fix.

Amarghosh
it works, thank you.
lovespring
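
For reference, the punctuation and symbol approach from the answer above can be sketched with a runtime check for \p support, so that a PCRE build without Unicode properties fails with a clear message instead of a compile warning. Both function names are hypothetical, and the probe relies on preg_match returning false when the pattern does not compile.

```php
<?php
// Returns true if this PCRE build understands \p{...} with the u modifier.
// The @ suppresses the "support for \P, \p, and \X has not been compiled"
// warning that a build without Unicode properties would emit.
function pcre_has_unicode_properties(): bool
{
    return @preg_match('/\p{L}/u', 'a') === 1;
}

// Hypothetical helper: remove punctuation (\p{P}) and symbols (\p{S})
// from UTF-8 text, leaving letters, digits, and whitespace untouched.
function strip_punct_and_symbols(string $s): string
{
    if (!pcre_has_unicode_properties()) {
        throw new RuntimeException('PCRE lacks Unicode property support');
    }

    $result = preg_replace('/[\p{P}\p{S}]/u', '', $s);

    // preg_replace returns null on failure (for example invalid UTF-8).
    return $result ?? $s;
}

echo strip_punct_and_symbols("価格:¥100(税込)+α!"), "\n";
// prints: 価格100税込α
```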