I know that in normal PHP regex (ASCII mode), "\w" (word) means "letter, number, and _", and "\W" is its negation. But what does it mean when you are using multibyte regex with the "u" modifier?
preg_replace('/\W/u', '', $string);
Anything that isn't a letter, number or underscore.
So, in terms of Unicode character classes, \W is equivalent to every character that is not in the L (letter) or N (number) categories and is not the underscore character.
If you were to write it using the \p{xx} syntax, it would be equivalent to [^\p{L}\p{N}_] (each property needs its own \p{}; \p{LN} is not a valid property name).
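A quick way to check this yourself is to run both patterns over the same input and compare the results. This is only a sketch: the sample string is made up, the variable names are illustrative, and it assumes a PHP build where the "u" modifier also turns on PCRE's Unicode property handling for \w and \W. The explicit class spells out the L and N categories as separate \p{L} and \p{N} items.

```php
<?php
// Hypothetical sample input: Latin, accented Latin, Cyrillic, digits,
// an underscore, plus punctuation and a space that should be stripped.
$string = "héllo, мир_42!";

// Shorthand: remove every non-word character, matching on code points.
$viaShorthand = preg_replace('/\W/u', '', $string);

// Explicit form: anything that is not a letter, a number, or "_".
$viaProperties = preg_replace('/[^\p{L}\p{N}_]/u', '', $string);

// Both calls should strip the same characters from this input.
var_dump($viaShorthand === $viaProperties);
echo $viaShorthand, PHP_EOL;
```

If the two results ever differ on your own data, prefer the explicit \p{L}\p{N} class, since it states exactly which categories you want to keep.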