I want to filter a string using the \w character class, but unfortunately it does not match umlauts.

$i = "Die Höhe";    
$x = preg_replace("/[^\w\s]/","",$i);
echo $x; // "Die Hhe";

I could add all the extra characters to the character class in preg_replace, but that is not very elegant, since the list would become very long. At the moment I am only preparing this for German, but more languages are to come.

$i = "Die Höhe";    
$x = preg_replace("/[^\w\säöüÄÖÜß]/","",$i);
echo $x; // "Die Höhe";

Is there a way to match all of them at once?

+1  A: 

Your strings are obviously UTF-8, so you want the 'u' modifier and Unicode properties instead of \w:

$x = preg_replace('/[^\p{L}\p{N} ]/u',"",$i);
stereofrog
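
To spell out the answer above: with the 'u' modifier PCRE treats both pattern and subject as UTF-8, \p{L} matches any Unicode letter and \p{N} any Unicode number. Below is a minimal sketch of how this might be wrapped up, assuming the source file is saved as UTF-8. The function name strip_non_word_chars is hypothetical, and using \s instead of a literal space (to keep tabs and newlines, as in the question's original pattern) is my assumption, not part of the answer.

```php
<?php
// Hypothetical helper, not from the original answer.
function strip_non_word_chars(string $text): string
{
    // Remove everything that is not a Unicode letter, a Unicode number,
    // or whitespace. The /u modifier switches PCRE to UTF-8 mode.
    $result = preg_replace('/[^\p{L}\p{N}\s]/u', '', $text);

    // preg_replace() returns null on failure, e.g. for malformed UTF-8 input.
    if ($result === null) {
        throw new InvalidArgumentException('Input is not valid UTF-8');
    }
    return $result;
}

echo strip_non_word_chars("Die Höhe!"), "\n"; // Die Höhe
```

Note that, unlike \w, the classes \p{L} and \p{N} do not include the underscore; add _ to the character class if it should be kept.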