Hello

I am applying the following function:

<?php

function replaceChar($string){
    $new_string = preg_replace("/[^a-zA-Z0-9\sçéèêëñòóôõöàáâäåìíîïùúûüýÿ]/", "", $string);
    return $new_string;
}

$string = "This is some text and numbers 12345 and symbols !£%^#&$ and foreign letters éèêëñòóôõöàáâäåìíîïùúûüýÿ";

echo replaceChar($string);
?>

This works fine, but if I add ã to the character class in preg_replace, like this:

$new_string = preg_replace("/[^a-zA-Z0-9\sçéèêëñòóôõöàáâãäåìíîïùúûüýÿ]/", "", $string);

$string = "This is some text and numbers 12345 and symbols !£%^#&$ and foreign letters éèêëñòóôõöàáâäåìíîïùúûüýÿã";

It then conflicts with the pound sign £: the pound sign is replaced with the unknown-character symbol (a question mark in a black square).

This is not critical, but does anyone know why this happens?

Thank you,

Barry

UPDATE: Thank you all. I changed the function to add the u modifier (pt2.php.net/manual/en/…), as suggested by Artefacto, and it works a treat:

function replaceChar($string){
    $new_string = preg_replace("/[^a-zA-Z0-9\sçéèêëñòóôõøöàáâãäåìíîïùúûüýÿ]/u", "", $string);
    return $new_string;
}
A: 

Chances are that your string is UTF-8, but preg_replace() is working on bytes.
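
A small sketch of why the byte-level matching bites here, assuming the script and its input are UTF-8. In UTF-8, £ is the bytes C2 A3 and ã is C3 A3, so once ã is in the class, the byte A3 counts as allowed. Without the u modifier, only the C2 of £ is stripped and a lone A3 is left behind, which is not valid UTF-8 and shows up as the unknown-character symbol. The byte escapes below stand in for the literal characters so the example does not depend on how the file is saved.

```php
<?php
// UTF-8 byte sequences written out explicitly (illustrative stand-ins
// for the literal characters in the question).
$pound   = "\xC2\xA3"; // £ (U+00A3)
$a_tilde = "\xC3\xA3"; // ã (U+00E3)

// Without /u, PCRE reads the class [^ã] as "any byte except C3 or A3".
$bytewise = preg_replace("/[^" . $a_tilde . "]/", "", $pound);

// The C2 byte is removed, the A3 byte survives: half a character is left.
echo bin2hex($pound), " -> ", bin2hex($bytewise), "\n"; // c2a3 -> a3

// With a class member that does not contain A3 (é is C3 A9), both bytes
// of £ are removed and nothing is left behind.
$e_acute = "\xC3\xA9"; // é (U+00E9)
$clean   = preg_replace("/[^" . $e_acute . "]/", "", $pound);
var_dump($clean); // string(0) ""
```

None of the accented letters in the original class contains the byte A3, which would explain why the problem only appeared after ã was added.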

Mark Baker
A: 
Mihai Iorga
I've tried this but it didn't work, thank you.
Barry Ramsay
A: 

You might want to take a look at mb_ereg_replace(). As Mark mentioned, preg_replace() only works at the byte level and does not work well with multibyte character encodings.
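
A minimal sketch of that approach, assuming the mbstring extension is available and the input is UTF-8. The sample string and the shortened character class are illustrative only, not the ones from the question.

```php
<?php
// Requires the mbstring extension. Byte escapes are used so the example
// does not depend on the encoding of the source file.
mb_regex_encoding("UTF-8");

$input   = "abc \xC2\xA3 \xC3\xA3 123 !"; // "abc £ ã 123 !"
$allowed = "a-zA-Z0-9\\s\xC3\xA3";         // letters, digits, whitespace, ã

// mb_ereg_replace() takes the pattern without delimiters or modifiers.
$output = mb_ereg_replace("[^" . $allowed . "]", "", $input);

echo $output, "\n"; // £ and ! are dropped, ã is kept intact
```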

Cheers,
Fabian

halfdan
Not entirely true. preg_replace can work with UTF-8 strings. See the u modifier here: http://pt2.php.net/manual/en/reference.pcre.pattern.modifiers.php
Artefacto
Barry Ramsay
Thank you guys. $new_string = preg_replace("/[^a-zA-Z0-9\sçéèêëñòóôõöàáâãäåìíîïùúûüýÿ]/u", "", $string); works a treat.
Barry Ramsay
A: 

If your string is in UTF-8, you must add the u modifier to the regex, like this:

function replaceChar($string){
    $new_string = preg_replace("/[^a-zA-Z0-9\sçéèêëñòóôõöàáâäåìíîïùúûüýÿ]/u", "", $string);
    return $new_string;
}

$string = "This is some text and numbers 12345 and symbols !£%^#&$ and foreign letters éèêëñòóôõöàáâäåìíîïùúûüýÿ";

echo replaceChar($string);
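
A short sketch of what the u modifier changes, using a shortened, illustrative character class and sample string rather than the ones above. It also shows one thing to watch for: with u, a subject that is not valid UTF-8 makes preg_replace() return null instead of a string.

```php
<?php
$pound   = "\xC2\xA3"; // £ in UTF-8
$a_tilde = "\xC3\xA3"; // ã in UTF-8
$input   = "price " . $pound . " " . $a_tilde;

// With /u, pattern and subject are both treated as UTF-8 characters.
$pattern = "/[^a-z\\s" . $a_tilde . "]/u";
$output  = preg_replace($pattern, "", $input);
echo $output, "\n"; // "price  ã": £ is removed as a whole character

// A subject that is not valid UTF-8 (a lone A3 byte here) is rejected.
$broken = preg_replace($pattern, "", "abc\xA3");
var_dump($broken === null);                          // bool(true)
var_dump(preg_last_error() === PREG_BAD_UTF8_ERROR); // bool(true)
```

If the input might arrive in another encoding, checking the return value for null before using it avoids silently losing the whole string.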
Artefacto