I have a site scraped into the $html variable.

Now I want to replace some characters using this expression:

$string1 = preg_replace('/[^A-Za-z0-9äöü!&_=\+-]/i', ' ', $string);

The problem is that special characters come out differently depending on the page's charset.

I have a variable $charset that holds the charset string of the page, e.g. $charset = "utf-8" or "iso-8859-1". In utf-8 the German letter I want to replace appears as ü, in iso-8859-1 as ü.

Is there a way to make the replace function respect the page's charset, without writing a separate regular expression for each possible charset?

A: 

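Or you can try adding

$string = utf8_encode($string);

right before preg_replace. utf8_encode() returns the converted string instead of modifying its argument, so the result has to be assigned. It also assumes the input is ISO-8859-1, so only call it when $charset says iso-8859-1. I'm not sure it will solve your problem, but it might.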

For more information, see: http://se2.php.net/manual/en/function.utf8-encode.php.
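
A rough sketch of the same idea that is not limited to ISO-8859-1: convert the scraped text to UTF-8 first, based on $charset, and then run one regex with the /u modifier. The helper name clean_scraped_text is made up for illustration. The sketch assumes the mbstring extension is loaded, that this source file is saved as UTF-8, and that $charset holds a name mb_convert_encoding() recognizes. utf8_encode() is deprecated as of PHP 8.2, which is another reason to reach for mb_convert_encoding() here.

```php
<?php
// Hypothetical helper (not from the original post): normalize the scraped
// text to UTF-8, then apply a single UTF-8-aware regex.
function clean_scraped_text(string $string, string $charset): ?string
{
    // Convert from the page's declared charset unless it is already UTF-8.
    if (strcasecmp($charset, 'utf-8') !== 0) {
        $string = mb_convert_encoding($string, 'UTF-8', $charset);
    }

    // The /u modifier makes PCRE treat pattern and subject as UTF-8, so
    // ä, ö and ü are matched as whole characters, not as separate bytes.
    // preg_replace() returns null if the subject is not valid UTF-8.
    return preg_replace('/[^A-Za-z0-9äöü!&_=\+-]/iu', ' ', $string);
}
```

With this, the same call works for both charsets from the question, e.g. clean_scraped_text($string, $charset), and the result is always UTF-8.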

matsolof
thanks, also interesting
ndi
Glad I could help. Did you try utf8_encode? Did it work?
matsolof