Hi Guys,
I am parsing text/html from web pages into an XML feed. The text/html is encoded as ISO-8859-1, while the XML feed must be UTF-8. I have used htmlentities, but I am having to manually replace loads of characters. Here is what I have so far (it still does not parse all of the text):
$desc = str_replace(array("\n", "\r", "\r\n"),"",$desc);
$desc = str_replace(array("’","‘","”","“"),"'",$desc);
$desc = str_replace("£","£",$desc);
$desc = str_replace("é","é",$desc);
$desc = str_replace("²","2",$desc);
$desc = str_replace(array("-","•"),"‐",$desc);
$desc = htmlentities($desc, ENT_QUOTES, "UTF-8");
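One possible direction, offered only as a rough sketch: convert the whole string to UTF-8 in one step instead of replacing characters one by one, then escape only the characters that are special in XML. The function name `desc_to_xml_text` is hypothetical. The sketch also assumes the pages are really Windows-1252, which is common when curly quotes and bullets turn up in pages labelled ISO-8859-1. If the source truly is ISO-8859-1, pass that as the second argument. It uses `htmlspecialchars` rather than `htmlentities` because named entities such as `&eacute;` are not predefined in XML.

```php
<?php
// Hypothetical helper (not from the original post): convert legacy-encoded
// text to UTF-8 and escape it for use as XML character data.
// Assumption: the source bytes are Windows-1252 unless stated otherwise.
function desc_to_xml_text(string $desc, string $sourceEncoding = 'Windows-1252'): string
{
    // Convert the raw bytes to UTF-8 in one pass.
    $utf8 = iconv($sourceEncoding, 'UTF-8', $desc);
    if ($utf8 === false) {
        // iconv() returns false when it meets a byte it cannot convert.
        throw new RuntimeException('Could not convert input to UTF-8');
    }

    // Strip line breaks, as the original str_replace() call does.
    $utf8 = str_replace(array("\r\n", "\r", "\n"), '', $utf8);

    // Escape only &, <, >, " and ' so the result is safe inside an XML element.
    return htmlspecialchars($utf8, ENT_QUOTES | ENT_XML1, 'UTF-8');
}
```

With this approach, characters such as £, é and the curly quotes are kept as real UTF-8 characters in the feed rather than being swapped for look-alikes, so the manual str_replace list should not be needed for them.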