hi there!

i got this crappy website i need to parse, and the html element i need to get the contents of contains "€" symbols. the actual html of this page looks like this:

<td>Mais-Lauch-R&ouml;sti <font color=#000000 size=1>(1,2,9,11)</font> mit Paprikasauce <font color=#000000 size=1>(3,9)</font><nobr><b> 2,10 &euro;</b></nobr><br/>........

so i use DOM to get the contents of the element. unfortunately, the result looks like this (via var_dump()):

string(270) "Mais-Lauch-Rösti (1,2,9,11) mit Paprikasauce (3,9) 2,10 €.........

(dom seems to strip all containing tags when using sth like $td->item(0)->nodeValue;)
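for reference, a minimal sketch of the extraction described above. it assumes "DOM" means PHP's built-in DOMDocument, and the table/tr wrapper around the cell is made up so the snippet parses on its own. as far as i know, DOMDocument hands strings back as UTF-8 no matter how the page was encoded, so the decoded € arrives as the three bytes E2 82 AC:

```php
<?php
// Reduced copy of the markup from the question; the <table><tr> wrapper is hypothetical.
$html = '<table><tr><td>Mais-Lauch-R&ouml;sti <font color=#000000 size=1>(1,2,9,11)</font>'
      . ' mit Paprikasauce <font color=#000000 size=1>(3,9)</font>'
      . '<nobr><b> 2,10 &euro;</b></nobr><br/></td></tr></table>';

libxml_use_internal_errors(true);   // collect warnings about the sloppy markup instead of printing them
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$td   = $doc->getElementsByTagName('td');
$text = $td->item(0)->nodeValue;    // text of the cell and all its descendants, tags dropped

var_dump($text);
// Check for the UTF-8 byte sequence of U+20AC (EURO SIGN) in the extracted text.
var_dump(strpos($text, "\xE2\x82\xAC") !== false);
```

nodeValue on an element returns the concatenated text of its children, which would explain why the font, nobr and b tags disappear.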

so the &euro; was parsed to € - fine. but when i try to split the string (that is actually a little longer than the posted excerpt) by the €-symbol by using

$data = explode("€", $data);

it won't work. explode() just won't detect the € symbol. i tried splitting by "&euro;", but this won't work either. i also tried using str_replace() and preg_replace() - but none of them would recognize the symbol :(

am i missing something? what am i doing wrong?

+3  A: 

It's still &euro; in the string - it just displays in the browser as €. You'll need to split on &euro; instead.

Skilldrick
to quote myself: i tried splitting by "&euro;", but this won't work either
xenonite
+1  A: 

$data = explode("&euro;", $data);

Tim Green
that's exactly what i tried
xenonite
A: 

tried it with simple php dom parser... it works :)

xenonite
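a guess at why explode() missed the symbol (an assumption, nothing in this thread confirms it): if the PHP script itself is saved as Windows-1252 or ISO-8859-15, the "€" literal in the source is a single byte, while the string coming out of DOM is UTF-8, where € is three bytes. the two never match. the sketch below spells out the bytes explicitly so it does not depend on the file encoding; $data and its second menu item are invented for illustration:

```php
<?php
// Assumption: $data is UTF-8, as a DOM extension would return it.
// The second entry ("zweites Gericht 1,50") is made up for this example.
$data = "Mais-Lauch-R\xC3\xB6sti (1,2,9,11) mit Paprikasauce (3,9) 2,10 \xE2\x82\xAC"
      . " zweites Gericht 1,50 \xE2\x82\xAC";

$euroUtf8   = "\xE2\x82\xAC";  // U+20AC encoded as UTF-8
$euroCp1252 = "\x80";          // the same character in Windows-1252

// Splitting on the UTF-8 bytes finds both symbols.
$parts = explode($euroUtf8, $data);

// Splitting on the single-byte form finds nothing and returns the whole string as one piece.
$missed = explode($euroCp1252, $data);

var_dump(count($parts), count($missed));
```

if this is the cause, saving the script as UTF-8 (or writing the delimiter as "\xE2\x82\xAC") should make the original explode("€", $data) line behave.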