ansaurus

Question

Answer 1

+1 A:

The problem isn't with html_entity_decode(). The problem is that SimpleXML is treating the contents of the <text> element as XML instead of text. By default, SimpleXML compresses empty elements (<a></a> to <a />). One way to get around this is to import the SimpleXML object into a DOM object, and use the LIBXML_NOEMPTYTAG option when saving the output. The problem with this option is that any <br /> elements will be output as <br></br>.
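A minimal sketch of that DOM workaround follows. The helper name serializeWithoutEmptyTags() and the sample markup are made up for illustration; dom_import_simplexml(), DOMDocument::saveXML() and LIBXML_NOEMPTYTAG are the standard PHP pieces the answer refers to.

```php
<?php
// Hypothetical helper (not from the original answer): serialize a SimpleXML
// element through DOM so that empty elements keep their closing tags.
function serializeWithoutEmptyTags(SimpleXMLElement $element): string
{
    // Get the DOM node that backs the SimpleXML element.
    $node = dom_import_simplexml($element);

    // LIBXML_NOEMPTYTAG writes <a></a> instead of <a/>.
    return $node->ownerDocument->saveXML($node, LIBXML_NOEMPTYTAG);
}

// Sample markup, assumed for illustration only.
$text = simplexml_load_string('<text>See <a href="http://example.com"></a><br /></text>');

// Default SimpleXML serialization collapses the empty <a> element.
echo $text->asXML(), "\n";

// The DOM round trip keeps <a></a>, but also expands <br /> to <br></br>.
echo serializeWithoutEmptyTags($text), "\n";
```

The second line of output shows the trade-off mentioned above: the empty link survives, but every void element such as <br /> is expanded as well.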

The simpler alternative is to request a different response format from the API. I would suggest the JSON response format, parsed with the json_decode() function.
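As a rough sketch of the JSON route: the response body and the "text" field name below are assumptions, since the actual API response isn't shown here, and decodeTextField() is a hypothetical helper.

```php
<?php
// Hypothetical helper: pull an entity-encoded field out of a JSON response
// and decode the entities. The field name "text" is an assumption.
function decodeTextField(string $json): string
{
    $data = json_decode($json, true);
    if (!is_array($data) || !isset($data['text'])) {
        throw new RuntimeException('Unexpected response: ' . json_last_error_msg());
    }

    // The value is a plain string, so no XML serializer touches the markup.
    return html_entity_decode($data['text'], ENT_QUOTES, 'UTF-8');
}

// Made-up response body for illustration.
$body = '{"text": "Visit &lt;a href=\"http://example.com\"&gt;&lt;/a&gt; today"}';

echo decodeTextField($body), "\n";
```

Because the field arrives as a string rather than as parsed XML nodes, the empty <a></a> element is never collapsed.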

Jordan Ryan Moore 2009-12-11 17:02:08

Thanks for your answer. I think you are right.

Azimuth 2009-12-11 17:17:46

Answer 2

+1 A:

That's not strange output; it's valid XML. For an empty element, XML allows a short self-closing syntax that's not always valid in HTML or XHTML:

<foo></foo>
<foo />

The html_entity_decode() function only converts HTML entities back into characters. For example,

&gt; converts to
>

It does not rewrite tags.

You'll need to post-process your XML fragment and convert it into proper HTML. The easiest way to do this is with the DOMDocument API.

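// Parse the fragment with the lenient HTML parser rather than the XML parser.
$foo = new DOMDocument();
$foo->loadHTML('<p> Testing <a href="" /> </p>');
// saveHTML() serializes using HTML rules, so <a> gets an explicit closing tag.
echo $foo->saveHTML();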

This takes an XML fragment and converts it into an HTML document, which includes fixing all the self-closing tags. You'll still need to extract the contents of the <body> element, but that's a lot easier than fixing all the self-closing tags yourself.
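One possible way to pull the contents back out of <body> is sketched below. fragmentToHtml() is a hypothetical helper, not part of the DOM API; it assumes PHP 5.3.6 or later, where saveHTML() accepts a node argument.

```php
<?php
// Hypothetical helper: load a fragment with the HTML parser and return only
// the markup inside <body>, serialized with HTML rules.
function fragmentToHtml(string $fragment): string
{
    $doc = new DOMDocument();

    // Collect parser warnings internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($fragment);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $body = $doc->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return '';
    }

    // Serialize each child of <body>, skipping the doctype and html wrappers.
    $html = '';
    foreach ($body->childNodes as $child) {
        $html .= $doc->saveHTML($child);
    }
    return $html;
}

echo fragmentToHtml('<p> Testing <a href="" /> </p>'), "\n";
```

Note that the HTML parser treats <a href="" /> as an ordinary opening tag, so the link may end up wrapping whatever text follows it in the fragment.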

Alan Storm 2009-12-11 17:03:59

@Alan, please read my comment to the first answer

Azimuth 2009-12-11 17:09:04


PHP html_entity_decode and HTML <a> tag
