Hey all, simple question: is it possible to add a block of HTML to a SimpleXMLElement (or, for that matter, DOMDocument) node without the HTML being automatically converted into entities?

For example, take this snippet (with DOMDocument here, but SimpleXMLElement behaves exactly the same):

<?php
$dom = new DOMDocument( '1.0', 'utf-8' );
$de = $dom->createElement( 'content', '<p>some <a>stuff</a></p>' );
$dom->appendChild( $de );
echo $dom->saveXML();
?>

Rendered in a browser, the output looks like this:

<p>some <a>stuff</a></p>

If you take a look at the source, you'll see:

<?xml version="1.0" encoding="utf-8"?>
<content>&lt;p&gt;some &lt;a&gt;stuff&lt;/a&gt;&lt;/p&gt;</content>

... the HTML block got automatically converted into entities.

Even wrapping the block in CDATA markers doesn't help, as the angle brackets of the CDATA markers get converted too.

So, is there a way to add HTML blocks like this without performing this auto-conversion?

Thanks, m^e

+1  A: 

Actually, this behaviour is intended. You create a new element (content) and assign a text node to it. If the text contains XML special characters, they are escaped in the final serialization.

If you do not want this behaviour, you have to explicitly create element nodes from your string first, e.g. with loadHTML, and then add those elements with appendChild.
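A minimal sketch of that approach, assuming the HTML fragment is simple enough for loadHTML to parse cleanly. The variable names ($src, $body, $xml) are just for illustration. Since loadHTML wraps the fragment in html and body elements, the children of body are what gets copied over, and importNode is needed because the nodes belong to a different document:

```php
<?php
$html = '<p>some <a>stuff</a></p>';

// Parse the HTML string into real element nodes.
// loadHTML() wraps the fragment in <html><body>...</body></html>.
$src = new DOMDocument();
$src->loadHTML( $html );
$body = $src->getElementsByTagName( 'body' )->item( 0 );

// Build the target XML document.
$dom = new DOMDocument( '1.0', 'utf-8' );
$de = $dom->createElement( 'content' );
$dom->appendChild( $de );

// Nodes from another document must be imported before appending.
foreach ( $body->childNodes as $child ) {
    $de->appendChild( $dom->importNode( $child, true ) );
}

$xml = $dom->saveXML();
echo $xml;
```

With this, the p and a tags should end up as real child elements of content rather than as escaped text. Note that the HTML then has to survive being serialized as XML, so this suits well-formed fragments best.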

Boldewyn
Thanks for the suggestion. Much appreciated.
miCRoSCoPiC_eaRthLinG
+1  A: 

The problem is that you are creating an XML document and SimpleXMLElement creates valid mark-up.

The HTML string is passed in as text content, not as markup, so its special characters are escaped to keep the XML valid.

To create a CDATA section you could try DOMDocument::createCDATASection.

Martijn Dijksterhuis
Yep. I figured out the same myself just now.. Thanks for your input :)
miCRoSCoPiC_eaRthLinG
A: 

I believe I've found a solution while wading through the PHP manual.

DOMDocument has a method named createCDATASection that will help you achieve this, albeit in a slightly roundabout way.

Here's the version of the code posted above using this method:

<?php
$dom = new DOMDocument( '1.0', 'utf-8' );
$de = $dom->createElement( 'content' );
// Wrap the HTML in a CDATA section so it is serialized verbatim.
$dd = $dom->createCDATASection( '<p>some <a>stuff</a></p>' );
$de->appendChild( $dd );
$dom->appendChild( $de );
echo $dom->saveXML();
?>

The output is the desired result:

<?xml version="1.0" encoding="utf-8"?>
<content><![CDATA[<p>some <a>stuff</a></p>]]></content>

This will help anyone facing a similar problem to get rolling...
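Since the question also asked about SimpleXMLElement: it has no CDATA method of its own, but as far as I can tell the same trick works by dropping down to DOM with dom_import_simplexml. A rough sketch, where the root element and the variable names are made up for the example:

```php
<?php
$sx = new SimpleXMLElement( '<root/>' );
$content = $sx->addChild( 'content' );

// Get the DOM node behind the SimpleXMLElement; both share the same tree.
$node = dom_import_simplexml( $content );

// Create the CDATA section on the owning document and attach it.
$cdata = $node->ownerDocument->createCDATASection( '<p>some <a>stuff</a></p>' );
$node->appendChild( $cdata );

$xml = $sx->asXML();
echo $xml;
```

Because the SimpleXMLElement and the DOM node refer to the same underlying tree, the CDATA section shows up when you call asXML() on the original object.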

Additional suggestions are most welcome :)

Cheers, m^e

miCRoSCoPiC_eaRthLinG