




I'm having some difficulty outputting a string of HTML code through XML. I have presented an example below. On the server side I have some code written in PHP.

$htmlCode = "<div>...........................</div>";

header("Content-type: text/xml");

echo "<?xml version='1.0' encoding='ISO-8859-1'?>";
echo "<info>";
echo "<htmlCode>";
echo $htmlCode;
echo "</htmlCode>";
echo "</info>";

The problem is that the HTML string ($htmlCode above) contains tag elements, so it is parsed as XML markup. I want the output to be treated as a plain string.

On the client side I have an AJAX call to retrieve the string of HTML code.

document.getElementById('someID').innerHTML = xmlhttp.responseXML.getElementsByTagName("htmlCode")[0].childNodes[0].nodeValue; // I get nothing because the string is treated as XML code.

How do I solve this problem? I hope I have been specific enough for you to understand my problem.

+5  A: 

You are looking for CDATA.

The term CDATA refers to text data that should not be parsed by the XML parser.

Everything inside a CDATA section is passed through as literal character data; the parser does not interpret it as markup.

A CDATA section starts with <![CDATA[ and ends with ]]>:

// split any "]]>" in the data across two CDATA sections so it cannot end the section early
$htmlCode = str_replace("]]>", "]]]]><![CDATA[>", $htmlCode);

echo "<?xml version='1.0' encoding='ISO-8859-1'?>";
echo "<info>";
echo "<htmlCode>";
echo "<![CDATA[".$htmlCode."]]>";
echo "</htmlCode>";
echo "</info>";

Added escaping fix from here
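
One client-side caveat: once a "]]>" in the data has been split, the htmlCode element holds more than one CDATA node, so childNodes[0].nodeValue returns only the first piece. Below is a minimal sketch of a helper that joins the pieces. The name readElementText is made up for illustration, and in browsers that support it the standard textContent property should give the same result.

```javascript
// Hypothetical helper: concatenates the values of all text and CDATA children.
// 3 and 4 are the standard DOM nodeType values for TEXT_NODE and CDATA_SECTION_NODE.
function readElementText(element) {
  var text = "";
  for (var i = 0; i < element.childNodes.length; i++) {
    var child = element.childNodes[i];
    if (child.nodeType === 3 || child.nodeType === 4) {
      text += child.nodeValue;
    }
  }
  return text;
}

// Intended use in the browser, with the names from the question:
// var node = xmlhttp.responseXML.getElementsByTagName("htmlCode")[0];
// document.getElementById('someID').innerHTML = readElementText(node);
```

The loop only relies on childNodes, nodeType and nodeValue, so it should also work in older browsers whose XML nodes lack textContent.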

+1 Beat me to the punch.
Biff MaGriff
thanks for the answer. really appreciate it!
This is not watertight. If the HTML data contains the sequence `]]>` (which is quite valid to have in HTML), it will end the CDATA section prematurely.
@bobince good point, fixed.
Hi Pekka! I used the first answer you gave me and, after some time, it worked without ]]>. I don't know how I did it, but at least it worked. I was about to post a question about how to remove "]]>".
+1  A: 

Use htmlspecialchars(). Although this function is named after HTML, it is also quite acceptable for XML. (htmlentities() wouldn't be, but you almost never want to use that one anyway.)

    $htmlCode = "<div>...........................</div>";

    header("Content-type: text/xml");
    // the XML declaration must include the version attribute
    echo '<?xml version="1.0" encoding="ISO-8859-1"?>'; // really? sure?
    echo '<htmlCode>' . htmlspecialchars($htmlCode) . '</htmlCode>';

Using a CDATASection is also OK, but given that you need to escape the sequence ]]> in that case, there's really not much advantage over XML-encoding.
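
To make concrete what XML-encoding does to the markup, here is a rough sketch in JavaScript of the replacements involved. It is not PHP's implementation: escapeXml is a hypothetical name, and the exact set of characters htmlspecialchars() escapes depends on the flags passed to it.

```javascript
// Hypothetical helper, not PHP's implementation: replaces the five characters
// that have predefined entities in XML.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, "&amp;")   // must run first so the entities added below are not re-escaped
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

console.log(escapeXml('<div class="a">Tom & Jerry</div>'));
// prints: &lt;div class=&quot;a&quot;&gt;Tom &amp; Jerry&lt;/div&gt;
```

The XML parser on the client turns the entities back into the original characters, so the element's text is the HTML string again, and a "]]>" in the data needs no special handling.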

+3  A: 

Don't build XML from strings. Just don't. There are ready-to-use libraries that do the right thing. PHP's DOM implementation is one of them.

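$myHtmlString = "<div>Some HTML</div>";

$xml  = new DOMDocument('1.0', 'utf-8');

// elements must be appended, otherwise saveXML() outputs an empty document
$info = $xml->createElement('info');
$xml->appendChild($info);

// a text node is escaped (<, > and &) when the document is serialized
$htmlCode = $xml->createElement('htmlCode');
$htmlCode->appendChild($xml->createTextNode($myHtmlString));
$info->appendChild($htmlCode);

echo $xml->saveXML();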

This seems like the more complicated approach, but in fact this makes sure your XML is correct. In contrast, just throwing a few strings together will go wrong at some point.

+1 for the obviously *real* right answer :)
Using the DOM classes is the best way to handle XML. However, I think it would be even better if the example had the string $myHtmlString loaded as an XML fragment.