This is my code:

<?php
$data = <<<EOL
<?xml version="1.0"?>
<!DOCTYPE html PUBLIC
    "-//W3C//DTD XHTML 1.0 Strict//EN" 
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html>
    <script type="text/javascript">
    //<![CDATA[
    var a = 123; // JS code
    //]]>
    </script>
</html>
EOL;

$dom = new DOMDocument();
$dom->preserveWhiteSpace = false;
$dom->formatOutput = false;
$dom->loadXML($data);
echo '<pre>' . htmlspecialchars($dom->saveXML()) . '</pre>';

This is the result:

<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<script type="text/javascript"><![CDATA[
//]]><![CDATA[
var a = 123; // JS code
//]]><![CDATA[
]]></script></html>

If I remove the DOCTYPE declaration from the XML document, the CDATA section is preserved as written, and the leading/trailing double slashes are not wrapped in CDATA sections of their own.

What is the problem here? Is this a bug in libxml2? I'm running PHP 5.2.13 on Linux. Thanks.
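To make the DOCTYPE observation above easier to reproduce, here is a minimal sketch that serializes the same markup twice, once with the DOCTYPE and once without, so the two outputs can be compared side by side. The helper name `roundTrip` is hypothetical and not part of the original code, and the exact output depends on the installed libxml2 version, so no expected output is shown.

```php
<?php
// Hypothetical helper (not from the original post): parse an XML string
// with the same DOMDocument settings as above and serialize it again.
function roundTrip($xml)
{
    $dom = new DOMDocument();
    $dom->preserveWhiteSpace = false;
    $dom->formatOutput = false;
    $dom->loadXML($xml);
    return $dom->saveXML();
}

$prolog = "<?xml version=\"1.0\"?>\n";

$doctype = '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
    . '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">' . "\n";

$body = <<<EOL
<html>
    <script type="text/javascript">
    //<![CDATA[
    var a = 123; // JS code
    //]]>
    </script>
</html>
EOL;

// Same markup, with and without the DOCTYPE declaration.
$withDoctype    = roundTrip($prolog . $doctype . $body);
$withoutDoctype = roundTrip($prolog . $body);

echo "--- with DOCTYPE ---\n" . $withDoctype . "\n";
echo "--- without DOCTYPE ---\n" . $withoutDoctype . "\n";
```

If the two script blocks differ, the difference is tied to the DOCTYPE rather than to the CDATA markup itself.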

+1  A: 

I'm running libxml 2.7.3 with PHP 5.2.11 on OS X.

It's not an apples-to-apples comparison, but maybe it will help you.

When I run your code (with the closing PHP tag added), here is my output:

<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"><script type="text/javascript">
    //<![CDATA[
    var a = 123; // JS code
    //]]>
    </script></html>

It appears to render the way you want it to. Maybe the version numbers will help you sort it out... I'm running an older version of PHP 5 (MAMP, incidentally, so I didn't compile it myself).
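If you want to compare version numbers on your side, something like the following sketch should print them. It assumes the libxml extension is loaded (the DOM extension requires it); `LIBXML_DOTTED_VERSION` reports the libxml2 version PHP was built against. The comparison against 2.7.3 is only there because that is the build that worked for me, not because it is a known cutoff.

```php
<?php
// PHP version and the libxml2 version the XML extensions were built against.
$phpVersion    = PHP_VERSION;
$libxmlVersion = LIBXML_DOTTED_VERSION;

echo "PHP:     " . $phpVersion . "\n";
echo "libxml2: " . $libxmlVersion . "\n";

// Assumption: 2.7.3 is simply the version from my working setup,
// used here as a reference point for comparison.
if (version_compare($libxmlVersion, '2.7.3', '<')) {
    echo "libxml2 is older than the 2.7.3 build that produced the output above.\n";
} else {
    echo "libxml2 is 2.7.3 or newer.\n";
}
```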

Hope this helps point you in a direction to find your answer.

Take care!

jfgrissom
Thanks for your comment. My version of libxml is 2.6.16. Maybe this is the problem...
Vincenzo
It is unnecessary and inadvisable to include the closing PHP tag in files that are purely PHP. By including this tag, you are susceptible to already-sent-header errors due to unintended whitespace after any closing tag.
erisco