Hi, I'm trying to use SimpleXML to output a well-formed XHTML document. I'm doing it like this:

$pbDoc = new SimpleXMLElement('<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
    <head>
     <title>'.$myTitle.'</title>
     <!-- Other headers -->
    </head>
</html>');

After I have created the document, I want to output nicely readable code, so I'm using the DOM extension like this:

$dom = new DOMDocument();
$dom->loadXML($pbDoc->asXML());
$dom->formatOutput = true;
echo $dom->saveXML();

Now, there are two strange things that bother me, and I wonder whether this behaviour is normal and how to disable it, if possible.

  1. The presence of the DOCTYPE causes $pbDoc->asXML() to add an unneeded meta tag right after the opening <head> tag:

    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
    

    Why?

  2. For some reason, the DOM extension doesn't indent the code at all for me (though it does so very well with a different document, which is plain XML rather than XHTML).

Could anyone enlighten me about where I may be wrong and how to get rid of these annoyances?

+1  A: 

1. As far as I can tell from searching (and partly guessing), SimpleXML adds that tag automatically because it treats the document as HTML: the root element is <html>.

2. You may want to try setting preserveWhiteSpace to false before loading the document (see http://pt2.php.net/manual/en/domdocument.loadxml.php#73933):

$dom = new DOMDocument();
$dom->preserveWhiteSpace = false;
$dom->loadXML($pbDoc->asXML());
$dom->formatOutput = true;
echo $dom->saveXML();

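3. As for getting rid of the tag, I guess you can do a simple str_replace() on the generated XML. Combined with point 2, your code would become:

<?php

// Define the $pbDoc here

$dom = new DOMDocument();
$dom->preserveWhiteSpace = false; // from point 2; must be set before loadXML()
$dom->loadXML($pbDoc->asXML());
$dom->formatOutput = true;

// Strip the auto-generated tag; the search string must match the output exactly
$meta = '<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />';
echo str_replace($meta, '', $dom->saveXML());

?>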
Pedro Cunha
I think I tried #2, and it didn't help. Perhaps it would help if I removed the <meta> before passing the string to $dom->loadXML(), but I'm not sure. Anyway, (for a different reason) it now seems that SimpleXML is not sufficient for me, so I'm going to rewrite my app to use DOM instead. Thanks for the answer though! :)
Rimas Kudelis
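
For reference, here is a minimal sketch of the whitespace point from #2, assuming a plain XML string without a DOCTYPE (the XHTML case in the question may well behave differently). The helper name prettyPrintXml() is made up for illustration and is not part of any library. The idea is that formatOutput only indents when the parsed tree has no whitespace-only text nodes between elements, so preserveWhiteSpace has to be switched off before loadXML() is called.

```php
<?php

// Hypothetical helper (name is illustrative only): re-serialize an XML
// string with indentation, using nothing but the DOM extension.
function prettyPrintXml(string $xml): string
{
    $dom = new DOMDocument();

    // Must be set BEFORE loading: drops whitespace-only text nodes,
    // which would otherwise stop formatOutput from indenting.
    $dom->preserveWhiteSpace = false;

    if ($dom->loadXML($xml) === false) {
        throw new InvalidArgumentException('Input is not well-formed XML');
    }

    // Can be set after loading; it only affects serialization.
    $dom->formatOutput = true;

    return $dom->saveXML();
}

// Assumed sample input: plain XML, no DOCTYPE.
echo prettyPrintXml("<root>\n<a><b>x</b></a>\n</root>");
```

If preserveWhiteSpace is left at its default of true, the newlines inside <root> are kept as text nodes and the nested elements are typically written back without indentation.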