I want to extract all comments below a specific node within an XML document, using PHP. I have tried both the SimpleXML and DOMDocument methods, but I keep getting blank outputs. Is there a way to retrieve comments from within a document without having to resort to Regex?

A: 

If you are using an event-driven SAX parser, the parser should have an event for comments. For example, with Expat you would implement a handler and register it using:

void XMLCALL
XML_SetCommentHandler(XML_Parser p,
                      XML_CommentHandler cmnt);
anon
Sometimes this isn't obvious. Java's SAX DefaultHandler doesn't provide a callback for comments; you have to implement an *additional* interface called LexicalHandler (org.xml.sax.ext.LexicalHandler). So comment callbacks don't happen by default. (I don't know whether other languages/toolsets work like this.)
Brian Agnew
+1  A: 

Do you have access to an XPath API? XPath lets you find comments using, e.g.:

//comment()
Brian Agnew
A: 

SimpleXML cannot handle comments, but the DOM extension can. Here's how you can extract all the comments. You just have to adapt the XPath expression to target the node you want.

$doc = new DOMDocument;
$doc->loadXML(
    '<doc>
        <node><!-- First node --></node>
        <node><!-- Second node --></node>
    </doc>'
);

$xpath = new DOMXPath($doc);

foreach ($xpath->query('//comment()') as $comment)
{
    var_dump($comment->textContent);
}
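
To limit the result to comments below one specific node, one option is to locate that node first and then run a relative query with it as the context node. The following is only a sketch: the id attribute, its values, and the extra nested comment are made up for illustration and are not part of the original document.

```php
<?php
// Hypothetical sample document; the id attributes are assumptions added for this example.
$xml = <<<XML
<doc>
    <!-- Top-level comment -->
    <node id="a"><!-- First node --></node>
    <node id="b">
        <!-- Second node -->
        <child><!-- Nested comment --></child>
    </node>
</doc>
XML;

$doc = new DOMDocument;
$doc->loadXML($xml);
$xpath = new DOMXPath($doc);

// Locate the node of interest first.
$target = $xpath->query('//node[@id="b"]')->item(0);

$comments = [];
if ($target !== null) {
    // The leading dot makes the query relative to $target,
    // so only comments among its descendants are returned.
    foreach ($xpath->query('.//comment()', $target) as $comment) {
        $comments[] = trim($comment->textContent);
    }
}

print_r($comments);
```

The same result should be reachable with the single expression //node[@id="b"]//comment(); passing a context node is mostly convenient when you already hold a reference to the element.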
Josh Davis
This totally worked! The trick was the textContent property; not using it was why I had been getting blank outputs. Thanks Josh. You rock.
Olaseni