I'm building a web form in which administrators on my site can add XML to a textarea and submit it to be stored in a database table, but I'm a little confused as to the best method of parsing the XML.

The PHP script needs to parse the XML and, if there are any parse errors, return the error message and the line/column where the parser stopped to the administrator who submitted the form. After parsing, it needs to access the DOM to run several checks for the existence of nodes and attributes using XPath.

If I use *xml_parser_create()* and *xml_parse()*, I can get the detailed error information that I'm after if false is returned. However, I can't access the DOM of the XML after I parse it. If I use DOMDocument::loadXML(), from what I've read, it doesn't throw exceptions for parse errors; it just outputs them to the PHP log.

Would it be a big performance hit if I first tried *xml_parse()* and then, if that succeeds, ran DOMDocument::loadXML(), considering the files are mostly smaller than 10KB, with a few being 10-20KB? Maybe someone knows a better way?

+3  A: 

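You can enable libxml_use_internal_errors() and then, if DOMDocument::loadXML() fails, query the detailed error messages with libxml_get_errors(). Each entry is a LibXMLError object that carries the message along with the line and column of the error.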

<?php
$xml = '<a>
  <b>xyz</b>
  <c>
</a>';

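// Collect libxml errors internally instead of emitting PHP warnings.
libxml_use_internal_errors(true);
$doc = new DOMDocument;
if ( !$doc->loadXML($xml) ) {
  // Array of LibXMLError objects (message, line, column, ...).
  $errors = libxml_get_errors();
  var_dump($errors);
  libxml_clear_errors();
}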
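
To go from the raw var_dump() to something you can show the administrator, and then carry on with XPath checks on the same document, something along these lines might work. This is only a sketch: the helper name parseXmlWithErrors(), the message format, and the /a/b[@id] query are made up for illustration and are not part of the original question.

```php
<?php
// Hypothetical helper: returns [DOMDocument|null, string[]].
// The document is null when loadXML() reports failure.
function parseXmlWithErrors(string $xml): array
{
    // Buffer libxml errors instead of emitting warnings; remember the old setting.
    $previous = libxml_use_internal_errors(true);
    libxml_clear_errors();

    $doc = new DOMDocument();
    $ok = $doc->loadXML($xml);

    // Turn each LibXMLError into a readable "line/column: message" string.
    $messages = [];
    foreach (libxml_get_errors() as $error) {
        $messages[] = sprintf(
            'Line %d, column %d: %s',
            $error->line,
            $error->column,
            trim($error->message)
        );
    }

    // Empty the buffer and restore the previous error-handling mode.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return [$ok ? $doc : null, $messages];
}

// Malformed input: <c> is never closed.
[$badDoc, $badErrors] = parseXmlWithErrors("<a>\n  <b>xyz</b>\n  <c>\n</a>");
foreach ($badErrors as $message) {
    echo $message, "\n";
}

// Well-formed input: the same DOMDocument can be handed straight to XPath.
[$goodDoc, $goodErrors] = parseXmlWithErrors('<a><b id="1">xyz</b></a>');
$xpath = new DOMXPath($goodDoc);
$hasB = $xpath->query('/a/b[@id]')->length > 0; // illustrative node/attribute check
```

This way the XML is parsed only once, so there is no need to run xml_parse() first.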
VolkerK
Thanks. Whenever I searched, I just found people setting their own error handlers and I didn't see a way to get the line/column numbers of the error that way.
Andy E