views:

93

answers:

2

I am testing various methods to read (possibly large, and very often) XML configuration files in PHP. No writing is ever needed. I have two successful implementations, one using SimpleXML (which I know is a DOM parser) and one using XMLReader.

I know that a DOM reader must read the whole tree and therefore uses more memory. My tests reflect that. I also know that A SAX parser is an "event-based" parser that uses less memory because it reads each node from the stream without checking what is next.

XMLReader also reads from a stream with the cursor providing data about the node it is currently at. So, it definitely sounds like XMLReader (http://us2.php.net/xmlreader) is not a DOM parser, but my question is, is it a SAX parser, or something else? It seems like XMLReader behaves the way a SAX parser does but does not throw the events themselves (in other words, can you construct a SAX parser with XMLReader?)

If it is something else, does the classification it's in have a name?

+3  A: 

XMLReader calls itself a "pull parser."

The XMLReader extension is an XML Pull parser. The reader acts as a cursor going forward on the document stream and stopping at each node on the way.

It later goes on to say it uses libxml.

This page on Java XML Pull Parsing may be of some possible interest. If XMLReader is related to this project's goals and intent, then the answer to your question falls squarely into the "neither" category.

Charles
+3  A: 

A SAX parser is a parser which implements the SAX API. That is: a given parser is a SAX parser if and only if you can code against it using the SAX API. Same for a DOM parser: this classification is purely about the API it supports, not how that API is implemented. Thus a SAX parser might very well be a DOM parser, too; and hence you cannot be so sure about using less memory or other characteristics.

However to get to the real question: XMLReader seems the better choice because since it is a pull parser you request the data you want quite specifically and therefore there should be less overhead involved.