Hi, for a certain project I need a way to parse XML and extract data from it. So I wonder: which of the built-in parsers is the fastest?

Also, it would be nice if the parser could accept an XML string as input. I have my own thread-safe implementation for working with files, and I don't want some non-thread-safe library to make my efforts useless.

+1  A: 

There aren't really many XML parsers in PHP.

The most efficient will be the ones that ship with PHP. Write a benchmark with DOM and SimpleXML and check which performs better.
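
A minimal sketch of such a benchmark, assuming a small synthetic document built in memory. The `<items>`/`<item>` layout, the `benchmark()` helper, and the run count are made up for illustration; real measurements should use data shaped like your own.

```php
<?php
// Hypothetical test document: 1000 <item> elements built in memory.
$xml = '<items>';
for ($i = 0; $i < 1000; $i++) {
    $xml .= '<item id="' . $i . '">value ' . $i . '</item>';
}
$xml .= '</items>';

// Hypothetical helper: run $parse on $xml $runs times, return elapsed seconds.
function benchmark(callable $parse, string $xml, int $runs = 100): float
{
    $start = microtime(true);
    for ($i = 0; $i < $runs; $i++) {
        $parse($xml);
    }
    return microtime(true) - $start;
}

// Parse with DOM and count the <item> elements.
$countWithDom = function (string $xml): int {
    $doc = new DOMDocument();
    $doc->loadXML($xml);
    return $doc->getElementsByTagName('item')->length;
};

// Parse with SimpleXML and count the <item> children of the root.
$countWithSimpleXml = function (string $xml): int {
    $root = simplexml_load_string($xml);
    return count($root->item);
};

printf("DOM:       %.4f s\n", benchmark($countWithDom, $xml));
printf("SimpleXML: %.4f s\n", benchmark($countWithSimpleXml, $xml));
```

Both callbacks do the same work (parse, then count elements), so the timings compare like with like. The numbers will vary with PHP version, libxml version, and document shape.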

Tobias P.
Don't just benchmark: benchmark and publish your test data, test methods, and results!
Charles
+2  A: 

The fastest parser will be SAX: it doesn't have to build a DOM, and it can work on partial XML, or progressively. Info on the PHP SAX parser (Expat) can be found here. Alternatively, there is a libxml-based DOM parser named SimpleXML. A DOM-based parser will be easier to work with, but it is typically a few orders of magnitude slower.
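
To illustrate the progressive parsing mentioned above, here is a rough sketch using PHP's `xml_parser_create()` family. The `<posts>`/`<post>` document and the way it is split into chunks are invented for the example; the handlers simply collect the text of each `<post>`.

```php
<?php
// Hypothetical input, split into chunks to mimic data arriving progressively.
$chunks = [
    '<posts><post id="1">Fir',
    'st</post><post id="2">Second</post>',
    '</posts>',
];

$titles  = [];    // text of every <post> seen so far
$current = null;  // text buffer while inside a <post>, null otherwise

$parser = xml_parser_create();
// Keep element names as written instead of upper-casing them.
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);

xml_set_element_handler(
    $parser,
    // Start tag: open a fresh buffer when a <post> begins.
    function ($parser, string $name, array $attrs) use (&$current) {
        if ($name === 'post') {
            $current = '';
        }
    },
    // End tag: store the buffered text when a <post> ends.
    function ($parser, string $name) use (&$titles, &$current) {
        if ($name === 'post') {
            $titles[] = $current;
            $current  = null;
        }
    }
);

// Text may be delivered in several pieces, so append rather than assign.
xml_set_character_data_handler(
    $parser,
    function ($parser, string $data) use (&$current) {
        if ($current !== null) {
            $current .= $data;
        }
    }
);

foreach ($chunks as $i => $chunk) {
    $isFinal = ($i === count($chunks) - 1);
    if (xml_parse($parser, $chunk, $isFinal) !== 1) {
        throw new RuntimeException(xml_error_string(xml_get_error_code($parser)));
    }
}

print_r($titles);
```

No tree is ever built: only the current buffer and the collected results stay in memory, which is where the speed and memory advantage comes from. The price is that you track state yourself.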

Evan Carroll
+1  A: 

Each XML extension has its own strengths and weaknesses. For example, I have a script that parses the XML data dump from Stack Overflow. The posts.xml file is 2.8GB! For this large XML file, I had to use XMLReader because it reads XML in a streaming mode, instead of trying to load and represent the whole XML document in memory at once, as the DOM extension does.
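
A small sketch of the streaming approach with XMLReader. The `<row>` elements and the `Score` attribute are stand-ins for illustration, not a statement about the real file's schema, and the input is an in-memory string rather than the actual 2.8GB file.

```php
<?php
// Hypothetical sample; a real run would point the reader at a file instead.
$xml = '<posts><row Id="1" Score="5"/><row Id="2" Score="7"/></posts>';

$reader = new XMLReader();
$reader->XML($xml);  // for a file on disk: $reader->open($path)

$total = 0;
// read() advances one node at a time; only the current node is held in memory.
while ($reader->read()) {
    if ($reader->nodeType === XMLReader::ELEMENT && $reader->name === 'row') {
        $total += (int) $reader->getAttribute('Score');
    }
}
$reader->close();

echo "Total score: $total\n";
```

Because the loop only ever looks at the current node, memory use stays roughly constant regardless of document size.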

So you need to be more specific about describing how you are going to use the XML, in order to decide which PHP extension to use.

All of PHP's XML extensions provide some method to read XML data as a string.
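
For instance, SimpleXML and DOM can both be fed a string directly, as in this minimal sketch (the `<root>`/`<name>` document is made up). The SAX functions (`xml_parse()`) and XMLReader (`XMLReader::XML()`) accept string input as well.

```php
<?php
// Hypothetical XML held in a string, e.g. read by your own thread-safe file code.
$xml = '<root><name>example</name></root>';

// SimpleXML: parse the string and read a child element's text.
$sx = simplexml_load_string($xml);
$fromSimpleXml = (string) $sx->name;

// DOM: loadXML() takes the document as a string.
$doc = new DOMDocument();
$doc->loadXML($xml);
$fromDom = $doc->getElementsByTagName('name')->item(0)->textContent;

echo $fromSimpleXml, "\n", $fromDom, "\n";
```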

Bill Karwin