I need to parse an XML string and find the values of specific text nodes, attribute values, etc. I'm doing this in JavaScript and was using the DOMParser class for it. Later I was told that DOM takes up a lot of memory and that SAX is a better option.

Recently I found that XPath too provides a simple way to find nodes.

But I'm not sure which of these three would be the most efficient way to parse XML. Kindly help.

A: 

If you only need to find the values of specific text nodes, then use XPath. DOM takes up a lot of memory because it reads in the whole XML document and forms a tree for it. SAX is event-based: it reports elements as it reads them, and you have to keep track of where you are yourself. Hence, based on what you have described, XPath best suits your scenario.
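To make the memory point concrete, here is a minimal DOM sketch. It is written in Python rather than JavaScript only because Python's standard library includes a DOM implementation (`xml.dom.minidom`) that runs outside a browser. The sample document and the `read_items` helper are made up for illustration.

```python
from xml.dom.minidom import parseString

# Hypothetical sample document, invented for this sketch.
XML = (
    '<root>'
    '<item id="42"><name>widget</name></item>'
    '<item id="43"><name>gadget</name></item>'
    '</root>'
)


def read_items(xml_text):
    """Return (id attribute, name text) pairs using the DOM API."""
    # parseString builds the complete node tree before it returns,
    # which is where the memory cost of DOM comes from.
    doc = parseString(xml_text)
    items = []
    for item in doc.getElementsByTagName("item"):
        name_node = item.getElementsByTagName("name")[0]
        # The text of <name> is stored in a child text node.
        items.append((item.getAttribute("id"), name_node.firstChild.data))
    return items


items = read_items(XML)
```

The method names (`getElementsByTagName`, `getAttribute`, `firstChild`) are the standard DOM ones, so the same calls should look familiar on a document returned by DOMParser.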

StartClass0830
+13  A: 

SAX is a top-down parser that allows serial access to an XML document, and it works well for read-only access. DOM, on the other hand, is more robust: it reads the entire XML document into a tree, and is very efficient when you want to alter, add, or remove data in that XML tree. XPath is useful when you only need a couple of values from the XML document and you know where to find them (you know the path of the data, /root/item/challange/text).

SAX: Time efficient when iterating through the document; each traversal is a single forward pass

DOM: Flexibility and performance when modifying the document; gives you more ways to work with your data

XPath: Time efficient when you only need to read a couple of values
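As a rough illustration of the serial, event-based style of SAX, the sketch below uses Python's standard `xml.sax` module. The sample document, the `NameCollector` class, and the `collect` helper are hypothetical names made up for this example.

```python
import xml.sax

# Hypothetical sample document, invented for this sketch.
XML = (
    '<root>'
    '<item id="42"><name>widget</name></item>'
    '<item id="43"><name>gadget</name></item>'
    '</root>'
)


class NameCollector(xml.sax.ContentHandler):
    """Collects item ids and <name> text during one forward pass."""

    def __init__(self):
        super().__init__()
        self.ids = []
        self.names = []
        self._in_name = False
        self._buffer = []

    def startElement(self, name, attrs):
        if name == "item":
            self.ids.append(attrs.getValue("id"))
        elif name == "name":
            # The handler has to remember its own position in the document.
            self._in_name = True
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so buffer it.
        if self._in_name:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "name":
            self.names.append("".join(self._buffer))
            self._in_name = False


def collect(xml_text):
    handler = NameCollector()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.ids, handler.names


ids, names = collect(XML)
```

No tree is built here; the price is the bookkeeping in the handler (`_in_name`, `_buffer`), which grows quickly as the values you want sit deeper in the document.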

Björn
+5  A: 

Unless you're using the research prototype of streaming XPath, it is very likely that your XPath engine is loading everything into memory, so it will have similar characteristics to DOM. So it rather depends on your definition of 'efficiency'. It's certainly easier to use, and the XPath implementations could change to be more efficient, whereas DOM will always have some representation of the whole document on the client machine, and SAX will always be a lot more awkward to program than XPath.
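A small sketch of that point, using Python's `xml.etree.ElementTree`, which supports only a limited subset of XPath: the query runs against a tree that has already been parsed into memory in full. The sample document is made up, and its element names follow the /root/item/challange/text path mentioned in another answer.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for this sketch.
XML = (
    "<root>"
    "<item><challange><text>first</text></challange></item>"
    "<item><challange><text>second</text></challange></item>"
    "</root>"
)

# Step 1: the whole document is parsed into an in-memory tree.
root = ET.fromstring(XML)

# Every element is already materialised before any query runs.
element_count = sum(1 for _ in root.iter())

# Step 2: the path expression is evaluated against that tree.
# The path is relative to the root element, hence "./item/...".
texts = [node.text for node in root.findall("./item/challange/text")]
```

The query itself is one line, which is the ease-of-use argument, but the memory footprint is that of the tree built in step 1.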

Pete Kirkham
I find it odd that the other answers don't mention your point, since XPath still has to parse the document in some way. DOM, SAX and XPath are different APIs for accessing a document; but only DOM and SAX are parsers of a document. Unless some #C does a parser for XPath that we don't know about?
13ren
BTW: your linked XSQ uses SAX for parsing underneath - it doesn't have a specific XPath parser.
13ren
Yes, it's a layer above a streaming parser rather than an object model.
Pete Kirkham
+1  A: 

This document from MSDN provides a wealth of information about optimizing XML processing.

In particular, the XPathDocument class is designed to be more efficient for evaluating XPath expressions than the (DOM-based) XmlDocument class. The reason is that XPathDocument is a read-only representation of an XML document, while a DOM implementation also has to support changing the document.

Using DOM has another, no less important downside: it typically results in complicated, spaghetti-like code that is difficult to understand and maintain.

Dimitre Novatchev