views: 507
answers: 3

I have done XML parsing before but never on a massive scale. If I'm working with many documents similar to this format:

<?xml version="1.0" ?>
<items comment="something...">
  <uid>6523453</uid>
  <uid>94593453</uid>
</items>

What is the fastest way to parse these documents?
1) XML DOM
2) XML Serialize - Rehydrate to a .NET Object
3) Some other method

UPDATE
I forgot to mention that there would be approx 8000 uid elements on average.

+1  A: 

Using XmlReader is definitely going to be the quickest method, though you'll have to do all of the parsing manually, of course. It reads directly from the stream without caching anything, though it's less convenient to use than the DOM.

Comparing the two you suggested: serialisation ought to be quicker than using the DOM since (I believe) it doesn't cache the entire tree in memory, and it certainly has an easier-to-use interface if serialisation is specifically what you're after.
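
For illustration, a minimal sketch of the XmlReader approach against the sample format in the question (the file name "items.xml" is just a placeholder):

using System;
using System.Collections.Generic;
using System.Xml;

class UidStreamParser
{
    static void Main()
    {
        var uids = new List<long>();
        var settings = new XmlReaderSettings { IgnoreWhitespace = true, IgnoreComments = true };

        // Forward-only, non-caching read: only the current node is ever held in memory.
        using (XmlReader reader = XmlReader.Create("items.xml", settings))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "uid")
                {
                    reader.Read();                       // move onto the text inside <uid>
                    uids.Add(long.Parse(reader.Value));  // e.g. 6523453
                }
            }
        }

        Console.WriteLine("Parsed {0} uid values.", uids.Count);
    }
}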

Noldorin
+1  A: 

I would say that XML serialization would give you the best of both worlds: you get ease of use as well as good speed. There is some additional overhead with XML serialization; however, if you use XmlReader manually, you will likely replicate, if not exceed, that overhead yourself as you use the reader to recreate your object graph.
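
As a rough sketch of that approach (the class shape below is an assumption based on the sample document, and "items.xml" is a placeholder file name):

using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// Maps the sample document: an <items comment="..."> root containing repeated <uid> elements.
[XmlRoot("items")]
public class Items
{
    [XmlAttribute("comment")]
    public string Comment { get; set; }

    [XmlElement("uid")]
    public List<long> Uids { get; set; }
}

class UidSerializationDemo
{
    static void Main()
    {
        var serializer = new XmlSerializer(typeof(Items));

        using (FileStream stream = File.OpenRead("items.xml"))
        {
            // XmlSerializer reads the stream with an XmlReader internally and hands back a populated object graph.
            var items = (Items)serializer.Deserialize(stream);
            Console.WriteLine("Parsed {0} uid values, comment: {1}", items.Uids.Count, items.Comment);
        }
    }
}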

jrista
+1  A: 

Depending on what you need to do with the data, the XmlReader mentioned by @Noldorin is your best bet for streaming-style processing. If you need more ad-hoc access to the data, using XPath over an XPathDocument will be much faster than querying a raw XmlDocument.

http://msdn.microsoft.com/en-us/library/eh3exdc4.aspx
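
A short sketch of that XPath approach against the sample format (again, "items.xml" is only a placeholder):

using System;
using System.Xml.XPath;

class UidXPathQuery
{
    static void Main()
    {
        // XPathDocument builds a read-only, XPath-optimised in-memory view of the document.
        var document = new XPathDocument("items.xml");
        XPathNavigator navigator = document.CreateNavigator();

        // Select every <uid> element under the <items> root.
        XPathNodeIterator uids = navigator.Select("/items/uid");
        while (uids.MoveNext())
        {
            Console.WriteLine(uids.Current.Value);
        }
    }
}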

Paul Alexander
This is the real answer: it depends. It depends on what you want to do with the data. If you're doing "XML" things (like XPath queries, XSL transforms, etc.) then you'll want to stay with the XML APIs (XmlReader or XPathDocument). If you need to manipulate the data as objects, then use serialization.
John Saunders