Generally speaking, if you're only going to be using the source document's tree once, you're not going to gain much of anything by deserializing it into some specialized object model. The cost of admission - parsing the XML - is likely to dwarf the cost of using it, and any increase in performance that you get from representing the parsed XML in something more efficient than an XML node tree is going to be marginal.
If you're using the data in the source document over and over again, though, it can make a lot of sense to parse that data into some more efficiently accessible structure. This is why XSLT has the xsl:key element and key() function: looking an XML node up in a hash table can be so much faster than performing a linear search on a list of XML nodes that it was worth putting the capability into the language.
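As an illustration, a key-based lookup in XSLT might look something like this (the element and attribute names here are invented for the example):

```xml
<!-- Build an index of every Thing element, keyed by its id attribute. -->
<xsl:key name="thing-by-id" match="Thing" use="@id"/>

<!-- Look a Thing up through the index, instead of scanning the whole
     document with something like //Thing[@id = current()/@idref]. -->
<xsl:template match="Ref">
  <xsl:value-of select="key('thing-by-id', @idref)/Name"/>
</xsl:template>
```

The first form pays the indexing cost once, when the key is first used; every subsequent lookup is a hash-table probe rather than a document scan.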
To address your specific example: iterating over a List&lt;Thing&gt; is going to perform at the same speed as iterating over a List&lt;XmlNode&gt;. What will make the XSLT slower is not the iteration. It's the searching, and what you do with the found nodes. Executing the XPath query Things/Thing iterates through the child elements of the current node, does a string comparison to check each element's name, and if the element matches, it iterates through that element's child nodes and does another string comparison for each. (Actually, I don't know for a fact that it's doing a string comparison. For all I know, the XSLT processor has hashed the names in the source document and the XPath and is doing integer comparisons of hash values.) That's the expensive part of the operation, not the actual iteration over the resulting node set.
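A rough sketch, in C#, of the kind of work that evaluating Things/Thing implies (this is a simplification for illustration, not how any particular XSLT processor is actually implemented):

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

static class XPathSketch
{
    // Roughly what "Things/Thing" costs: for each location step, walk the
    // child list and compare each element's name against the step's name.
    public static List<XmlElement> SelectThings(XmlElement current)
    {
        var results = new List<XmlElement>();
        foreach (XmlNode c1 in current.ChildNodes)
        {
            if (c1.NodeType == XmlNodeType.Element && c1.Name == "Things")
            {
                foreach (XmlNode c2 in c1.ChildNodes)
                {
                    if (c2.NodeType == XmlNodeType.Element && c2.Name == "Thing")
                        results.Add((XmlElement)c2);
                }
            }
        }
        return results;
    }
}
```

All of that name-checking happens before you ever touch the resulting node set; iterating the finished list afterwards is the cheap part.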
Additionally, most anything that you do with the resulting nodes in XSLT is going to involve linear searches through a node set. Accessing an object's property in C# doesn't. Accessing MyThing.MyProperty is going to be faster than getting at it via &lt;xsl:value-of select='MyProperty'/&gt;.
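Here's a minimal sketch of those two access styles side by side (the MyThing class is invented for illustration, echoing the names from the question):

```csharp
using System;
using System.Xml;

class MyThing
{
    public string MyProperty { get; set; }
}

static class Access
{
    // XSLT-style access: evaluate an XPath expression, which has to
    // search the element's child list for a node named "MyProperty".
    public static string ViaXPath(XmlDocument doc) =>
        doc.DocumentElement.SelectSingleNode("MyProperty").InnerText;

    // Deserialized access: a plain property getter. No searching at all.
    public static string ViaProperty(MyThing thing) => thing.MyProperty;
}
```

Both return the same value; the difference is that the first one does a lookup every time you call it, while the second is effectively a field read.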
Generally, that doesn't matter, because parsing XML is expensive whether you deserialize it into a custom object model or an XmlDocument. But there's another case in which it may be relevant: if the source document is very large, and you only need a small part of it.
When you use XSLT, you essentially traverse the XML twice. First you create the source document tree in memory, and then the transform processes this tree. If you have to execute some kind of nasty XPath like //*[@some-attribute='some-value'] to find 200 elements in a million-element document, you're basically visiting each of those million nodes twice.
That's a scenario where it can be worth using an XmlReader instead of XSLT (or before you use XSLT). You can implement a method that traverses the stream of XML and tests each element to see if it's of interest, creating a source tree that contains only your interesting nodes. Or, if you want to get really crazy, you can implement a subclass of XmlReader that skips over uninteresting nodes, and pass that as the input to XslCompiledTransform.Transform(). (I suspect, though, that if you knew enough about how XmlReader works to subclass it, you probably wouldn't have needed to ask this question in the first place.) This approach allows you to visit 1,000,200 nodes instead of 2,000,000. It's also a king-hell pain in the ass, but sometimes art demands sacrifice from the artist.
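A sketch of the first, less crazy approach: stream the document once with an XmlReader and build a small source tree containing only the interesting elements (the attribute test mirrors the nasty XPath above; the method and wrapper-element names are made up):

```csharp
using System;
using System.Xml;

static class Pruner
{
    // Read the large document once, keeping only the elements that pass
    // the test. The resulting small XmlDocument is what you hand to XSLT.
    public static XmlDocument ExtractInteresting(XmlReader reader)
    {
        var doc = new XmlDocument();
        var root = doc.CreateElement("interesting");
        doc.AppendChild(root);

        while (reader.Read())
        {
            if (reader.NodeType == XmlNodeType.Element &&
                reader.GetAttribute("some-attribute") == "some-value")
            {
                // ReadSubtree() exposes just this element's subtree;
                // loading and importing it copies the whole element
                // (note: matches nested inside it won't be seen again).
                using (var sub = reader.ReadSubtree())
                {
                    var fragment = new XmlDocument();
                    fragment.Load(sub);
                    root.AppendChild(doc.ImportNode(fragment.DocumentElement, true));
                }
            }
        }
        return doc;
    }
}
```

The million-element document is visited exactly once, in streaming mode, and the transform then only ever sees the couple of hundred nodes you actually care about.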