Given the XML:
<PSG>
  <C id="1">
    <N id="2" >
      <A id="3" />
      <D id="4">
        <PP />
      </D>
      <V id="5" >
        <Tn />
        <P />
      </V>
      <N id="6" >
        <D id="7" />
        <D id="8" />
      </N>
      <W id="9" />
    </N>
  </C>
</PSG>
I am using XPathNodeIterators and an XPathNavigator to run thousands of dynamic XPath statements. A simplified example might be:
//*[@id]
The thing is that I will only ever need the first match for each of these XPath statements. I'm wondering whether it would help performance to go to the trouble of adding position predicates to these thousand or so queries (largely by hand), so that they resemble the following:
(//*[@id])[1]
I'm still fuzzy on exactly how position predicates are applied. I would have thought that //*[@id][1]
would stop at the first match in the document, but it returns the first matching child of each parent, in document order: the elements with IDs 1, 2, 3 and 7. Basically, I don't know enough about the .NET XPath internals to determine whether the extra work would pay off, and it's bugging me that I still don't understand exactly why including the position predicate would or wouldn't improve performance.
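To make the difference between the two expressions concrete, here is a minimal Python sketch that emulates them by walking the tree by hand. It is not an XPath engine and says nothing about .NET performance; the function names first_per_parent and first_overall are made up for illustration. It relies on // being shorthand for /descendant-or-self::node()/, which is why the [1] in //*[@id][1] is counted among each parent's children rather than across the whole document.

```python
import xml.etree.ElementTree as ET

XML = """<PSG>
  <C id="1">
    <N id="2">
      <A id="3" />
      <D id="4"><PP /></D>
      <V id="5"><Tn /><P /></V>
      <N id="6"><D id="7" /><D id="8" /></N>
      <W id="9" />
    </N>
  </C>
</PSG>"""


def first_per_parent(root):
    """Emulate //*[@id][1]: each node's first child element that has an id."""
    picked = set()
    if root.get("id") is not None:
        picked.add(root)  # the root is the document node's only child element
    for parent in root.iter():
        for child in parent:
            if child.get("id") is not None:
                picked.add(child)
                break  # [1] is counted per parent, so stop at the first hit
    # Report in document order, the order in which the matches would be iterated.
    return [e.get("id") for e in root.iter() if e in picked]


def first_overall(root):
    """Emulate (//*[@id])[1]: the first element in the document with an id."""
    for elem in root.iter():
        if elem.get("id") is not None:
            return elem.get("id")
    return None


root = ET.fromstring(XML)
print(first_per_parent(root))  # ['1', '2', '3', '7']
print(first_overall(root))     # 1
```

The first function visits every parent and keeps one child from each, which matches the four elements I'm seeing; the second stops at the very first hit, which is what I actually want.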
So the question is, "Will using position predicates in this way improve the performance of the query?" I was told that it would, but it seems like all the elements matching the path in parentheses would be searched anyway, and then all but the first match would be discarded.
Edit: Despite some well-intentioned advice that position predicates would improve my XPath performance, I measured no difference between an XPath with the position predicate and one without when using the XPathNodeIterator. Each took the same time given a huge XML document and a million tries each. I suppose that's the point of the iterator. I hope this helps someone.