Suppose I have a document containing multiple elements with the same name. How would I retrieve, for instance, the second such element?

<doc>
...
 <element name="same">foo</element>
...
 <element name="same">bar</element>
...
 <element name="same">baz</element>
...
</doc>

I'd expect something like //element[@name='same'][2] to work.

Additionally, how would I find the second-from-last element in XPath when there is a variable number of elements with the same name?

+7  A: 

[] has higher priority than // (and "//" is actually only an abbreviation, not an operator). This follows from the XPath 1.0 Spec, which says:

"// is short for /descendant-or-self::node()/"

and later:

"NOTE: The location path //para[1] does not mean the same as the location path /descendant::para[1]. The latter selects the first descendant para element; the former selects all descendant para elements that are the first para children of their parents."

Therefore, the XPath expression:

     //element[@name='same'][2]

means:

Select every element in the document that is named "element", has an attribute "name" with the value "same", and is the second such child of its parent.

What you want is:

     (//element[@name='same'])[2]

Note the parentheses, which override the higher precedence of []: the whole node-set is built first, and only then is the position predicate applied to it.

Similarly, the last but one such node is selected by the following XPath expression:

     (//element[@name='same'])[last()-1]
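
To make the difference concrete, here is a minimal Python sketch that emulates both readings with plain list operations. It is only an illustration: the sample document with its two <group> wrappers is hypothetical (the question's document has no such nesting), and the position logic is written out by hand in Python rather than evaluated by an XPath engine, because the standard library's xml.etree.ElementTree supports only a limited XPath subset.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: two <group> parents, so that per-parent and
# document-wide positions give different answers.
XML = """
<doc>
  <group>
    <element name="same">foo</element>
    <element name="same">bar</element>
  </group>
  <group>
    <element name="same">baz</element>
    <element name="same">qux</element>
    <element name="other">skip</element>
  </group>
</doc>
"""

root = ET.fromstring(XML)


def is_match(node):
    """True for <element name="same"> nodes."""
    return node.tag == "element" and node.get("name") == "same"


# Emulates //element[@name='same'][2]:
# the position is counted among the matching children of EACH parent.
per_parent = []
for parent in root.iter():
    matches = [child for child in parent if is_match(child)]
    if len(matches) >= 2:
        per_parent.append(matches[1].text)

# Emulates (//element[@name='same'])[2] and (...)[last()-1]:
# collect all matches in document order first, then index the whole list.
all_matches = [node for node in root.iter() if is_match(node)]
second = all_matches[1].text
second_from_last = all_matches[-2].text

print(per_parent)        # one hit per parent that has a second match
print(second)            # a single node, counted across the whole document
print(second_from_last)
```

With this sample, the per-parent reading yields one node for each <group>, while the parenthesized reading yields exactly one node for the whole document. Note that the list indexing raises IndexError when fewer than two matches exist, whereas the XPath expression would simply return an empty node-set.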

Finally, a necessary warning: The use of the "//" abbreviation is very expensive as it causes the whole (sub)tree to be traversed. Whenever the structure of the document is known, it is recommended to use more specific constructs (location paths).

Dimitre Novatchev
+1, thanks for the helpful explanation of why `//para[1]`, i.e. `/descendant-or-self::node()/para[1]` is different from `/descendant::para[1]`. Re: "The use of the "//" abbreviation is very expensive" - I would say "*can be* very expensive, but it depends greatly on the processor." Saxon, for example, is very clever about optimization, and AFAIU often builds keys automatically in order to make "//" expressions speedy.
LarsH
@LarsH: Yes, you are right about Saxon's optimizations, but even these depend on the size of the document -- I don't believe keeping the automatic key index is done with huge documents. Anyway, it is good for people to know this as a rule of thumb and not to rely on any optimizer.
Dimitre Novatchev
@Dimitre: I agree, it's a good rule of thumb to avoid unnecessary "//" when a more specific pattern or expression can be used.
LarsH
A: 

Thanks, Dimitre Novatchev.

wieker