I have XML documents like:

<rootelement>
<myelement>test1</myelement>
<myelement>test2</myelement>
<myelement type='specific'>test3</myelement>
</rootelement>

I'd like to retrieve the specific myelement, and if it's not present, then the first one. So I write:

/rootelement/myelement[@type='specific' or position()=1]

The XPath spec states about the 'or expression' that:

The right operand is not evaluated if the left operand evaluates to true

The problem is that libxml2-2.6.26 seems to apply the union of both expressions, returning a "2 Node Set" (for example using xmllint --shell).

Is this a problem in libxml2, or am I doing something wrong?

+2  A: 

Short answer: your selector doesn't express what you think it does.


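The or operator is a boolean test that is applied to each candidate node separately, so the overall result is effectively a union: every node that satisfies either condition is selected.

The part of the spec you quoted ("The right operand is not evaluated...") only describes standard boolean short-circuiting within the test for a single node.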

Here's why you get a 2-node set for your example input: XPath looks at every myelement that's a child of rootelement, and applies the [@type='specific' or position()=1] part to each such node to determine whether or not it matches the selector.

  1. <myelement>test1</myelement> does not match @type='specific', but it does match position()=1, so it matches the whole selector.
  2. <myelement>test2</myelement> does not match @type='specific', and it also does not match position()=1, so it does not match the whole selector.
  3. <myelement type='specific'>test3</myelement> matches @type='specific' (so XPath does not have to test its position - that's the short-circuiting part) so it matches the whole selector.

The first and last <myelement>s match the whole selector, so it returns a 2-node set.
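
To make the per-node evaluation concrete, here is a minimal sketch in Python that emulates the predicate by hand. The choice of Python and the helper name predicate_matches are assumptions for illustration only (the question itself used xmllint). The standard library's ElementTree does not support or inside a predicate, so the test is written out as a plain Python condition:

```python
import xml.etree.ElementTree as ET

DOC = """<rootelement>
<myelement>test1</myelement>
<myelement>test2</myelement>
<myelement type='specific'>test3</myelement>
</rootelement>"""


def predicate_matches(root):
    """Emulate myelement[@type='specific' or position()=1] node by node.

    Hypothetical helper, not part of any library.
    """
    matches = []
    # position() is 1-based among the myelement children of the context node.
    for position, el in enumerate(root.findall("myelement"), start=1):
        # Python's `or` short-circuits just like XPath's: the position is
        # only compared when the attribute test is false for THIS node.
        if el.get("type") == "specific" or position == 1:
            matches.append(el.text)
    return matches


root = ET.fromstring(DOC)
print(predicate_matches(root))  # ['test1', 'test3']
```

The short-circuit only saves work for a single node; it never causes other nodes to be skipped, which is why two nodes come back.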

The easiest way to select elements the way you want to is to do it in two steps. Here's the pseudocode (I don't know what context you're actually using XPath in, and I'm not that familiar with writing XPath-syntax selectors):

  1. Select elements that match /rootelement/myelement[@type='specific']
  2. If elements is empty, select elements that match /rootelement/myelement[position()=1]
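
The two steps above could be sketched as follows, assuming Python's standard xml.etree.ElementTree (an assumption, since the question does not say which host language is in use). The function name select_specific_or_first is hypothetical. ElementTree paths are relative to the element they are called on, so the leading /rootelement is dropped:

```python
import xml.etree.ElementTree as ET


def select_specific_or_first(root):
    """Return the 'specific' myelement children, else the first myelement.

    Hypothetical helper illustrating the two-step selection.
    """
    # Step 1: try the attribute-based selection.
    elements = root.findall("myelement[@type='specific']")
    # Step 2: fall back to the first myelement only if step 1 found nothing.
    if not elements:
        elements = root.findall("myelement[1]")
    return elements


with_specific = ET.fromstring(
    "<rootelement>"
    "<myelement>test1</myelement>"
    "<myelement>test2</myelement>"
    "<myelement type='specific'>test3</myelement>"
    "</rootelement>"
)
without_specific = ET.fromstring(
    "<rootelement>"
    "<myelement>test1</myelement>"
    "<myelement>test2</myelement>"
    "</rootelement>"
)

print([el.text for el in select_specific_or_first(with_specific)])     # ['test3']
print([el.text for el in select_specific_or_first(without_specific)])  # ['test1']
```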
Matt Ball
@Bears will eat you: +1 for very good explanation!
Alejandro
Thanks!
Matt Ball
+1  A: 

@Bears-will-eat-you explained very well the cause of your problem.

Here is an XPath one-liner selecting exactly what you want:

/*/myelement[@type='specific'] | /*[not(myelement[@type='specific'])]/myelement[1] 
Dimitre Novatchev
Dimitre, there seems to be a missing closing ')' in your expression. Your expression selects exactly the same nodes as mine, namely test1 and test3, which is not my intent. BTW, I also tried (/rootelement/myelement[@type='specific' or position()=1])[1], which gives me test1... no better.
foudil
@foudfou: thanks for noticing this. I fixed it now, so try once more. :)
Dimitre Novatchev