Hi, a book I'm reading on XML says that to select all nodes in an XML file that have a specific attribute, use the syntax:

//*/@attribute

What I don't understand is why the asterisk is needed. As I understand it, the expression // selects all descendants of the root node. So, wouldn't //@lang, for example, select all descendants of the root node that have an attribute called "lang"? I can't even work out what the asterisk means in the above expression (I know the asterisk in general means "all"). If someone could break it down for me I'd really appreciate it.

Thanks

+4  A: 

Hi, a book I'm reading on XML says that to select all nodes in an XML file that have a specific attribute, use the syntax:

//*/@attribute

That's wrong. It will be expanded to:

/descendant-or-self::node()/child::*/attribute::attribute

Meaning: all attributes named "attribute" of any element that is a child of a node that is either the root node itself or one of its descendants.

You need:

/descendant::*[attribute::attribute]

Or the abbreviated form

//*[@attribute]

About the *: formally it is a name test, not a node type test. In XPath 1.0 there is no element type test; in XPath 2.0 you have element(). So why does * select only elements? By itself, it doesn't: the restriction comes from the axis, because every axis has a principal node type. From http://www.w3.org/TR/xpath/#node-tests :

Every axis has a principal node type. If an axis can contain elements, then the principal node type is element; otherwise, it is the type of the nodes that the axis can contain. Thus,

  • For the attribute axis, the principal node type is attribute.
  • For the namespace axis, the principal node type is namespace.
  • For other axes, the principal node type is element.

That's why *, child::*, self::*, descendant::*, etc. select elements, while @* (or attribute::*) selects attributes and namespace::* selects in-scope namespaces.

About the predicate (the [@attribute] part): this expression is evaluated once for each node selected by the preceding step, and its result is converted to a boolean that decides whether the node is kept. The boolean value of a node-set (which is what attribute::attribute returns) is false for an empty node-set and true otherwise.
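To make the predicate rule concrete, here is a minimal sketch using Python's standard-library xml.etree.ElementTree. The sample document and the variable names are made up for illustration. ElementTree implements only a subset of XPath, but the [@attribute] predicate is part of that subset; the second list spells out the "non-empty node-set means true" rule by hand.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, not taken from the question.
doc = ET.fromstring(
    '<library>'
    '<book lang="en"><title lang="en">A</title></book>'
    '<book><title lang="fr">B</title></book>'
    '</library>'
)

# ElementTree's limited XPath: the path is relative to <library>,
# so ".//*" walks the descendant elements of <library>.
with_lang = doc.findall(".//*[@lang]")
tags = [e.tag for e in with_lang]

# The same filter written out by hand: keep an element when its
# set of "lang" attributes is non-empty (i.e. the predicate is true).
manual = [e.tag for e in doc.iter() if "lang" in e.attrib]

print(tags)    # ['book', 'title', 'title']
print(manual)  # ['book', 'title', 'title']
```

Note that the result is a list of elements, not attributes: the predicate only filters the elements chosen by the step it is attached to.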

Alejandro
@Alejandro. There is *nothing* wrong with `//*/@x` except that it is longer than it could be.
Dimitre Novatchev
+1  A: 

The title of this question is:

XPath expression for selecting all nodes with a common attribute

However, nowhere does the text of the question discuss how to find all nodes that have a common attribute, so the title may be incorrect.

To find all nodes that have a common attribute named x (BTW, only element-nodes can have attributes), use:

//*[@x]

Use:

//@x

to select all attributes named x in the XML document. This is probably the shortest expression to do so.

There is nothing wrong with:

//*/@x

except that it is slightly longer.

It is a shorthand for:

/descendant-or-self::node()/child::*/attribute::x

and also selects all x attributes in the XML document.

Someone may think that this expression doesn't select the x attribute of the top element in the document. This is a wrong conclusion, because the first location step:

/descendant-or-self::node()

selects every node in the document, including the root (/) itself.

This means that:

/descendant-or-self::node()/child::*

selects every element, including the top element (which is the only element child of the root node in a well-formed XML document).

So, when the last location step /@x is finally added, this will select all the x attributes of all nodes selected so far by the first two location steps -- that is all x attributes of all element-nodes in the XML document.
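The three location steps can be traced by hand in Python. This is only a sketch under some assumptions: the sample document is hypothetical, the standard-library ElementTree has no object for the root node (/), so None stands in for it, and non-element nodes are left out because they never have element children. ElementTree also cannot return attribute nodes, so the attribute values are collected instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the top element itself carries an x attribute.
root = ET.fromstring('<top x="1"><a x="2"><b/></a><c x="3"/></top>')

# Step 1: /descendant-or-self::node()
# None is a stand-in for the root node (/), followed by every element.
step1 = [None] + list(root.iter())

# Step 2: child::*
# The element child of the root node is the top element; for an
# element, iterating over it yields its element children.
step2 = []
for node in step1:
    children = [root] if node is None else list(node)
    step2.extend(children)

# Step 3: attribute::x
# Collect the value of x from every element that has one.
step3 = [el.get("x") for el in step2 if "x" in el.attrib]

print([el.tag for el in step2])  # ['top', 'a', 'c', 'b']
print(step3)                     # ['1', '2', '3']
```

The top element appears in step 2 because it is a child of the root node, which is why its x attribute is included in the final result.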

Dimitre Novatchev
+1 Thanks for the clarification. I always enjoy reading your answers.
Garett