I have the following XPath expression:

link[@rel='alternate' and @type='text/html' or not(@rel)]/@href | link/text()

What does it mean?

Actually, I don't understand the symbol |.

A: 

The pipe (|) in XPath combines the results of two expressions. So this returns the href attribute of each link element that matches the predicate, plus the text content of every link element.

So given a fragment like

  <link>test</link>
  <link href="http://www.google.com">Google</link>
  <link rel="zzzz" href="http://www.stackoverflow.com">Stack Overflow</link>

you'd get:

test
http://www.google.com
Google
Stack Overflow
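
To double-check that result, here is a minimal Python sketch that emulates the expression by hand. It does not run the XPath itself: the standard library's xml.etree.ElementTree only supports a limited XPath subset without the | operator, so the predicate and the union are spelled out in plain Python. The <root> wrapper and the function names are my own assumptions, added only so the fragment parses and has a context node.

```python
import xml.etree.ElementTree as ET

# The fragment from above, wrapped in a hypothetical <root> element so it
# parses as one document; <root> stands in for the XPath context node.
FRAGMENT = """<root>
  <link>test</link>
  <link href="http://www.google.com">Google</link>
  <link rel="zzzz" href="http://www.stackoverflow.com">Stack Overflow</link>
</root>"""


def matches_predicate(link):
    """Mirror [@rel='alternate' and @type='text/html' or not(@rel)]."""
    rel = link.get("rel")  # None when the attribute is missing
    return (rel == "alternate" and link.get("type") == "text/html") or rel is None


def evaluate(context):
    """Emulate link[...]/@href | link/text(), results in document order."""
    results = []
    for link in context.findall("link"):  # child <link> elements only
        # An attribute node comes before the element's children in document order.
        if matches_predicate(link) and link.get("href") is not None:
            results.append(link.get("href"))
        # text() selects every text node child: the leading text plus the
        # text that follows each child element (ElementTree's "tail").
        texts = [link.text] + [child.tail for child in link]
        results.extend(t for t in texts if t)
    return results


results = evaluate(ET.fromstring(FRAGMENT))
for value in results:
    print(value)
```

Note that the first link matches the predicate (it has no rel) but contributes no href because it has none, and the third link fails the predicate because its rel is neither missing nor "alternate".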
ChrisCM
A: 

The symbol | is the union operator. It returns all nodes that match either the left-hand side or the right-hand side.

What that XPath says is:

  • Grab the href attribute of all link elements that have both rel="alternate" and type="text/html", or that have no rel attribute at all.

Also, because of the union:

  • Grab the text of every link element in the same context, regardless of its attributes.

Kind of a weird XPath, but that is what it does.
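
The reading above depends on operator precedence: in XPath, and binds more tightly than or, so the predicate groups as (rel and type) or not(rel). Python's and/or happen to follow the same precedence, so a rough sketch in plain Python can show the difference. The function names are made up for illustration, and None stands in for a missing attribute.

```python
def xpath_grouping(rel=None, type_=None):
    # Mirrors the predicate as written: (rel and type) or no rel.
    return rel == "alternate" and type_ == "text/html" or rel is None


def other_grouping(rel=None, type_=None):
    # What it would mean if `or` bound tighter than `and` (it does not).
    return rel == "alternate" and (type_ == "text/html" or rel is None)


# A link with no rel attribute and some other type.
no_rel = {"rel": None, "type_": "application/rss+xml"}
print(xpath_grouping(**no_rel))  # True: links without rel are selected
print(other_grouping(**no_rel))  # False: rel would always have to be "alternate"
```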

Stargazer712
Thank you very much.
Nikita
A: 

According to XPath Operators at w3schools, it computes two node-sets. This would result in all nodes matching the expression on the left-hand side of the | operator, combined with the nodes matching the expression on the right-hand side.

Fredrik Mörk
A: 

Actually, I don't understand the symbol |.

This is the XPath union operator.

As defined in the W3C XPath 2.0 specification:

• The union and | operators are equivalent. They take two node sequences as operands and return a sequence containing all the nodes that occur in either of the operands.

Of course, the "union" operator (the English word) was only added in XPath 2.0; XPath 1.0 has only its earlier synonym, the | character.

So, in the particular case of:

link[@rel='alternate' and @type='text/html' or not(@rel)]/@href | link/text()

the XPath expression above selects the union of two sets:

  1. All nodes selected by: link[@rel='alternate' and @type='text/html' or not(@rel)]/@href

  2. All nodes selected by: link/text()

Union is a standard operation in set theory (and in mathematics generally), although the sign "∪" is used to represent union there.

To quote the definition from Wikipedia:

The union of two sets A and B is the collection of points which are in A or in B (or in both):

A ∪ B = {x : x ∈ A or x ∈ B}

A simple example:

A = {1,2,3,4,5,6}

B = {1,5,6,7,8}


A ∪ B = {1,2,3,4,5,6,7,8}
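
As a quick check of that example, Python's built-in sets happen to use the same | symbol for union, so the sets above can be tried directly. This is only an illustration of set union, not of XPath itself.

```python
A = {1, 2, 3, 4, 5, 6}
B = {1, 5, 6, 7, 8}

# On Python sets, | is the union operator.
union = A | B
print(sorted(union))

# Elements present in both operands (1, 5 and 6) appear only once.
print(len(A), len(B), len(union))
```

The same holds for the XPath operator: a node selected by both operands appears only once in the result.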
Dimitre Novatchev