ansaurus

Question

How to define an xpath expression that only retrieves hyphenated elements from the first of two similar divs?

Answer 1

+2 A:

Your XPATH was matching on any font element that is a descendant of <div class="top-container">.

div[1] will address the first div child element of the "top-container" element. If you add that to your XPATH, it will return the desired results.

//div[contains(concat(' ',@class,' '),' top-container ')]/div[1]//font/text()

If you want to ensure that only text() nodes that contain "-" are addressed, then you should also add a predicate filter to the text().

//div[contains(concat(' ',@class,' '),' top-container ')]/div[1]//font/text()[contains(.,'-')]
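The two expressions above can be tried against a small document. This is only a sketch: the markup in SAMPLE is hypothetical (the question's real HTML is not reproduced here), and the class and method names are made up for illustration. It uses the XPath 1.0 engine that ships with the JDK.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class FirstDivXPathDemo {
    // Hypothetical markup: two similar divs inside a "top-container" div.
    static final String SAMPLE =
        "<html><body><div class='main top-container'>"
      + "<div><font>alpha-one</font><font>plain</font><font>beta-two</font></div>"
      + "<div><font>gamma-three</font></div>"
      + "</div></body></html>";

    // Text nodes of every font element under the first child div.
    static final String BASE =
        "//div[contains(concat(' ',@class,' '),' top-container ')]/div[1]//font/text()";

    // Evaluates an XPath 1.0 expression against SAMPLE and returns
    // the value of each selected node, in document order.
    static List<String> select(String expression) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath().evaluate(
                expression, new InputSource(new StringReader(SAMPLE)), XPathConstants.NODESET);
            List<String> values = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                values.add(nodes.item(i).getNodeValue());
            }
            return values;
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(select(BASE));                       // [alpha-one, plain, beta-two]
        System.out.println(select(BASE + "[contains(.,'-')]")); // [alpha-one, beta-two]
    }
}
```

Nothing from the second div ("gamma-three") is returned in either case, because div[1] restricts the match to the first child div.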

Instead of checking only for nodes that contain "-", how would you modify the last expression to just check for non-empty strings?

If you want to return any text() node with a value, then the predicate filter on text() is not necessary. The XPath data model has no empty text nodes: every text node holds at least one character, so text() alone never selects an empty one.

However, if you only want to select text() nodes that contain text other than whitespace, you could use this expression:

//div[contains(concat(' ',@class,' '),' top-container ')]/div[1]//font/text()[normalize-space()]

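normalize-space() strips leading and trailing whitespace and collapses internal runs of whitespace into a single space. So, if a text() node contains only whitespace, the result is an empty string, which evaluates to false() in the predicate filter, and only text() nodes containing something other than whitespace are selected. Note that XPath counts only space, tab, carriage return, and line feed as whitespace; a non-breaking space (&nbsp;) is not stripped.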
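A quick way to see the effect of the normalize-space() predicate is to run it over text nodes that hold only whitespace. The sketch below again uses the JDK's built-in XPath 1.0 engine; the markup and the names are hypothetical. It also includes a font element holding a single non-breaking space (&#160;), on the understanding that XPath 1.0 treats only space, tab, carriage return, and line feed as whitespace.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class NormalizeSpaceDemo {
    // Hypothetical markup: the first div holds a spaces-only node,
    // a padded hyphenated value, and a lone non-breaking space.
    static final String SAMPLE =
        "<div class='top-container'>"
      + "<div><font>   </font><font> a-b </font><font>&#160;</font></div>"
      + "<div><font>c-d</font></div>"
      + "</div>";

    static final String BASE =
        "//div[contains(concat(' ',@class,' '),' top-container ')]/div[1]//font/text()";

    // Evaluates an XPath 1.0 expression against SAMPLE and returns
    // the value of each selected text node, in document order.
    static List<String> select(String expression) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath().evaluate(
                expression, new InputSource(new StringReader(SAMPLE)), XPathConstants.NODESET);
            List<String> values = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                values.add(nodes.item(i).getNodeValue());
            }
            return values;
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Without the predicate, all three text nodes are selected.
        System.out.println(select(BASE).size());                        // 3
        // With it, the spaces-only node drops out; the non-breaking
        // space is not XPath whitespace, so that node is kept.
        System.out.println(select(BASE + "[normalize-space()]").size()); // 2
    }
}
```

The selected nodes keep their original value (" a-b " still has its padding), since the predicate only decides which nodes are selected and does not change them.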

Mads Hansen 2010-10-31 13:17:05

Thanks. This is great. Instead of checking only for nodes that contain "-", how would you modify the last expression to just check for non-empty strings?

August 2010-10-31 13:44:30

Great answer. This is awesome! Thanks so much!

August 2010-10-31 14:13:52

The boolean value of a string is true only if it is not the empty string. So `text()[normalize-space()]` is enough to select text nodes that are not whitespace-only. Also, if each `font` element contains only a text node, then `font[contains(.,'-')]` is enough to select `font` elements that have a `-` character in their string value. Finally, if you really want to test `@class`, use `contains(concat(' ',@class,' '),' class-to-test ')`.

Alejandro 2010-10-31 20:00:27

Good points @Alejandro. I've updated the answer to use the simplified predicate filter for non-whitespace `text()` and a safer match for the `@class` value.

Mads Hansen 2010-10-31 20:33:39

Also +1 for a good answer.

Alejandro 2010-10-31 20:35:56
