I have the following HTML snippet, http://paste.enzotools.org/show/1209/ , and I want to extract the tag that has a text() descendant with the value of "172.80" (it's the fourth node from that snippet). My attempts so far have been:

'descendant::td[@class="roomPrice figure" and contains(descendant::text(), "172.80")]'
'descendant::td[@class="roomPrice figure" and contains(div/text(), "172.80")]'
'descendant::td[@class="roomPrice figure" and div[contains(text(), "172.80")]]'

but none of them selects anything. Does anyone have any suggestions?

+1  A: 

When passing a node set to a function call, note that if the function signature doesn't declare a node set argument, the node set is converted using only its first node (in document order).

So, I think you need this XPath expression:

descendant::td[@class="roomPrice figure"][div[text()[contains(.,'172.80')]]]

Test for a text node child of div

or

descendant::td[@class="roomPrice figure"]
              [div[descendant::text()[contains(.,'172.80')]]]

Test for a text node descendant of div

or

descendant::td[@class="roomPrice figure"]
              [descendant::text()[contains(.,'172.80')]]

Test for a text node descendant of td
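
To make the difference concrete, here is a minimal sketch in Python. It does not run an XPath engine; it only emulates the two behaviours with the standard library. The markup in `ROW` is hypothetical (the original paste is not reproduced here), and the helper names `contains_first_node` and `contains_any_node` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup standing in for the original paste.
ROW = ET.fromstring(
    '<tr>'
    '<td class="roomPrice figure"><div>from <b>EUR</b> 172.80</div></td>'
    '<td class="roomPrice figure"><div>99.00</div></td>'
    '</tr>'
)

def contains_first_node(elem, needle):
    """Emulates contains(descendant::text(), needle):
    only the first text node of the node set is converted to a string."""
    nodes = list(elem.itertext())  # descendant text nodes, document order
    return bool(nodes) and needle in nodes[0]

def contains_any_node(elem, needle):
    """Emulates descendant::text()[contains(., needle)]:
    the predicate is tested against every text node in turn."""
    return any(needle in text for text in elem.itertext())

cells = [td for td in ROW.iter("td") if td.get("class") == "roomPrice figure"]
first_only = [td for td in cells if contains_first_node(td, "172.80")]
per_node = [td for td in cells if contains_any_node(td, "172.80")]

print(len(first_only), len(per_node))  # prints: 0 1
```

In this sample the first text node under the matching td is "from ", so the first-node conversion never sees "172.80", while the per-node predicate finds it.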

Alejandro
+1 beat me to it. :-) (Note typo in `descendat`.)
LarsH
@LarsH: Thanks for noticing that. Now it's correct.
Alejandro
Thank you. Your solution works.
LucianU
@LucianU: You are welcome.
Alejandro
A: 

I believe you want something like this:

<xsl:for-each select="//td[contains(string(.), '172.80')]">

The string() function gives you the concatenated text of the current node and all its descendants, whereas text() selects only the text nodes that are direct children of the current (context) node.

Of course, you can extend the XPath selector to filter on the class names too...

<xsl:for-each select="//td[contains(string(.), '172.80')][@class='roomPrice figure']">

And as stated in the comments above, your posted XML/HTML is invalid as it stands.
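
As a rough illustration of the string() versus text() distinction, here is a small Python sketch using only the standard library. It mimics the two notions rather than evaluating XPath. The td markup and the helper names `string_value` and `child_text_nodes` are assumptions made for the example, not taken from the original paste.

```python
import xml.etree.ElementTree as ET

# Hypothetical cell; the markup from the original paste is not reproduced here.
td = ET.fromstring(
    '<td class="roomPrice figure">Total <div>EUR <b>172.80</b></div></td>'
)

def string_value(elem):
    """Like XPath string(.): every descendant text node, concatenated
    in document order."""
    return "".join(elem.itertext())

def child_text_nodes(elem):
    """Like XPath text(): only the text nodes that are direct children
    of elem (its leading text plus the tails of its child elements)."""
    parts = [elem.text] + [child.tail for child in elem]
    return [p for p in parts if p]

print(string_value(td))      # prints: Total EUR 172.80
print(child_text_nodes(td))  # prints: ['Total ']
```

Here the price sits two levels down, so it shows up in the string value of the td but not among the td's own child text nodes.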

Catch22
That's one way to do it. Note that the explicit `string(.)` is redundant, as the first argument will get converted to a string implicitly. The only drawback is that every td could be converted to a string this way, which would involve a lot of unnecessary string concatenation to build strings that will be thrown away. But that may not be a problem for small web pages.
LarsH
A: 

My understanding is that you want to select the td element with the specified class that has a descendant text node containing the value "172.80".

I'm assuming the context node is the <tr> (or some ancestor of it).

The attempts you listed all suffer from the problem that contains() converts its first argument to a single string, using only the first node of the nodeset. So if the td or div has a descendant or child text node before the one that contains "172.80", the one containing "172.80" will not be noticed.

Try this:

'descendant::td[@class="roomPrice figure" and
                descendant::text()[contains(., "172.80")]]'
LarsH
Thank you too. Your solution works as well.
LucianU
@LucianU: you're welcome. You should probably upvote the answers that you think are helpful, and accept one of them.
LarsH
LarsH, I tried upvoting, but I don't have enough reputation for that. Btw, thank you also for explaining the problem clearly. I understand now what was wrong and hopefully won't repeat the mistake.
LucianU
@LucianU: ok, no problem. It's tricky, understanding which parts of XPath support "general (nodeset) comparisons" and which parts don't.
LarsH