Given the following fragment of html:

<fieldset>
  <legend>My Legend</legend>
  <p>Some text</p>
  Text to capture
</fieldset>

Is there an XPath expression that will return only the 'Text to capture' text node?

Trying

/fieldset/text()
yields three nodes, not just the one I need.

+3  A: 

Assuming you want the text node that contains non-whitespace text:

//fieldset/text()[normalize-space(.)]

If what you want is the last text node, then:

//fieldset/text()[last()]
Steven D. Majewski
The `.` is optional for `normalize-space()`; when no argument is given, it operates on the context node.
Tomalak
+2  A: 

I recommend you accept Steven D. Majewski's answer, but here is the explanation (text nodes highlighted with square brackets):

<fieldset>[
  ]<legend>My Legend</legend>[
  ]<p>Some text</p>[
  Text to capture
]</fieldset>

So /fieldset/text() returns three text nodes:

  • "\n  "
  • "\n  "
  • "\n  Text to capture\n"
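To see these three nodes for yourself, here is a minimal sketch using only Python's standard library. It does not evaluate XPath; it walks the direct children of the root element with `xml.dom.minidom` and keeps the text nodes, which is what `/fieldset/text()` selects. The names `FRAGMENT` and `child_text_nodes` are made up for this illustration.

```python
from xml.dom import minidom

# The fragment from the question, indented with two spaces as shown there.
FRAGMENT = """<fieldset>
  <legend>My Legend</legend>
  <p>Some text</p>
  Text to capture
</fieldset>"""


def child_text_nodes(xml_source):
    """Return the data of every text node that is a direct child of the root.

    Hypothetical helper: it mirrors what /fieldset/text() selects, but it
    walks the DOM instead of evaluating an XPath expression.
    """
    root = minidom.parseString(xml_source).documentElement
    return [node.data for node in root.childNodes
            if node.nodeType == node.TEXT_NODE]


for text in child_text_nodes(FRAGMENT):
    print(repr(text))
```

The first two nodes contain nothing but the line break and indentation between elements, which is why the plain `text()` step returns more than the one node the question is after.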

And this is why you want /fieldset/text()[normalize-space()]. The predicate only filters nodes; the node it selects still carries its leading and trailing whitespace, so trim the result before use.

Also note that the above is short for /fieldset/text()[normalize-space(.) != '']. A predicate converts a string result to a boolean: when normalize-space() returns a non-empty string, the predicate evaluates to true, while the empty string evaluates to false.
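The string-to-boolean rule can be imitated in plain Python as a rough sketch. The `normalize_space` function below is an approximation of the XPath 1.0 function (trim both ends, collapse runs of space, tab, CR and LF into one space), and `non_blank_text_children` is a hypothetical helper that emulates the predicate by walking the DOM rather than by running XPath. Python treats the empty string as falsy, which happens to match the predicate's behaviour.

```python
import re
from xml.dom import minidom

FRAGMENT = """<fieldset>
  <legend>My Legend</legend>
  <p>Some text</p>
  Text to capture
</fieldset>"""


def normalize_space(value):
    """Approximate XPath 1.0 normalize-space(): trim and collapse whitespace."""
    parts = re.split(r"[ \t\r\n]+", value)
    return " ".join(part for part in parts if part)


def non_blank_text_children(xml_source):
    """Emulate /fieldset/text()[normalize-space()] and trim each match.

    Hypothetical helper: keeps a text node only when its normalized value
    is non-empty, just as the predicate's boolean conversion does.
    """
    root = minidom.parseString(xml_source).documentElement
    return [
        normalize_space(node.data)
        for node in root.childNodes
        if node.nodeType == node.TEXT_NODE
        and normalize_space(node.data)  # "" is falsy, so blank nodes drop out
    ]


print(non_blank_text_children(FRAGMENT))
```

Returning the normalized value rather than `node.data` also takes care of the trimming step; an XPath engine would hand back the untrimmed node.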

Tomalak