ansaurus

Question

Could the value of an html anchor tag be fetched using xpath?

Answer 1

A:

Why would you use an XML parser to parse HTML? I would suggest using a dedicated Java HTML parser; there are many, though I haven't tried any myself.

As for whether it would work: I suspect not. You will get an error when trying to parse the HTML as XML, right at &nbsp; if not earlier.
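A minimal sketch of that failure, assuming the standard JAXP parser that ships with the JDK. The class name EntityCheck and the sample markup are made up for illustration. XML only predefines a handful of entities, so a named HTML entity with no DTD declaring it is rejected, while the numeric form &#160; is accepted.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper, not part of any library.
final class EntityCheck {
    // Returns true if the strict XML parser accepts the markup, false if it rejects it.
    static boolean parsesAsXml(String markup) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        try {
            builder.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (SAXException e) {
            // Well-formedness errors, such as an undeclared entity, end up here.
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // Named HTML entity, not declared anywhere: rejected by the XML parser.
        System.out.println(parsesAsXml("<td>a&nbsp;b</td>"));
        // Numeric character reference for the same character: accepted.
        System.out.println(parsesAsXml("<td>a&#160;b</td>"));
    }
}
```

The default parser may also log the fatal error to standard error before the exception is caught.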

hhafez 2010-01-07 05:13:15

Answer 2

+1 A:

To use XPath you usually need XML, not HTML, but some parsers (e.g. the one built into PHP) have a relaxed mode that will parse most HTML, too.
If you want to find all <a> elements that are direct children of <td class="blah">, the XPath you need is

//td[@class = 'blah']/a
or
//td[@class = 'blah']/a[@href = 'http://...']

(depending on whether you want only the one URL or all URLs)
This will give you a set of nodes. Iterate through it and, for each node, check the number of child nodes (supposed to be 1) and the nodeType of the firstChild (supposed to be a text node). Then the firstChild will contain the ????
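The iteration described above could look roughly like this in Java, assuming the input is already well-formed XML (for example XHTML, or HTML cleaned up by a tolerant parser). The class name AnchorText, the method extract and the sample markup are made up for illustration; the XPath and DOM calls are the standard javax.xml.xpath and org.w3c.dom ones.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of any library.
final class AnchorText {
    // Returns the text of every <a> that is a direct child of <td class="blah">.
    static List<String> extract(String xhtml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xhtml)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList anchors = (NodeList) xpath.evaluate(
                "//td[@class = 'blah']/a", doc, XPathConstants.NODESET);

        List<String> result = new ArrayList<>();
        for (int i = 0; i < anchors.getLength(); i++) {
            Node anchor = anchors.item(i);
            Node first = anchor.getFirstChild();
            // Accept only anchors whose single child is a text node.
            if (anchor.getChildNodes().getLength() == 1
                    && first.getNodeType() == Node.TEXT_NODE) {
                result.add(first.getNodeValue());
            }
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        String xhtml = "<table><tr>"
                + "<td class=\"blah\"><a href=\"http://example.com/\">Example</a></td>"
                + "<td class=\"other\"><a href=\"http://example.org/\">Skipped</a></td>"
                + "</tr></table>";
        System.out.println(extract(xhtml)); // prints [Example]
    }
}
```

If the anchors may contain nested markup such as <b> or <span>, calling getTextContent() on the anchor node avoids the single-child check, at the cost of flattening the nested elements.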

Mene 2010-01-07 16:30:22
