I am more or less repeating an earlier question because, mostly due to my own ignorance, I could not fully understand the innards.

Given this HTML snippet:

<td valign=top class="tim_new">
  <a href="/stocks/company_info/pricechart.php?sc_did=MI42" class="tim_new">3M India</a>
</td>
<td class="tim_new" valign=top>
  <a href='/stocks/marketstats/indcomp.php?optex=NSE&indcode=Diversified' class=tim>Diversified</a>
</td>

How does the XPath //a[@class='tim_new'] differentiate between line 1 and line 2?

+2  A: 

Break down your XPath:

// - This will search the whole document for a match, at any depth, instead of following an explicit "path" from the root.

a - This will match all a elements. Therefore your other elements (td in this case) will be ignored.

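[@class='tim_new'] - This will keep only those elements that have an attribute called class with a value of exactly tim_new. The whole attribute value is compared, so class=tim does not match.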

So all together, your XPath will look everywhere in your input XML (HTML in this case) for an a element which has an attribute class with a value of tim_new.

If you wanted to match the td elements instead, you'd use //td[@class='tim_new'].
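
To see the difference between the two expressions, here is a minimal sketch using Python's standard-library xml.etree.ElementTree. It rests on some assumptions: the snippet has been tidied into well-formed XML (attributes quoted, & escaped, and a tr wrapper added as a root), and ElementTree only supports a subset of XPath and wants a leading "." on the expression. Real-world HTML would normally need a proper HTML parser instead.

```python
import xml.etree.ElementTree as ET

# Assumption: the question's snippet, tidied into well-formed XML
# (attributes quoted, "&" escaped, wrapped in a <tr> root element).
SNIPPET = """
<tr>
  <td valign="top" class="tim_new">
    <a href="/stocks/company_info/pricechart.php?sc_did=MI42" class="tim_new">3M India</a>
  </td>
  <td class="tim_new" valign="top">
    <a href="/stocks/marketstats/indcomp.php?optex=NSE&amp;indcode=Diversified" class="tim">Diversified</a>
  </td>
</tr>
"""

root = ET.fromstring(SNIPPET)

# ".//a[...]" looks for <a> elements at any depth below the root
# and keeps those whose class attribute is exactly "tim_new".
links_tim_new = root.findall(".//a[@class='tim_new']")

# Same predicate, but applied to <td> elements instead.
cells_tim_new = root.findall(".//td[@class='tim_new']")

print([a.text for a in links_tim_new])  # ['3M India']
print(len(cells_tim_new))               # 2
```

Only the first a is returned, because the second a has class tim even though its parent td has class tim_new. The predicate is tested against the a element itself, not against its parent.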

Graham Clark
So given this logic, does it follow that `//a[@class=tim]` will look for lines with characteristics similar to those of the second line?
Soham
@Soham: The XPath `//a[@class='tim']` will match the second `a` element in your example HTML.
Graham Clark
Oh, so the attribute value is being matched, and not strings per se. For example, how do you suggest I extract the hyperlinked word in the second line, which in this case is "Diversified"? Mind you, there are many other lines in the same XML/HTML file with an `a` element.
Soham
So, at first I thought `//a` would yield something interesting, but it returns every `a` element that way.
Soham
Okay, I don't know if it's possible to extract that using XPath, but .SELECT(n => n.InnerText); solved it. I would still like to hear your thoughts.
Soham
@Soham: You could use `//a[@class='tim']/text()` to get this text.
Graham Clark
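
A rough sketch of the same idea in Python, for anyone following along. It assumes the snippet has been tidied into well-formed XML with a tr wrapper as the root. The standard-library xml.etree.ElementTree does not support the `text()` step, so this reads the matched element's text instead, which gives the same string here. A full XPath engine would accept `//a[@class='tim']/text()` directly.

```python
import xml.etree.ElementTree as ET

# Assumption: the question's snippet, tidied into well-formed XML
# (attributes quoted, "&" escaped, wrapped in a <tr> root element).
SNIPPET = """
<tr>
  <td valign="top" class="tim_new">
    <a href="/stocks/company_info/pricechart.php?sc_did=MI42" class="tim_new">3M India</a>
  </td>
  <td class="tim_new" valign="top">
    <a href="/stocks/marketstats/indcomp.php?optex=NSE&amp;indcode=Diversified" class="tim">Diversified</a>
  </td>
</tr>
"""

root = ET.fromstring(SNIPPET)

# The predicate compares the whole attribute value, so class="tim_new"
# is not picked up by [@class='tim'].
tim_links = root.findall(".//a[@class='tim']")

# findtext() returns the text of the first matching element.
link_text = root.findtext(".//a[@class='tim']")

print(len(tim_links))  # 1
print(link_text)       # Diversified
```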
Oh, brilliant! Why didn't I think of that! Interesting, it gave me some good insight into XPath's flexibility.
Soham
Yes, it worked. Thanks for the heads-up!
Soham