I'm trying to extract some info from a table-based website with Hpricot. I got the XPath from Firebug.
This doesn't work... Apparently, Firebug's XPath is the path in the rendered HTML, and not in the actual HTML served by the site. I read that removing tbody may resolve the problem.
I tried with:
And it still doesn't work... I did a little more research, and some people report that they got their XPath working by removing the numbers (the positional indices), so I tried this:
Still no luck...
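For reference, the two adjustments described above (dropping tbody and dropping the numbers) can be sketched in plain Ruby. This is only an illustration: the helper name `simplify_firebug_xpath` and the sample path are made up for the example and are not part of Hpricot or Firebug.

```ruby
# Hypothetical helper: adapts an XPath copied from Firebug to the raw HTML source.
def simplify_firebug_xpath(xpath, strip_indices: false)
  # Browsers add <tbody> to rendered tables; the served HTML often has none,
  # so drop every tbody step (with or without a positional index).
  steps = xpath.split("/").reject { |step| step =~ /\Atbody(\[\d+\])?\z/ }
  # Optionally drop positional predicates such as [2] from each step.
  steps = steps.map { |step| step.sub(/\[\d+\]\z/, "") } if strip_indices
  steps.join("/")
end

sample = "/html/body/div[2]/table/tbody/tr[3]/td[1]" # made-up path for illustration
puts simplify_firebug_xpath(sample)
puts simplify_firebug_xpath(sample, strip_indices: true)
```

Note that removing the indices changes what the path means: `div` matches every `div` child instead of only the second one, so the result can select more nodes than the Firebug path did.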
So I decided to do it step by step, like this:
(doc/"html/body/div/table/tr").each do |aaa|
  (aaa/"td").each do |bbb|
    pp bbb  # the info I need shows up here
    (bbb/"table/tr").each do |ccc|
      pp ccc  # ...but not here
    end
  end
end
I find the info I need in bbb, but not in ccc.
What am I doing wrong? Or is there a better tool for scraping HTML with long/complex XPaths?