ansaurus

Question

Answer 1

+1 A:

OK, so it appears that my XPaths have tbody steps in them. When I manually remove these tbody steps from the XPath, HtmlAgilityPack handles it fine.

I'd still like to know why I am getting invalid XPaths, but for now I have answered my own question.
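For anyone who would rather not edit the paths by hand, the manual fix above can be sketched as a small helper. This is only an illustration: XPathCleaner and StripTbody are hypothetical names, not part of HtmlAgilityPack, and the sample XPath in the usage note is assumed rather than taken from the question. A likely cause of the extra steps is that browsers add an implicit tbody to every table when they build the DOM, so tools that generate XPaths from the live page include an element that is not in the raw file.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper (not part of HtmlAgilityPack): removes tbody steps
// from an XPath that was generated against a browser's DOM.
public static class XPathCleaner
{
    // Matches a "/tbody" step, optionally with a positional predicate such
    // as "[1]", but only when it is followed by "/" or the end of the string,
    // so names that merely start with "tbody" are left alone.
    private static readonly Regex TbodyStep =
        new Regex(@"/tbody(\[\d+\])?(?=/|$)", RegexOptions.IgnoreCase);

    public static string StripTbody(string xpath)
    {
        return TbodyStep.Replace(xpath, string.Empty);
    }
}
```

Called as XPathCleaner.StripTbody("//table[@id='Home']/tbody/tr[3]/td"), this gives "//table[@id='Home']/tr[3]/td", which can then be passed to SelectSingleNode. If the saved file really does contain tbody elements, stripping them would break the path, so it is worth checking the raw HTML first.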

Saab 2010-10-21 03:58:29

Probably related to either the browser or the XPather app. I'm going to check it out; sounds interesting.

Anonymous Type 2010-10-21 03:59:33

Answer 2

A:

Unless my XPath knowledge is heaps flawed (probably), I think the problem is the /tbody step in your XPath expression.

When I do

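// Requires "using System.IO;" and a reference to HtmlAgilityPack.
HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
// Load the saved page from disk; the using block closes the reader.
using (StreamReader sr = new StreamReader(@"C:\gs.htm"))
{
    doc.Load(sr);
}
// Note that there is no tbody step between the table and its rows.
string xpath = @"//table[@id='Home']/tr[3]/td";
HtmlAgilityPack.HtmlNode node = doc.DocumentNode.SelectSingleNode(xpath);
// SelectSingleNode returns null when nothing matches the expression.
string test = node != null ? node.InnerText : string.Empty;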

That works fine. It returns
"COLUMBUS BLUE JACKETSGame 5 Home Game 3"
which I hope is the string you wanted.

Examining the HTML, I couldn't find a tbody element anywhere.

Anonymous Type 2010-10-21 03:58:48


Trouble Scraping .HTM File
