I want to scrape a list of facts from a simple website. Each fact is enclosed in a <li> tag. How would I do this using Html Agility Pack? Is there a better approach?
The <li> tags contain only the facts and nothing else.
Something like:
// Assumes doc is an HtmlAgilityPack.HtmlDocument that has already been loaded.
List<string> facts = new List<string>();
foreach (HtmlNode li in doc.DocumentNode.SelectNodes("//li")) {
    facts.Add(li.InnerText);
}
How about a simple regex?
' Regex.Matches takes the input string first, then the pattern.
Dim tMatch As Match = Nothing
For Each tMatch In Regex.Matches(tHTMLString, "<li>(?<Fact>.*?)</li>")
    Console.WriteLine(tMatch.Groups("Fact").Value)
Next
Note that SelectNodes returns null when "... no node matched the XPath expression".
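To make the null case concrete, here is a minimal sketch of the Html Agility Pack approach with a guard around SelectNodes. The class name FactScraper and the method ExtractFacts are hypothetical names chosen for illustration, and the sketch assumes the HTML is already available as a string and that the HtmlAgilityPack NuGet package is referenced.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack; // assumes the HtmlAgilityPack NuGet package is installed

public static class FactScraper // hypothetical name, for illustration only
{
    // Returns the text of every <li> element found in the given HTML string.
    public static List<string> ExtractFacts(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var facts = new List<string>();

        // SelectNodes returns null, not an empty collection, when nothing matches.
        HtmlNodeCollection items = doc.DocumentNode.SelectNodes("//li");
        if (items == null)
        {
            return facts;
        }

        foreach (HtmlNode li in items)
        {
            // Decode entities such as &amp; and trim surrounding whitespace.
            facts.Add(HtmlEntity.DeEntitize(li.InnerText).Trim());
        }

        return facts;
    }
}
```

Calling FactScraper.ExtractFacts on a page without any <li> elements should then yield an empty list rather than throwing a NullReferenceException inside the foreach loop. Decoding entities and trimming are optional extras, not something the original answer does.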