Hi
I need to get all the content inside the body tag of a html file using c#, any good and effective ways of doing this?
Check out the HTML Agility Pack, which handles all sorts of HTML manipulation.
It gives you an interface somewhat similar to the XmlDocument XML handling interface:
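using HtmlAgilityPack;

HtmlDocument doc = new HtmlDocument();
doc.Load("file.htm"); // parses the file, tolerating malformed markup
// XPath query for the <body> element; returns null if there is none
HtmlNode bodyNode = doc.DocumentNode.SelectSingleNode("/html/body");
if (bodyNode != null)
{
    // InnerHtml holds everything between <body ...> and </body>
    string bodyContent = bodyNode.InnerHtml;
}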
It's easy enough to read the page into a string, search for the occurrences of "<body" and "</body", and do a little index math to extract what lies between them.
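Here is a rough sketch of that string-search idea, in case it helps. The class and method names (BodyExtractor, ExtractBody) are made up for illustration, and it assumes the opening body tag has no '>' inside an attribute value and that "<body" does not also appear earlier in a comment or script, so treat it as a quick hack rather than a real parser:

```csharp
using System;

public static class BodyExtractor
{
    // Hypothetical helper: returns the text between <body ...> and </body>,
    // or null if either tag cannot be found.
    public static string ExtractBody(string html)
    {
        if (html == null) return null;

        // Locate the start of the opening tag, ignoring case (<BODY> is legal HTML).
        int open = html.IndexOf("<body", StringComparison.OrdinalIgnoreCase);
        if (open < 0) return null;

        // Skip past the tag's attributes to the closing '>'.
        int start = html.IndexOf('>', open);
        if (start < 0) return null;
        start++;

        // Use the last closing tag so a stray "</body" in the content is kept.
        int end = html.LastIndexOf("</body", StringComparison.OrdinalIgnoreCase);
        if (end < start) return null;

        return html.Substring(start, end - start);
    }
}
```

You would call it with something like BodyExtractor.ExtractBody(System.IO.File.ReadAllText("file.htm")) and check the result for null before using it.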
Use the XML methods with XPath if you only want one specific node (this requires the HTML to be well-formed XML). For more advanced HTML manipulation, use the HTML Agility Pack.
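A minimal sketch of the XML route, assuming the input is well-formed XHTML with no default namespace and no DOCTYPE (if the root element declares xmlns="http://www.w3.org/1999/xhtml", the XPath needs a prefix registered through an XmlNamespaceManager). The names XhtmlBody and GetBodyInnerXml are made up for illustration:

```csharp
using System.Xml;

public static class XhtmlBody
{
    // Hypothetical helper: returns the markup inside <body>,
    // or null if the document has no /html/body element.
    public static string GetBodyInnerXml(string xhtml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xhtml); // throws XmlException if the markup is not well-formed

        XmlNode body = doc.SelectSingleNode("/html/body");
        return body?.InnerXml;
    }
}
```

Most HTML found in the wild will fail at LoadXml, which is why the HTML Agility Pack is usually the safer choice.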