views:

78

answers:

1

Hi,

If I have a string that contains the html from a page I just got returned from an HTTP Post, how can I turn that into something that will let me easily traverse the DOM?

I figured HtmlDocument object would make sense, but it has no constructor. Are there any types that allow for easy management of HTML DOM?

Thanks,
Matt

+7  A: 

The HtmlDocument is an instance of a document that is already loaded by a WebBrowser control. Thus no ctor.

Html Agility Pack is by far the best library I have used to this purpose

An example from the codeplex wiki

HtmlDocument doc = new HtmlDocument();
 doc.Load("file.htm");
 foreach(HtmlNode link in doc.DocumentElement.SelectNodes("//a[@href"])
 {
    HtmlAttribute att = link["href"];
    att.Value = FixLink(att);
 }
 doc.Save("file.htm");

The example shows loading of a file but there are overloads that let you load a string or a stream.

Sky Sanders
Awesome, thanks!
Matt