Why is the HTML Agility Pack used to parse information from an HTML file? Isn't there a built-in or native library in .NET for parsing HTML? If there is, what is the problem with the built-in support? What are the benefits of using the HTML Agility Pack over the built-in support for parsing information from an HTML file?

+2  A: 

There is no HTML parser in the BCL (the .NET Base Class Library), which is why the HTML Agility Pack is recommended by so many.
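
For illustration, a minimal sketch of what parsing with the HTML Agility Pack can look like. It assumes the `HtmlAgilityPack` NuGet package is referenced; the module name and the sample markup are made up for this example, and the markup is deliberately not well-formed.

```vbnet
' Assumption: the HtmlAgilityPack NuGet package is referenced by the project.
Imports System
Imports HtmlAgilityPack

Module HapSketch
    Sub Main()
        ' Deliberately sloppy HTML: unclosed <p> and <br>, unquoted attribute value.
        Dim html As String = "<html><body><p class=intro>Hello<br>world<p>Second</body></html>"

        Dim doc As New HtmlDocument()
        doc.LoadHtml(html)   ' the input does not have to be well-formed XML

        ' XPath query; SelectSingleNode returns Nothing when nothing matches.
        Dim node As HtmlNode = doc.DocumentNode.SelectSingleNode("//p[@class='intro']")
        If node IsNot Nothing Then
            Console.WriteLine(node.InnerText)
        End If
    End Sub
End Module
```

The point of the sketch is only that the library accepts real-world HTML and still exposes an XPath-style API, much like `XmlDocument` does for XML.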

Oded
@Oded, is there an XML parsing library in .NET? If so, can't we use it for HTML parsing?
Harikrishna
@Harikrishna - There is `XmlDocument` in the `System.Xml` namespace, but HTML is **not** XML. If you have an XHTML document, you can try to parse it with `XmlDocument`.
Oded
A: 

In one of my applications, I have an HTML template saved in an HTML file. I load it and replace some marker nodes with values. In this case I use the .NET `XmlDocument` class and it works fine, at least in this controlled environment. I don't know what would happen if I tried to parse malformed HTML.

This is a sample of my code:

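' Load the template; it must be well-formed XML for XmlDocument to accept it.
Dim S As String = System.IO.File.ReadAllText("Mytemplate.html")

Dim dXML As New System.Xml.XmlDocument()
dXML.LoadXml(S)

' Find the marker node; SelectSingleNode returns Nothing if there is no match.
Dim N As System.Xml.XmlNode = dXML.SelectSingleNode("descendant::NodeToFind")
If N IsNot Nothing Then
    N.InnerText = "Text inside the node"
End If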

As I said, this works fine, but if you want to do something more specific to HTML, I guess it would be a good idea to use the HTML Agility Pack.
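
On the malformed-HTML question, here is a small check that uses only `System.Xml`. It is a sketch, not part of my application: the module name, the helper `LoadsAsXml`, and the sample strings are made up for this example. It shows `XmlDocument.LoadXml` accepting XHTML-style markup and rejecting markup with unclosed tags.

```vbnet
Imports System
Imports System.Xml

Module XmlVersusHtml
    ' Hypothetical helper: True when the markup loads as XML,
    ' False when the XML parser rejects it with an XmlException.
    Function LoadsAsXml(markup As String) As Boolean
        Dim doc As New XmlDocument()
        Try
            doc.LoadXml(markup)
            Return True
        Catch ex As XmlException
            Return False
        End Try
    End Function

    Sub Main()
        ' XHTML-style markup: every element is closed, so it is well-formed XML.
        Console.WriteLine(LoadsAsXml("<html><body><p>Hi<br /></p></body></html>"))   ' prints True

        ' Typical HTML: <p> and <br> are left unclosed, so it is not well-formed XML.
        Console.WriteLine(LoadsAsXml("<html><body><p>Hi<br></body></html>"))         ' prints False
    End Sub
End Module
```

So the `XmlDocument` approach is fine as long as you control the template and keep it well-formed; for HTML you do not control, `LoadXml` will throw.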

ACB