Why is the HTML Agility Pack used to parse information from an HTML file? Isn't there a built-in or native library in .NET for parsing HTML? If there is, what is the problem with the built-in support? What are the benefits of using the HTML Agility Pack over the built-in support for parsing information from an HTML file?
views: 35
answers: 2
+2
A:
There is no HTML parser in the BCL (the .NET Base Class Library), which is why so many people recommend the HTML Agility Pack.
Oded
2010-05-27 10:18:51
@Oded, is there an XML parsing library, and if so, can't we use it for HTML parsing?
Harikrishna
2010-05-27 10:24:33
@Harikrishna - There is `XmlDocument` in the `System.Xml` namespace, but HTML is **not** XML. If you have an XHTML document, you can try to parse it with `XmlDocument`.
Oded
2010-05-27 10:32:54
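The point made in the comments above, that HTML is not XML, can be seen with a small experiment. The following is a minimal sketch, not code from the thread: the module name `XmlVersusHtml`, the helper `ParsesAsXml`, and both sample strings are made up for illustration. It relies only on `XmlDocument.LoadXml` throwing `XmlException` when the input is not well-formed XML.

```vb
Imports System
Imports System.Xml

Module XmlVersusHtml
    ' Hypothetical helper: returns True when XmlDocument accepts the markup.
    Function ParsesAsXml(markup As String) As Boolean
        Dim doc As New XmlDocument()
        Try
            doc.LoadXml(markup)
            Return True
        Catch ex As XmlException
            ' LoadXml throws XmlException when the markup is not well-formed XML.
            Return False
        End Try
    End Function

    Sub Main()
        ' Illustrative inputs only. The first closes every tag (XHTML style);
        ' the second uses a bare <br>, which is common in ordinary HTML.
        Dim xhtml As String = "<html><body><p>Hello<br/>world</p></body></html>"
        Dim html As String = "<html><body><p>Hello<br>world</p></body></html>"

        Console.WriteLine(ParsesAsXml(xhtml)) ' prints True
        Console.WriteLine(ParsesAsXml(html))  ' prints False
    End Sub
End Module
```

The unclosed `<br>` alone is enough to make the second document fail, which is why `XmlDocument` is only an option when the input is known to be well-formed.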
A:
In one of my applications, I have an HTML template saved in an HTML file. I load it and replace some node markers with values. In this case I do use the .NET `XmlDocument` class, and it works fine, at least in this controlled environment. I don't know what would happen if I tried to parse malformed HTML.
This is a sample of my code:
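' Load the template; LoadXml only succeeds if it is well-formed XML (XHTML).
Dim S As String = System.IO.File.ReadAllText("Mytemplate.html")
Dim dXML As New System.Xml.XmlDocument()
dXML.LoadXml(S)
' Find the first placeholder element anywhere in the document.
Dim N As System.Xml.XmlNode = dXML.SelectSingleNode("descendant::NodeToFind")
' SelectSingleNode returns Nothing when no node matches.
If N IsNot Nothing Then
    N.InnerText = "Text inside the node"
End If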
As I said, this works fine, but if you want to do something more specific to HTML, I guess it would be a good idea to use the HTML Agility Pack.
ACB
2010-05-27 23:27:44
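For comparison, the same template-filling idea can be sketched with the HTML Agility Pack, which loads markup that is not well-formed. This is an illustration under assumptions, not the answerer's code: it assumes the `HtmlAgilityPack` NuGet package is referenced, and the module `TemplateFill`, the function `FillTemplate`, and the `greeting` placeholder id are hypothetical names chosen for the example.

```vb
Imports System
Imports System.Net
Imports HtmlAgilityPack

Module TemplateFill
    ' Hypothetical helper: puts text into the element with id="greeting".
    ' Returns the markup unchanged when no such element is found.
    Function FillTemplate(html As String, text As String) As String
        Dim doc As New HtmlDocument()
        ' LoadHtml accepts unclosed tags instead of throwing, unlike LoadXml.
        doc.LoadHtml(html)

        Dim node As HtmlNode = doc.DocumentNode.SelectSingleNode("//span[@id='greeting']")
        If node Is Nothing Then
            Return html
        End If

        ' Encode the text so that characters such as & and < stay literal.
        node.InnerHtml = WebUtility.HtmlEncode(text)
        Return doc.DocumentNode.OuterHtml
    End Function

    Sub Main()
        ' Illustrative template with an unclosed <br> and an unclosed <p>.
        Dim template As String =
            "<html><body><p>Hello<br><span id=""greeting""></span></body></html>"
        Console.WriteLine(FillTemplate(template, "Hi & bye"))
    End Sub
End Module
```

The XPath query style is close to what `XmlDocument` offers, so moving a working `XmlDocument` template routine over to the HTML Agility Pack is mostly a matter of swapping the document and node types.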