+2  Q: 

HTML Parser

Does anyone know of an HTML parser for VB.NET or C#? I know .NET has a lot of XML support, like XmlReader and XmlWriter. Is there an equivalent HtmlReader or HtmlWriter?

Ultimately, what I'd like is a library that will parse an HTML file and raise events based on the tags it finds. Does anyone know of a library that does this?

+4  A: 

The HTML Agility Pack is the way to go if you want to parse HTML (it even does a good job on tag soup). Theoretically, the XML parser included in the BCL should be able to parse valid XHTML, but the HTML Agility Pack is a more general solution that can handle ordinary HTML, XHTML, and messy variants of both.
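
For the XHTML case, here is a rough sketch of what using the BCL's XmlReader might look like. The class and method names (XhtmlTagReader, ReadElementNames) are made up for illustration, and ignoring the DOCTYPE is an assumption on my part. With the DTD ignored, named entities such as &nbsp; may fail to resolve, and anything that is not well-formed XML throws, which is exactly why this only suits clean XHTML.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

// Hypothetical helper for illustration; only XmlReader and its settings come from the BCL.
public static class XhtmlTagReader
{
    // Returns element names in document order.
    // Throws XmlException if the input is not well-formed XML.
    public static List<string> ReadElementNames(string xhtml)
    {
        var names = new List<string>();
        var settings = new XmlReaderSettings
        {
            // Assumption: skip any DOCTYPE instead of resolving the XHTML DTD.
            DtdProcessing = DtdProcessing.Ignore
        };

        using (var reader = XmlReader.Create(new StringReader(xhtml), settings))
        {
            // Pull-style loop: each Read() advances to the next node.
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element)
                {
                    names.Add(reader.LocalName);
                }
            }
        }

        return names;
    }
}
```

Ordinary HTML with unclosed tags will make this throw, so treat it as a fallback for well-formed input only.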

Raising events when tags are found is something you'll have to implement yourself, of course, but it should be fairly trivial: load the markup into an HtmlDocument and walk its nodes.
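
As a minimal sketch of that idea, assuming the HtmlAgilityPack NuGet package is referenced: the wrapper below parses the markup into an HtmlDocument and raises an event for every element it encounters. TagEventParser, TagFound, and Demo are hypothetical names of my own, not part of the library; only HtmlDocument, LoadHtml, DocumentNode, Descendants, and HtmlNodeType come from the HTML Agility Pack.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack; // external dependency: the HtmlAgilityPack NuGet package

// Hypothetical wrapper, not part of the HTML Agility Pack.
public sealed class TagEventParser
{
    // Raised once for every element node, in document order.
    public event Action<HtmlNode> TagFound;

    public void Parse(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        foreach (HtmlNode node in doc.DocumentNode.Descendants())
        {
            // Skip text and comment nodes; only elements count as tags here.
            if (node.NodeType == HtmlNodeType.Element)
            {
                TagFound?.Invoke(node);
            }
        }
    }
}

public static class Demo
{
    // Subscribes to the event and collects the name of each tag that was reported.
    public static List<string> CollectTagNames(string html)
    {
        var names = new List<string>();
        var parser = new TagEventParser();
        parser.TagFound += node => names.Add(node.Name);
        parser.Parse(html);
        return names;
    }
}
```

Note that this fires events after the whole document has been parsed into a tree, not while it streams, so it is not a true SAX-style parser. For most pages that difference shouldn't matter.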

Noldorin
I've used it in production code and been very pleased.
mkelley33
I also use it in production - works well!
Dror