When parsing a bunch of HTML held as plain text, is regex the best way to extract and examine all the anchor tags, or is there anything built into the .NET library?
+1
A:
Regex is your pal here. There is no HTML parser built into the BCL.
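As a rough illustration of the regex route, here is a minimal sketch. The pattern, the class name AnchorRegexSketch and the method ExtractAnchors are made up for this example and are not from any library. It assumes href values are quoted and that anchors are not nested, so it will miss or mangle anything outside those assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class AnchorRegexSketch
{
    // Hypothetical pattern: matches <a ... href="..."> or <a ... href='...'>
    // followed by the link text up to the first </a>.
    // Unquoted href values are deliberately not handled.
    private static readonly Regex AnchorPattern = new Regex(
        @"<a\b[^>]*?\bhref\s*=\s*(?:""(?<url>[^""]*)""|'(?<url>[^']*)')[^>]*>(?<text>.*?)</a>",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Returns (href, inner text) pairs in document order.
    public static List<KeyValuePair<string, string>> ExtractAnchors(string html)
    {
        var result = new List<KeyValuePair<string, string>>();
        foreach (Match m in AnchorPattern.Matches(html))
        {
            result.Add(new KeyValuePair<string, string>(
                m.Groups["url"].Value,
                m.Groups["text"].Value));
        }
        return result;
    }

    public static void Main()
    {
        string html =
            "<p>See <a href=\"http://example.com/\">Example</a> " +
            "and <A HREF='/about'>About</A>.</p>";

        foreach (var anchor in ExtractAnchors(html))
        {
            Console.WriteLine(anchor.Key + " -> " + anchor.Value);
        }
    }
}
```

The Singleline option lets the link text span line breaks, and the lazy `.*?` stops at the first closing tag. Real-world markup tends to break patterns like this quickly, so treat it as a starting point only.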
If your input is XHTML (or otherwise well-formed XML), you can use XML and XPath: load the document into an XmlDocument and select all the a nodes.
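A minimal sketch of the XmlDocument approach, assuming the input really is well-formed. The class name XhtmlAnchorSketch and the method ExtractHrefs are hypothetical. The XPath uses local-name() so that it matches a elements whether or not the document declares the XHTML namespace, which is a choice made for this example. Input that uses HTML entities such as &nbsp; without a DTD, or that is not well-formed, will make LoadXml throw.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class XhtmlAnchorSketch
{
    // Returns the href value of every <a> element that has one.
    public static List<string> ExtractHrefs(string xhtml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xhtml); // throws XmlException if the markup is not well-formed

        var hrefs = new List<string>();

        // local-name() sidesteps the need for an XmlNamespaceManager
        // when the document uses the XHTML default namespace.
        XmlNodeList nodes = doc.SelectNodes("//*[local-name()='a'][@href]");
        foreach (XmlNode node in nodes)
        {
            hrefs.Add(node.Attributes["href"].Value);
        }
        return hrefs;
    }

    public static void Main()
    {
        string xhtml =
            "<html xmlns=\"http://www.w3.org/1999/xhtml\"><body><p>" +
            "<a href=\"a.html\">A</a> <a href=\"b.html\">B</a> " +
            "<a name=\"top\">no href</a>" +
            "</p></body></html>";

        foreach (string href in ExtractHrefs(xhtml))
        {
            Console.WriteLine(href);
        }
    }
}
```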
Oded
2010-01-06 10:19:10
+1
A:
Regex is good. However, I find the HTML Agility Pack to be a little more forgiving, and it is what I would use in this situation.
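For comparison, a minimal sketch of the same extraction with the HTML Agility Pack. It assumes the HtmlAgilityPack package is referenced by the project. The class name AgilityPackAnchorSketch and the method ExtractHrefs are hypothetical. The sample input is deliberately sloppy (unclosed p, unquoted attribute) to show the kind of markup that would defeat an XML parser.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack; // third-party library, assumed to be installed

public static class AgilityPackAnchorSketch
{
    // Returns the href value of every <a> element that has one.
    public static List<string> ExtractHrefs(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html); // tolerant of malformed markup

        var hrefs = new List<string>();

        // SelectNodes returns null, not an empty collection, when nothing matches.
        HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes("//a[@href]");
        if (nodes == null)
        {
            return hrefs;
        }

        foreach (HtmlNode node in nodes)
        {
            hrefs.Add(node.GetAttributeValue("href", string.Empty));
        }
        return hrefs;
    }

    public static void Main()
    {
        string html = "<p><a href=\"one.html\">one</a><br><a href=two.html>two</a>";

        foreach (string href in ExtractHrefs(html))
        {
            Console.WriteLine(href);
        }
    }
}
```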
Joel Cunningham
2010-01-06 10:22:55