I need to process HTML submitted to my web application and don't want to munge the whole thing with regular expressions. What tokenizer approach and/or software should I use?
+2
A:
I would use the DOMDocument::loadHTML method to load the HTML document. If you want simpler handling than the DOMDocument methods offer, you can convert the result to a SimpleXML object with simplexml_import_dom().
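A minimal sketch of what that could look like. The $html string and the variable names are made up for illustration, and suppressing libxml warnings is just one way to deal with sloppy markup, not something you have to do:

```php
<?php
// Hypothetical input: a fragment of user-submitted HTML.
$html = '<div><p class="intro">Hello <b>world</b></p><a href="/about">About</a></div>';

// libxml emits warnings for malformed markup; keep them internal
// instead of printing them, then discard them after parsing.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

// Plain DOM API: collect the href of every link.
$hrefs = [];
foreach ($doc->getElementsByTagName('a') as $link) {
    $hrefs[] = $link->getAttribute('href');
}

// SimpleXML view of the same document, queried with XPath.
$xml = simplexml_import_dom($doc);
$paragraphs = $xml->xpath('//p[@class="intro"]');
$introClass = (string) $paragraphs[0]['class'];
```

Note that loadHTML wraps a fragment in html and body elements, so queries should not assume your fragment's outermost tag is the document root.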
Gumbo
2009-04-09 08:29:22