ansaurus

Question

DOM parser that allows HTML5-style </ in <script> tag

Answer 1

+1 A:

Alan Storm 2010-10-27 01:52:50

Thanks for the pointers. How can I dig down to the contents of the script tag, searching by id?

Adam Backstrom 2010-10-27 02:18:17

It's a standard DOMDocument object. If you're not comfortable with DOMDocument, call the saveXML method (as above) and create a SimpleXML object from the result. If you're not comfortable with SimpleXML, you should [read the manual](http://us.php.net/manual/en/simplexml.examples-basic.php).

Alan Storm 2010-10-27 04:03:06
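
The round trip described in the comment above might look roughly like the following. This is only a sketch: the markup and the id `payload` are made up for illustration, `loadXML` stands in for whatever parser actually produced the DOMDocument, and the XPath expression assumes the document has no default namespace.

```php
<?php
// Hypothetical markup standing in for the parsed page.
$dom = new DOMDocument();
$dom->loadXML('<html><body><script id="payload">var a = 1;</script></body></html>');

// Serialize the DOMDocument and hand the XML string to SimpleXML.
$xml = simplexml_load_string($dom->saveXML());

// Query by attribute value; a successful xpath() call returns an array
// of SimpleXMLElement objects (empty if nothing matched).
$matches = $xml->xpath('//script[@id="payload"]');

// Casting a SimpleXMLElement to string yields its text content.
$contents = $matches ? (string) $matches[0] : null;

echo $contents, PHP_EOL;
```

If the parser emits elements in the XHTML namespace, the query would probably need `registerXPathNamespace` and a prefixed element name instead.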

Added html5lib to [Best Methods to parse HTML](http://stackoverflow.com/questions/3577641/best-methods-to-parse-html/3577662#3577662)

Gordon 2010-10-27 08:06:33

@Alan I hit a wall (well, got mildly annoyed) when I couldn't get `$dom->getElementById()` to work on the resulting DOMDocument. I ended up working around the problem, but I'd be interested to know why it fails and if it can be made to work.

Adam Backstrom 2010-10-27 12:29:36

Because DOMDocument is a confusing pile of over-engineered, poorly documented XML processing? For getElementById to work with DOM documents, you need either a DTD that says which attribute name is an ID, or you need to explicitly mark which attribute on an element is an ID. Whenever I have a DOMDocument, I save out an XML string to feed into SimpleXML, and then use the XPath functions to get at what I want.

Alan Storm 2010-10-27 19:16:02
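
For the second option mentioned in the comment above (explicitly marking an attribute as an ID), one possible approach is `DOMElement::setIdAttribute`. The sketch below uses invented markup and a made-up id `payload`, and loads it with `loadXML` and no DTD; it is an illustration of the idea, not the code from the thread.

```php
<?php
// Hypothetical markup; note there is no DTD declaring "id" as an ID.
$dom = new DOMDocument();
$dom->loadXML('<html><body><script id="payload">var a = 1;</script></body></html>');

// Find every element carrying an "id" attribute and flag that
// attribute as being of type ID.
$xpath = new DOMXPath($dom);
foreach ($xpath->query('//*[@id]') as $element) {
    $element->setIdAttribute('id', true);
}

// With the attribute flagged, the lookup can resolve the element.
$script = $dom->getElementById('payload');

echo $script->textContent, PHP_EOL;
```

If only a single lookup is needed, running the XPath query `//*[@id="payload"]` directly avoids flagging the attributes at all.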

@Gordon, thanks!

Alan Storm 2010-10-27 23:01:12

@Adam More info on why your call wasn't working. Sort of went beyond the 600 character limit :) http://alanstorm.com/domdocument_php_stop

Alan Storm 2010-10-27 23:51:01

@Adam no problem. You might also be interested in my answer to [Simplify PHP DOM XML Parsing](http://stackoverflow.com/questions/3405117/simplify-php-dom-xml-parsing-how/3405651#3405651). Also, the id attributes in the DOM example in your blog post are not unique, so even if they were proper xml:id attributes, the XML wouldn't be valid.

Gordon 2010-10-28 07:24:16


DOMDocument

FluentDOM

phpQuery

html5lib
