What is the best way to parse (get a DOM tree of) an HTML result of XMLHttpRequest in Firefox?

EDIT:

I do not have the DOM tree, I want to acquire it.

XMLHttpRequest's responseXML is populated only when the response is actual XML, so I have only responseText to work with.

I first thought the innerHTML hack did not work with a complete HTML document (one wrapped in <html></html>), but it turns out it works fine.

+1  A: 

Look up the responseXML property of the XMLHttpRequest object. Furthermore, if you assign the responseText of an HTML response to an element's innerHTML, the browser parses the text and builds the DOM nodes even if that element has not yet been appended to the document.

Andrew Noyes
A: 

When you say "DOM Tree" you've already accomplished "parsing" the data. DOM is a structure for data. If you figure out what DOM "looks like", you'll understand this.

ryansstack
I can't make any sense of this comment. The OP says he's getting some HTML back from Ajax, and wants to turn it into some DOM. If that's not parsing, I don't know what is.
Colin Fine
+1  A: 

If your data is XHTML, and therefore valid XML, then DOMParser (Mozilla) or loadXML (IE) may help. If not, I can't think of anything better than stripping the outer document tags and then passing the rest to a container element's innerHTML.

See section 21.1.3 of Flanagan's JavaScript: The Definitive Guide (5th edition).

Colin

Colin Fine
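
The "strip the wrapper, then use innerHTML" idea from this answer could be sketched roughly as follows. The helper name extractBodyMarkup is hypothetical (it is not from the thread), and the regular expression is only an approximation that assumes a single, well-formed <body> element; it is not a real HTML parser.

```javascript
// Hypothetical helper (not from the thread): returns the markup inside <body>,
// or the input unchanged when no <body> element is found.
function extractBodyMarkup(html) {
  // [\s\S] matches any character, including newlines; the "i" flag also accepts <BODY>.
  var match = /<body\b[^>]*>([\s\S]*?)<\/body\s*>/i.exec(html);
  return match ? match[1] : html;
}

// In a browser, the result could then be handed to a detached element:
//   var container = document.createElement('div');
//   container.innerHTML = extractBodyMarkup(xhr.responseText);
```

As the question's edit notes, assigning the complete document to innerHTML also turned out to work, so this step may not be needed at all.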
+5  A: 

innerHTML should work just fine, e.g.

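// This would run after the Ajax request completes:
var myHTML = XHR.responseText;
var tempDiv = document.createElement('div');
// Strip <script> blocks before parsing; [\s\S] matches any character, including newlines.
tempDiv.innerHTML = myHTML.replace(/<script[\s\S]*?\/script>/g, '');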

// tempDiv now has a DOM structure:
tempDiv.childNodes;
tempDiv.getElementsByTagName('a'); // etc. etc.
J-P
Looks like it's the best I can do. Thanks for the tip about <script>s.
hmp
If you're worried about <script>s being executed, then you'd also need to worry about other tricks such as SCRIPT being uppercase, or a null byte appearing part way through the word <script> etc. But do you really need to be worried about <script>s being executed?
thomasrutter
According to this page: http://bytes.com/topic/javascript/answers/513633-innerhtml-script-tag - you don't need to worry about script blocks being executed when added via innerHTML: "Script blocks inserted via innerHTML don't get executed in any browser other than NS6" - though that was written in 2006.
thomasrutter
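
For the uppercase concern raised in the comments above, a case-insensitive variant of the script-stripping step might look like the sketch below. The function name stripScriptBlocks is hypothetical. This only removes <script>...</script> blocks; it does not handle inline event-handler attributes or malformed markup, so it should not be treated as a security sanitizer.

```javascript
// Hypothetical helper (not from the thread): removes <script>...</script> blocks,
// matching the tag name in any letter case.
function stripScriptBlocks(html) {
  // \b keeps the pattern from matching tags that merely start with "script";
  // [\s\S]*? lazily consumes the block body, including newlines.
  return html.replace(/<script\b[\s\S]*?<\/script\s*>/gi, '');
}

// Browser usage, mirroring the answer's snippet:
//   tempDiv.innerHTML = stripScriptBlocks(XHR.responseText);
```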
A: 

Question for OP hmp and J-P: is it possible to use Ajax to GET an HTML file? How do you get HTML content in the response? Would it work to rename an HTML file as .txt?

Thanks!

maccamb
J-P's answer tells you how to get HTML content in the response. To GET an HTML file, just use XMLHttpRequest as you normally would (or the Ajax function in your JavaScript framework of choice). Renaming to .txt shouldn't change anything in this case.
hmp
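
A minimal sketch of "use XMLHttpRequest as you normally would" for an HTML file is shown below. The function name fetchHtml, its callback parameters, and the file name page.html are assumptions made for illustration; the optional XhrCtor argument exists only so the function can be exercised outside a browser.

```javascript
// Hypothetical helper (not from the thread): requests `url` and passes the raw
// response text to `onSuccess`. `XhrCtor` defaults to the browser's XMLHttpRequest.
function fetchHtml(url, onSuccess, onError, XhrCtor) {
  var Ctor = XhrCtor || XMLHttpRequest;
  var xhr = new Ctor();
  xhr.open('GET', url, true); // true = asynchronous
  xhr.onreadystatechange = function () {
    if (xhr.readyState !== 4) { return; } // 4 = request finished
    if (xhr.status >= 200 && xhr.status < 300) {
      onSuccess(xhr.responseText); // HTML arrives as plain text
    } else if (onError) {
      onError(xhr.status);
    }
  };
  xhr.send(null);
}

// Browser usage, combined with the innerHTML approach from J-P's answer:
//   fetchHtml('page.html', function (html) {
//     var tempDiv = document.createElement('div');
//     tempDiv.innerHTML = html;
//   });
```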