I'm currently using innerHTML to retrieve the contents of an HTML element and I've discovered that in some browsers it doesn't return exactly what is in the source.

For example, using innerHTML in Firefox on the following line:

<div id="test"><strong>Bold text</strong></strong></div>

Will return:

<strong>Bold text</strong>

In IE, it returns the original string, with two closing strong tags. I'm assuming in most cases it's not a problem (and may be a benefit) that Firefox cleans up the incorrect code. However, for what I'm trying to accomplish, I need the exact code as it appears in the original HTML source.

Is this at all possible? Is there another JavaScript function I can use?

A: 

innerText? Or does that have the same effect?

PurplePilot
innerText only supports plain text, so it gives the opposite of the desired result.
Veger
innerText works only in IE, not in Firefox or other browsers, so it's kind of useless on its own.
Andris
A: 

You must use innerXML property. It does exactly what you want to achieve.

alemjerus
+3  A: 

I don't think you can get the incorrect HTML code back in modern browsers. And that's the right behaviour, because there is no source for dynamically generated HTML. For example, Firefox's innerHTML returns part of the DOM tree serialised as a string, not the HTML source. This is not a problem, because the second </strong> tag is ignored by the browser anyway.

Ivan Nevostruev
Thanks for the explanation. Looks like there's no way to get exactly what I'll need. Time to go into hackland.
Ian Silber
+2  A: 

innerHTML is generated not from the actual source of the document (i.e., the HTML file) but from the DOM that the browser builds from it. So if IE somehow shows you the incorrect HTML code, it's probably some kind of bug. There is no method that retrieves the invalid HTML code in every browser.
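
A rough way to see this for yourself, assuming a browser console: parse a string into a detached element and read it back. The helper name roundTrip is made up for illustration. What comes out is the browser's serialisation of the DOM it built, not the string that went in.

```javascript
// roundTrip is a hypothetical helper, not a built-in.
// `doc` defaults to the page's document; outside a browser it is undefined,
// so a document-like object has to be passed in explicitly there.
function roundTrip(markup, doc = globalThis.document) {
  const container = doc.createElement('div'); // detached, never rendered
  container.innerHTML = markup;               // parse: markup -> DOM nodes
  return container.innerHTML;                 // serialise: DOM nodes -> string
}

// Only runs where a DOM exists (e.g. a browser console).
if (typeof document !== 'undefined') {
  const source = '<strong>Bold text</strong></strong>';
  console.log(roundTrip(source) === source);
}
```

Whether that comparison logs true or false depends on how the browser's parser handled the stray end tag, which is exactly why the result can't be relied on.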

Andris
+1  A: 

You can't in general get the original invalid HTML for the reasons Ivan and Andris said.

IE is also “fixing” your code just like Firefox does, albeit in a way you don't notice on serialisation, by creating an Element node with the tagName /strong to correspond to the bogus end-tag. There is no guarantee at all that IE will happen to preserve other invalid markup structures through a parse/serialise cycle.

In fact, even for valid code the output of innerHTML won't be exactly the same as the input. Attribute order isn't maintained, tagName case isn't maintained (IE gives you <STRONG>), whitespace in various places is lost, entity references aren't maintained, and so on. If you “need the exact code”, you will have to keep a copy of the exact code, for example in a JavaScript variable in a <script> block written after the content in question.
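
A minimal sketch of the "keep a copy" idea, assuming whatever generates the page also emits the original markup as string literals. The names originalSource, rememberSource and getOriginalSource are invented for this example.

```javascript
// Hypothetical registry: element id -> markup exactly as written in the source.
// The strings have to come from whatever produced the page, because the DOM
// cannot give them back.
const originalSource = new Map();

function rememberSource(id, markup) {
  originalSource.set(id, markup);
}

function getOriginalSource(id) {
  if (!originalSource.has(id)) {
    throw new Error('No original source recorded for #' + id);
  }
  return originalSource.get(id);
}

// Emitted alongside <div id="test">...</div> by the page generator.
rememberSource('test', '<strong>Bold text</strong></strong>');

console.log(getOriginalSource('test'));
// prints "<strong>Bold text</strong></strong>"
```

One thing to watch if the strings live in an inline <script> block: a literal </script> inside a string would end the block early, so it needs to be written as <\/script>.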

bobince
A: 

If you don't need the HTML to render (e.g., you're going to use it as a JS template or something), you can put it in a textarea and retrieve the contents from its value. (In current browsers, innerHTML on a textarea returns the text with < and > escaped.)

<textarea id="myTemplate"><div id="test"><strong>Bold text</strong></strong></div></textarea>

And then:

$('#myTemplate').val() === '<div id="test"><strong>Bold text</strong></strong></div>'

Other than that, the browser gets to decide how to interpret the HTML, and it will only return you its interpretation, not the original.
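
One caveat with the textarea trick, as far as I understand the parsing rules: character references inside a textarea are still decoded, and a literal </textarea> in the content ends the element. So if the markup is generated, it is safer to escape & and < before writing it into the page; the textarea's value then gives back the original string. A sketch, with escapeForTextarea being a made-up helper name:

```javascript
// Hypothetical helper: escape markup so it survives being placed between
// <textarea> and </textarea> in an HTML page.
// "&" must be replaced first, otherwise the "&" of "&lt;" would be re-escaped.
function escapeForTextarea(source) {
  return source
    .replace(/&/g, '&amp;')  // keep entity references such as &amp; literal
    .replace(/</g, '&lt;');  // stop "</textarea>" from closing the element
}

const template = '<p>Tom &amp; Jerry</p></textarea>';
console.log(escapeForTextarea(template));
// prints "&lt;p>Tom &amp;amp; Jerry&lt;/p>&lt;/textarea>"
```

This would typically run wherever the page is generated (server or build step), not in the page itself.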

noah