When you get the innerHTML of a DOM node in IE, if there are no spaces in an attribute value, IE will remove the quotes around it, as demonstrated below:

<html>
    <head>
        <title></title>
    </head>
    <body>
        <div id="div1"><div id="div2"></div></div>
        <script type="text/javascript">
            alert(document.getElementById("div1").innerHTML);
        </script>
    </body>
</html>

In IE, the alert will read:

<DIV id=div2></DIV>

This is a problem, because I am passing this on to a processor that requires valid XHTML, and all attribute values must be quoted. Does anyone know of an easy way to work around this behavior in IE?

+2  A: 

I ran into this exact same problem just over a year ago, and solved it using InnerXHTML, a custom script written by someone far smarter than I am. It's basically a custom version of innerHTML that returns standard markup.

Scottie
Yeah, I've looked at this library, but I would rather not use it because it is licensed under the Creative Commons Attribution-Share Alike 3.0 License.
Augustus
Fair enough. My employer at the time I used it was fine with it, but I understand that's not always the case.
Scottie
A: 

Did you try it with jQuery?

alert($('#div1').html());
Houssem
Yup. jQuery uses innerHTML to return its value, and they haven't fixed this problem yet. I took jQuery out just to make it clearer where the problem lies.
Augustus
+9  A: 

IE's innerHTML is very annoying indeed. I wrote this function for it, which may be helpful. It quotes attribute values and converts tag names to lowercase. By the way, to make it even more annoying, IE's innerHTML doesn't remove quotes from non-standard attributes.

Edit based on comments: the function now processes more characters in attribute values and optionally converts attribute values to lowercase. The function looks even uglier now ;~). If you want to add or remove characters, edit the [a-zA-Z\.\:\[\]_\(\)\&\$\%#\@\!0-9]+[?\s+|?>] part of the regular expressions.

function ieInnerHTML(obj, convertToLowerCase) {
 var zz = obj.innerHTML
     ,z = zz.match(/<\/?\w+((\s+\w+(\s*=\s*(?:".*?"|'.*?'|[^'">\s]+))?)+\s*|\s*)\/?>/g);

  if (z){
    for (var i=0;i<z.length;i++){
      var y
          , zSaved = z[i]
          , attrRE = /\=[a-zA-Z\.\:\[\]_\(\)\&\$\%#\@\!0-9]+[?\s+|?>]/g;
      z[i] = z[i]
              .replace(/(<?\w+)|(<\/?\w+)\s/,function(a){return a.toLowerCase();});
      y = z[i].match(attrRE); // unquoted attribute values in this tag

       if (y){
        var j = 0
            , len = y.length
        while(j<len){
          var replaceRE = /(\=)([a-zA-Z\.\:\[\]_\(\)\&\$\%#\@\!0-9]+)?([\s+|?>])/g
              , replacer = function(){
                  var args = Array.prototype.slice.call(arguments);
                  return '="'+(convertToLowerCase ? args[2].toLowerCase() : args[2])+'"'+args[3];
                };
          z[i] = z[i].replace(y[j],y[j].replace(replaceRE,replacer));
          j++;
        }
       }
       zz = zz.replace(zSaved,z[i]);
     }
   }
  return zz;
 }

Example key-value pairs that should work

data-mydata=return[somevalue] => data-mydata="return[somevalue]"
id=DEBUGGED:true => id="DEBUGGED:true" (or id="debugged:true" if you use the convertToLowerCase parameter)
someAttribute=Any.Thing.Goes => someAttribute="Any.Thing.Goes"
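
The pairs above can be checked without IE by running only the value-quoting step on a plain string. This is a minimal sketch, not the function above: quoteBareValues is a hypothetical helper name, and it assumes the input is a single tag in which each unquoted value is followed by whitespace or >.

```javascript
// Hypothetical helper (not part of the answer's function): quotes unquoted
// attribute values in one tag string, using the same character set as the
// regular expressions in ieInnerHTML.
function quoteBareValues(tag, convertToLowerCase) {
  // "=" followed by a run of allowed characters, then whitespace or ">".
  var bareValue = /=([a-zA-Z.:\[\]_()&$%#@!0-9]+)(\s|>)/g;
  return tag.replace(bareValue, function (match, value, terminator) {
    var out = convertToLowerCase ? value.toLowerCase() : value;
    return '="' + out + '"' + terminator;
  });
}

var quoted = quoteBareValues('<div data-mydata=return[somevalue]>');
var lowered = quoteBareValues('<div id=DEBUGGED:true class=x>', true);
```

Values that are already quoted are left alone, because a quote character is not in the allowed set. A value followed directly by /> is not matched by this sketch.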
KooiInc
I haven't tested this, but I'm going to select it as the accepted answer anyway, because it comes closest to being a self-contained solution to the question.
Augustus
OK, thanks. In my tests it worked, but I didn't test it thoroughly. Out of curiosity: who thinks this answer deserves a -1 score, and why?
KooiInc
I remember working on a huge legacy application whose logic all relied on innerHTML. One problem we ran into was that innerHTML is read-only on a tbody element (as it is on some other elements).
GmonC
+1 Nicely done. What would be *even nicer* is lowercasing style attributes as well, such as `style="DISPLAY: none"`.
Crescent Fresh
Wow. IE is a really, really stupid browser.
leeand00
Thanks for this, KooiInc. IE is rubbish!!
alpha_juno
Doesn't work with hyphens in the attribute name. Can you fix?
mofle
@mofle and @Crescent Fresh: see the modifications I posted
KooiInc
By the way, although anything goes (see examples), as far as I know not all characters are allowed in attribute values for all attributes, so if it's valid XHTML you're after, you should be careful using these values.
KooiInc
Thanks! :) Can you add support for hyphens in the attribute value too? I have a test case here: http://wroug.com/bugs/ieinnerhtml/test.html Also, it would be useful if, when the attribute name is "style" and convertToLowerCase is true, the whole value were lowercased.
mofle
@mofle: try modifying the regex (add \- between []). As far as the style attribute is concerned: that's beyond the goal of this function. The goal is to convert the unquoted attribute values IE returns from innerHTML to quoted attribute values. Style is untouched by IE's innerHTML, sorry.
KooiInc
I assume you mean like this: zz.match(/<\/?\w+((\s+\w+(\s*=\s*(?:".*?"|'.*?'|[\-^'">\s]+))?)+\s*|\s*)\/?>/g); but it doesn't work. Any help? :) About the style: no worries, but it's strange. I have my style values in lowercase, but IE outputs them in uppercase. Anyway, that's an easy fix I can do myself.
mofle
KooiInc
Aah, right. Thanks :D Sorry for the bother. Maybe you should update the function with this. Could be useful for other people too.
mofle
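
For the hyphen case discussed in the comments above, the \- most plausibly belongs in the character class of the attribute-value expressions (attrRE and replaceRE), not in the tag-matching expression. In the attempt quoted above, putting \- in front of ^ stops the class from being negated, so it then matches only the listed characters. The following is a rough sketch of the value-quoting step with a hyphen allowed. quoteBareValuesWithHyphen is a hypothetical name, and the sketch assumes a single tag string whose unquoted values end at whitespace or >.

```javascript
// Hypothetical variant of the value-quoting step: same character set as the
// answer's attrRE/replaceRE, with "\-" added at the end of the class.
function quoteBareValuesWithHyphen(tag) {
  var bareValue = /=([a-zA-Z.:\[\]_()&$%#@!0-9\-]+)(\s|>)/g;
  return tag.replace(bareValue, function (match, value, terminator) {
    return '="' + value + '"' + terminator;
  });
}

var hyphenated = quoteBareValuesWithHyphen('<DIV class=day-month-title>');
```

Putting the hyphen last in the class (or escaping it, as here) keeps it from being read as a range.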
+3  A: 

Ah, the joy of trying to use XHTML in a browser that doesn't support it.

I'd just accept that you are going to get HTML back from the browser and put something in front of your XML processor that can input tag soup and output XHTML — HTML Tidy for example.

David Dorward
This is actually the solution I ended up going with. I opted to use the lxml Python library.
Augustus
+2  A: 

I've tested this, and it works for most attributes, except those that are hyphenated, such as class=day-month-title. It ignores those attributes, and does not quote them.

Also, I had to change the regex to /<\/?\w+((\s+\w+(\s*=\s*(?:".*?"|'.*?'|[^\'\">\s]+))?)+\s*|\s*)\/?>/g before it would work.
A: 

No, jQuery has not fixed this. For example, take this HTML:

<img src="[THUMB]" border="0" alt="[EPIGRAFE]" />

Running alert($('#test').html()); in IE returns: <img src="[THUMB]" border=0 alt=[EPIGRAFE] / >. The alt attribute, for example, comes back unquoted.

The ieInnerHTML function doesn't work for the [VALUES] in this example either. Is there a possible solution?

Nicolaspar