I am trying to populate a DOM element with ID 'myElement'. The content which I'm populating is a mix of text and HTML elements.

Assume the following is the content I want to insert into the element:

var x = "<b>Success</b> is a matter of hard work &luck";

I tried using innerHTML as follows:

document.getElementById("myElement").innerHTML=x;

This chopped off the last word of my sentence. The problem appears to be caused by the '&' character in the last word. I experimented with '&' and innerHTML, and these are my observations:

  1. If the last word of the content is shorter than 10 characters and contains a '&' character, innerHTML chops off the sentence at the '&'.
  2. The problem does not happen in Firefox.
  3. If I use innerText, the last word stays intact, but all the HTML tags in the content are rendered as plain text.

I also tried populating the element through jQuery's .html() method:

 $("#myElement").html(x);

This approach solves the problem in IE but not in Chrome.

How can I insert HTML content whose last word contains '&' without it being chopped off, in all browsers?

Update: I tried HTML-encoding the content before inserting it into the DOM. When I encode the content, the HTML tags in it are rendered as plain text.

For the content above, I expect the result to be rendered as:

Success is a matter of hard work &luck

but when I encode it, what I actually get in the rendered page is:

<b>Success</b> is a matter of hard work &luck

A: 

Try using this instead:

var x = "<b>Success</b> is a matter of hard work &amp;luck";

By HTML encoding the ampersand, you are ensuring that there is no ambiguity in what you mean when you write "&luck".

Sam152
+4  A: 

You should replace your & with &amp;.

The & (ampersand) character is used in HTML to introduce character references for special characters. For example, &quot; stands for " and &lt; stands for <. Now, &luck is clearly not a valid HTML entity (for one thing, it is missing the semicolon). However, some browsers may try to parse it as one anyway, because of a combination of error correction (supplying the missing semicolon) and the fact that it looks somewhat like an entity (& followed by four characters).

Since &luck; is not a valid HTML entity, the original text is lost. For this reason, always write &amp; when you want a literal ampersand in your HTML.
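
When the markup itself is trusted and only stray ampersands need fixing, one option is to escape every & that does not begin something shaped like a character reference. The following is a minimal sketch, not a definitive fix: the helper name escapeBareAmpersands is hypothetical (it is not part of any library), and it only checks the shape of a reference, not whether the named entity really exists.

```javascript
// Hypothetical helper: escape ampersands that do not start something
// shaped like a character reference (&name;  &#123;  &#x1F;).
function escapeBareAmpersands(html) {
  return html.replace(
    /&(?!(?:[a-zA-Z][a-zA-Z0-9]*|#[0-9]+|#[xX][0-9a-fA-F]+);)/g,
    "&amp;"
  );
}

var x = "<b>Success</b> is a matter of hard work &luck";
var safe = escapeBareAmpersands(x);
// safe === "<b>Success</b> is a matter of hard work &amp;luck"

// In a browser, the result would then be assigned as before:
// document.getElementById("myElement").innerHTML = safe;
```

References that are already well-formed, such as &amp; or &lt;, are left untouched, so running the helper twice does not double-escape them.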

Update: When this text is entered by a user, it is up to you to escape this character properly. In PHP, for example, you would call htmlentities on the text before displaying it to the user. This has the added benefit of neutralizing malicious user input such as <script> tags, because they are displayed as text instead of being executed.

Aistina
Mandai
@Mandai, in that case you should encode this data for the user. For example, PHP uses a function called `htmlentities`, which you would use before sending the HTML to the browser.
Aistina
@Aistina, JavaScript itself has ways to encode HTML. The problem with encoding, as I mentioned before, is that the HTML tags in my content become plain text. In the example I provided, if I HTML-encode the content, the bold tags are not applied to the text; instead they appear as literal strings in the rendered page.
Mandai
Erm... you're allowing user-submitted content to render as HTML? That's dangerous. Anyway, once you have added text into HTML without escaping it, you have mangled your string and you will never be able to reliably fix it. You need to HTML-escape every piece of text you insert into HTML content at exactly that moment. (Use `htmlspecialchars` in preference to `htmlentities`.)
bobince
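
The advice above (escape each piece of text at the moment it is inserted into markup, and keep the markup itself unescaped) can be sketched in JavaScript as follows. This is only an illustration under stated assumptions: escapeHtml is a hypothetical helper written for this example, and userText stands in for whatever plain text needs to be displayed.

```javascript
// Hypothetical helper: escape the characters that are special in HTML
// text and attribute values. The ampersand must be replaced first so the
// ampersands produced by the later replacements are not escaped again.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Plain text (for example, typed by a user): escaped when it is inserted.
var userText = "is a matter of hard work &luck";

// Markup written by the developer: left as is.
var x = "<b>Success</b> " + escapeHtml(userText);
// x === "<b>Success</b> is a matter of hard work &amp;luck"
```

Only the text portion is escaped, so the bold tags still apply while the ampersand survives in every browser.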
+1  A: 

The ampersand is a special character in HTML that indicates the start of a character entity reference or numeric character reference. You need to escape it like so:

var x = "<b>Success</b> is a matter of hard work &amp;luck";
Simon Lieschke