I have a JavaScript string, coming from an untrusted source, embedded inside an onclick attribute, and I'm not sure what the correct way of encoding this string is. Here is a simplified version of the HTML:

<input type="button" onclick="alert([ENCODED STRING HERE]);"
    value="Click me" />

I use the Microsoft AntiXss library, which contains several encoding methods. The text is embedded in an HTML/XML attribute, so XML attribute encoding with the AntiXss.XmlAttributeEncode method seems appropriate. However, it is also a piece of JavaScript, so JavaScript encoding with the AntiXss.JavaScriptEncode method seems appropriate too.

Which one should I choose so that I don't open a security hole while still displaying the text correctly?


UPDATE: The workaround I currently use is to apply XmlAttributeEncode to this text and put the result in a custom attribute on the tag. I then use some JavaScript to read it back from that attribute. It basically looks like this:

<input type="button" onclick="alert(this.getAttribute('comment'));"
    value="Click me" comment="[XML ATTRIBUTE ENCODED TEXT HERE]" />

While this works perfectly and solves the problem, I'm still very curious about how to correctly encode JavaScript inside an XML attribute.

+2  A: 

Install the onclick handler in a separate <script> tag.

<input type="button" id="clickMeButton" value="Click me" />

...

<script type="text/javascript">
...
document.getElementById('clickMeButton').onclick = function () {
   alert([ENCODED STRING HERE using AntiXss.JavaScriptEncode]);
}
...
</script>
KennyTM
+2  A: 

Maybe you should try Base64 encoding. The encoded string won't contain any characters that are invalid in your HTML (as long as you place it in single quotes), and you can decode it with JavaScript.
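
A minimal sketch of that round trip, to make the idea concrete. The helper names toBase64 and fromBase64 are made up for illustration. The encoding side assumes Node.js (Buffer) purely to keep the example in one language; in the question's .NET setting, the server would produce the Base64 text instead. The decoding side uses atob and TextDecoder, which browsers provide.

```javascript
// Encoding side. Assumption: Node.js, where Buffer is available.
// The output only uses the characters A-Z, a-z, 0-9, '+', '/' and '='.
function toBase64(text) {
  return Buffer.from(text, 'utf8').toString('base64');
}

// Decoding side (browser). atob returns one character per byte,
// so the bytes are collected and then decoded as UTF-8.
function fromBase64(b64) {
  const bytes = Uint8Array.from(atob(b64), (c) => c.charCodeAt(0));
  return new TextDecoder('utf-8').decode(bytes);
}

const original = "Hello <unsafe> ');alert('xss');alert('";
const encodedForAttribute = toBase64(original);
const roundTripped = fromBase64(encodedForAttribute);
console.log(roundTripped === original);
```

The handler would then look something like onclick="alert(fromBase64('...'));", with the Base64 text between the single quotes.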

Zsolti
+2  A: 

The correct answer is to double encode the text: first with JavaScriptEncode and then with XmlAttributeEncode. The rationale is that everything within an XML/HTML attribute should be XML attribute encoded. The browser's parser interprets the value as an attribute and decodes it accordingly. It then hands the decoded text to the JavaScript interpreter, so the text must also be properly JavaScript encoded to prevent a security hole.

This double encoding will not produce incorrect results, because the browser also decodes the text twice (two separate interpreters are involved). Here is an example of the correct encoding:

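// The untrusted input, including an attempted script injection.
string unsafeText = "Hello <unsafe> ');alert('xss');alert('";
// Inner layer: JavaScript encoding (false = do not add surrounding quotes).
string javaScriptEncoded = AntiXss.JavaScriptEncode(unsafeText, false);
// Outer layer: XML attribute encoding of the JavaScript-encoded text.
string ENCODED_STRING = AntiXss.XmlAttributeEncode(javaScriptEncoded);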

<input type="button" onclick="alert('[ENCODED_STRING]');"
    value="Click me" />
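
To see why the two layers undo each other, here is a rough JavaScript simulation of both the encoding and what the browser does with it. The functions jsStringEncode, attrEncode and attrDecode are simplified, hypothetical stand-ins written for this sketch; they are not the AntiXss implementations, and the whitelist they use is only an approximation.

```javascript
// Stand-in JavaScript encoder: anything outside a small whitelist
// becomes \xHH (code unit < 0x100) or \uHHHH.
function jsStringEncode(text) {
  let out = '';
  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    const code = text.charCodeAt(i);
    if (/[A-Za-z0-9 .,_-]/.test(ch)) {
      out += ch;
    } else if (code < 0x100) {
      out += '\\x' + code.toString(16).padStart(2, '0');
    } else {
      out += '\\u' + code.toString(16).padStart(4, '0');
    }
  }
  return out;
}

// Stand-in attribute encoder: same whitelist minus the space,
// everything else becomes a decimal character reference.
function attrEncode(text) {
  let out = '';
  for (const ch of text) {
    out += /[A-Za-z0-9.,_-]/.test(ch) ? ch : '&#' + ch.codePointAt(0) + ';';
  }
  return out;
}

// Step 1 in the browser: the HTML parser decodes character references.
function attrDecode(text) {
  return text.replace(/&#(\d+);/g, (_, n) => String.fromCodePoint(Number(n)));
}

const unsafeText = "Hello <unsafe> ');alert('xss');alert('";
const encoded = attrEncode(jsStringEncode(unsafeText));

const afterHtmlParser = attrDecode(encoded);
// Step 2 in the browser: the JavaScript engine evaluates the string literal.
const shown = new Function("return '" + afterHtmlParser + "';")();

console.log(shown === unsafeText);
```

The injected quotes and parentheses never reach the JavaScript engine as syntax; they only reappear as data inside the string literal.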

While double encoding is the only correct way to do this, I'd like to note that using only JavaScript encoding will usually yield a correct result, provided that the attribute's value is wrapped in quotes.

JavaScript encoding uses the same whitelist (except for the space character) as HTML/XML attribute encoding. The difference between them is how unsafe characters are encoded. JavaScript encoding writes them as \xXX and \uXXXX (such as \u01A3), while XML attribute encoding writes them as numeric character references of the form &#NN; (such as &#419; for that same character). After JavaScript encoding, only two characters remain that the XML attribute encoder will encode again: the space character and the backslash character. Those two characters only cause a problem when the attribute's value isn't wrapped in quotes.
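
As a small illustration of the two escape forms, the following sketch prints both for a given character. The function names jsEscapeForm and xmlEscapeForm are invented for this example and only mimic the output shapes described above; the decimal form of the character reference is an assumption about the encoder's output.

```javascript
// JavaScript form: \xHH for code units below 0x100, otherwise \uHHHH.
function jsEscapeForm(ch) {
  const code = ch.charCodeAt(0);
  const hex = code.toString(16).toUpperCase();
  return code < 0x100 ? '\\x' + hex.padStart(2, '0') : '\\u' + hex.padStart(4, '0');
}

// XML/HTML form: decimal numeric character reference.
function xmlEscapeForm(ch) {
  return '&#' + ch.codePointAt(0) + ';';
}

console.log(jsEscapeForm('<'), xmlEscapeForm('<'));           // \x3C &#60;
console.log(jsEscapeForm('\u01A3'), xmlEscapeForm('\u01A3')); // \u01A3 &#419;
```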

Note, however, that using only XML attribute encoding in this scenario will NOT yield a correct result.
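
The reason can be sketched as follows: the HTML parser removes the attribute encoding before the JavaScript engine ever sees the handler, so the quote characters come back as real syntax. The helpers attrEncode and attrDecode are simplified, hypothetical stand-ins for the encoder and for the browser's attribute parsing, not the actual AntiXss or browser code.

```javascript
// Stand-in attribute encoder: non-whitelisted characters become &#NN;.
function attrEncode(text) {
  let out = '';
  for (const ch of text) {
    out += /[A-Za-z0-9.,_-]/.test(ch) ? ch : '&#' + ch.codePointAt(0) + ';';
  }
  return out;
}

// Stand-in for the HTML parser decoding the attribute value.
function attrDecode(text) {
  return text.replace(/&#(\d+);/g, (_, n) => String.fromCodePoint(Number(n)));
}

const unsafe = "');alert('xss');alert('";

// What the server writes into onclick="..." with attribute encoding only.
const attributeValue = "alert('" + attrEncode(unsafe) + "');";

// What the JavaScript engine receives as the handler source.
const handlerSource = attrDecode(attributeValue);
console.log(handlerSource); // alert('');alert('xss');alert('');
```

The markup itself stays well formed, which is why this mistake is easy to miss: the injection only appears after the attribute value has been decoded.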

Steven