Hello, I'm using AntiSamy with the available antisamy-1.4.1.xml policy. The policy works well and blocks most XSS attacks, but the input below is not being blocked. Any suggestions on how to block it to prevent XSS attacks?

1234%27%2Balert%2873918%29%2B%27

Thanks

+3  A: 

Antisamy is an HTML content filter meant for allowing an untrusted user to input a limited subset of ‘safe’ HTML. It is not an all-purpose input filter that can save you from having to think about string escaping and XSS issues.

You should use antisamy only to clean up content that contains HTML you wish to output verbatim on a page. Most user input is not HTML: when a user types a<b or c>d, they should usually get the literal less-than and greater-than characters and not a bold tag. To ensure this happens correctly, you must HTML-escape all text content that gets inserted into your page at the output stage; this has nothing to do with antisamy.
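As a minimal sketch of escaping at the output stage, the helper below replaces the characters that are special in HTML text and attribute values. The name `escapeHtml` is hypothetical, and JavaScript is used here only for illustration; in practice you would normally call whatever HTML encoder your server-side platform already provides.

```javascript
// Hypothetical helper: HTML-escape text just before it is written into a page.
// The ampersand is replaced first so that the entities produced by the later
// replacements are not escaped a second time.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// The user typed literal comparison operators, not markup.
console.log(escapeHtml('a<b or c>d')); // a&lt;b or c&gt;d
```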

1234%27%2Balert%2873918%29%2B%27

This looks nothing like a typical HTML injection attack. The only ‘special’ character it contains is an apostrophe, which isn't usually special in HTML, and can't practically be filtered out of input because users do generally need to use apostrophes for writing in English.

If this is causing script injection for your application, you've got bigger problems than anything antisamy can solve. If this is causing your page to pop up an alert() dialogue, you are probably using the value unescaped in a JavaScript string literal, for example something like:

<a href="..." onclick="callfunc('hello <%= somevar %>');">
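To see why the reported string is dangerous in this position, the sketch below decodes it and splices it into the handler the way an unescaped template would. The function name `renderOnclick` is hypothetical and simply stands in for the `<%= somevar %>` substitution.

```javascript
// The URL-encoded value from the question, decoded as the server would see it.
const payload = decodeURIComponent('1234%27%2Balert%2873918%29%2B%27');
console.log(payload); // 1234'+alert(73918)+'

// Hypothetical stand-in for the template: inserts the value with no escaping.
function renderOnclick(somevar) {
  return "callfunc('hello " + somevar + "');";
}

// The apostrophes close and reopen the string literal, so alert(73918)
// becomes part of the expression instead of part of the text.
console.log(renderOnclick(payload));
// callfunc('hello 1234'+alert(73918)+'');
```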

Putting text content into JavaScript code as a string literal requires another form of escaping; one that turns the ' character (the %27 in the URL-encoded input) into a backslash-escaped \', and \ itself into \\ (as well as a few other replacements).
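The replacements just described might look roughly like this. It is only a sketch: the name `escapeJsSingleQuoted` is hypothetical, and it covers the backslash, the apostrophe, and line terminators rather than every case a production encoder would handle. It also deals with the JavaScript layer only, not with any HTML the script is embedded in.

```javascript
// Hypothetical helper: escape text for use inside a single-quoted JavaScript
// string literal. The backslash is handled first so that the backslashes
// added by the later replacements are not doubled.
function escapeJsSingleQuoted(text) {
  return String(text)
    .replace(/\\/g, '\\\\')
    .replace(/'/g, "\\'")
    .replace(/\n/g, '\\n')
    .replace(/\r/g, '\\r')
    .replace(/\u2028/g, '\\u2028')
    .replace(/\u2029/g, '\\u2029');
}

console.log(escapeJsSingleQuoted("1234'+alert(73918)+'"));
// 1234\'+alert(73918)+\'
```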

The easy way to get values (strings or otherwise) from a server-side scripting language into a JavaScript literal is to use a standard JSON encoder.
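In JavaScript itself, the built-in `JSON.stringify` is one such encoder, so it can illustrate the idea; the sketch assumes your server-side language offers an equivalent. The helper name `renderCall` is hypothetical. Note that the encoder emits a double-quoted literal, so the apostrophes in the reported input no longer end the string, while double quotes and backslashes are backslash-escaped.

```javascript
// Hypothetical helper: build the JavaScript call with the value encoded as a
// JSON string literal instead of being pasted between hand-written quotes.
function renderCall(somevar) {
  return 'callfunc(' + JSON.stringify('hello ' + somevar) + ');';
}

console.log(renderCall("1234'+alert(73918)+'"));
// callfunc("hello 1234'+alert(73918)+'");

// Characters that matter inside a double-quoted literal are escaped.
console.log(JSON.stringify('say "hi" \\ bye'));
// "say \"hi\" \\ bye"
```

This output is safe as standalone JavaScript only; embedding it in HTML needs the further encoding the answer goes on to describe.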

However, in the above case, the JavaScript string literal is itself contained inside an HTML attribute, so you would have to HTML-encode the results of the JSON encoder. This is a bit ugly; it's best to avoid inline event handler attributes. Use external scripts and <script> elements instead, binding events from JS instead of HTML.
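A sketch of binding the event from script might look like the following. It assumes the link carries its value in a `data-name` attribute (a hypothetical choice, HTML-escaped by the server like any other attribute value) and that `callfunc` is the page's existing function from the example above; `bindGreeting` and the `greet` id are likewise made up for illustration.

```javascript
// Hypothetical markup, with the value HTML-escaped by the server:
//   <a href="..." id="greet" data-name="1234&#39;+alert(73918)+&#39;">

// Attach the click handler from script instead of an onclick attribute.
// The value is read from the DOM as plain text, so it is never parsed as code.
function bindGreeting(link, callfunc) {
  link.addEventListener('click', function () {
    callfunc('hello ' + link.getAttribute('data-name'));
  });
}

// In a page (external script, run after the element exists):
//   bindGreeting(document.getElementById('greet'), callfunc);
```

With this arrangement only one layer of encoding (HTML) is involved, because the user's text never appears inside JavaScript source.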

Even in a <script> block, where you don't generally need to HTML-encode, you have to beware of a string </script> (or, generally, anything beginning </, which can end the block). To avoid that sequence you should replace the < character with something else, eg. \x3C. Some JSON encoders may have an option to do this for you to save the trouble.
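As a rough illustration, the replacement can be applied to the encoder's output after the fact. The helper name `toScriptSafeLiteral` is hypothetical, and the sketch assumes a JSON-serializable value. Because JSON syntax itself never uses `<`, every `<` in the output must come from inside a string, where an escape sequence is allowed.

```javascript
// Hypothetical helper: JSON-encode a value for embedding in a <script> block,
// then replace every "<" so that "</script>" can never appear in the output.
// \x3C is valid inside a JavaScript string literal; use \u003C instead if the
// output must also remain valid JSON.
function toScriptSafeLiteral(value) {
  return JSON.stringify(value).replace(/</g, '\\x3C');
}

console.log(toScriptSafeLiteral('</script><script>alert(1)</script>'));
// "\x3C/script>\x3Cscript>alert(1)\x3C/script>"
```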

There are many other places where inserting content into a containing language requires special sorts of encoding. Each has its own rules. You can't avoid the difficulty of string encoding by using a general-purpose input filter. Some “anti-XSS” filters try, but they invariably fail miserably.

bobince
Nice answer. Particularly liked "If this is causing script injection for your application, you've got much bigger problems than anything antisamy can solve." Big +1.
spender
My web app has a global search that posts to search.cfm?q=SEARCHTERM... A user can post the string above, which allows JavaScript to trigger... A bad person could create a malicious URL that, when clicked on a client machine, would run JavaScript and allow the malicious person to steal all the client's cookies etc... Any ideas on how to prevent this? I thought AntiSamy was the way to go, but given it's not HTML I now see that's not what AntiSamy is for.
AnApprentice
Each context into which you can insert text requires a particular form of escaping. To prevent injection attacks, use the correct form of encoding for that context. For including text in HTML content (the most common case), use HTML-encoding. (There isn't a built-in HTML encoder in classic JSP, but in anything remotely modern, you have the JSTL/EL features like `<c:out>` which will do it.) For including text in a JavaScript string literal, use a JSON encoder such as json.org's, or avoid the issue by putting the text in HTML (e.g. a hidden input's `value`) and reading it from static script.
bobince