How can one allow code snippets to be entered into an editor such as FCKeditor (as Stack Overflow does) while preventing XSS, SQL injection, and related attacks?

+1  A: 

The best thing you can do to prevent SQL injection attacks is to use parameterized queries or stored procedures when making database calls. Normally I would also recommend some basic input sanitization, but since you need to accept code from the user, that might not be an option.
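As a rough illustration of the parameterized-query advice, here is a minimal sketch using Python's built-in sqlite3 module. The table name `snippets`, the helper `store_snippet`, and the sample input are all made up for this example; other databases and drivers use different placeholder syntax.

```python
import sqlite3

def store_snippet(conn: sqlite3.Connection, body: str) -> int:
    """Insert a snippet through a bound parameter and return the new row id."""
    # The "?" placeholder keeps the value out of the SQL text entirely,
    # so quotes and comment markers in the snippet are stored as plain data.
    cur = conn.execute("INSERT INTO snippets (body) VALUES (?)", (body,))
    conn.commit()
    return cur.lastrowid

# Hypothetical schema, in-memory so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE snippets (id INTEGER PRIMARY KEY, body TEXT NOT NULL)")

hostile = "#include <stdio.h>//'); DROP TABLE snippets;--"
row_id = store_snippet(conn, hostile)
stored = conn.execute(
    "SELECT body FROM snippets WHERE id = ?", (row_id,)
).fetchone()[0]
```

The snippet comes back byte for byte as it was submitted, and the table is still there afterwards.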

On the other end (when rendering the user's input to the browser), HTML-encoding the data causes any malicious JavaScript or similar markup to be displayed as literal text rather than executed in the client's browser. Any decent web application framework should provide this capability.
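A small sketch of that output-side encoding, using Python's standard `html.escape`. The `render_snippet` helper and the `<pre><code>` wrapper are assumptions made for the example, not something any particular framework requires.

```python
import html

def render_snippet(raw: str) -> str:
    """Return an HTML fragment that displays raw as literal text."""
    # quote=True also encodes " and ', which matters if the value is
    # ever placed inside an attribute.
    return "<pre><code>" + html.escape(raw, quote=True) + "</code></pre>"

page_fragment = render_snippet('<script>alert("xss")</script>')
```

The browser then shows the script tag as text instead of running it.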

Matt Peterson
A: 

I'd say one could replace every < with &lt;, and so on (using htmlentities in PHP, for example), and then re-enable the safe tags with some sort of whitelist. The problem is that the whitelist may be a little too strict.

Here is a PHP example:

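$code = getTheCodeSnippet(); // placeholder for however the raw user input is obtained
$code = htmlentities($code, ENT_QUOTES, 'UTF-8'); // encode <, >, & and both quote styles
$code = str_ireplace("&lt;br&gt;", "<br>", $code); // example: re-enable whitelisted <br> tags
// Regular expressions could also be used for these tags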
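The regular-expression variant mentioned in the last comment might look roughly like the following Python sketch. The tag list and the function name `escape_with_whitelist` are hypothetical. Only bare tags without attributes are re-enabled, and nothing here checks that tags are balanced.

```python
import html
import re

# Hypothetical whitelist of attribute-free tags to re-enable after escaping.
ALLOWED_TAGS = ("br", "b", "i", "code", "pre")

# Matches an escaped opening or closing tag such as &lt;b&gt; or &lt;/b&gt;.
_TAG_RE = re.compile(r"&lt;(/?)(%s)&gt;" % "|".join(ALLOWED_TAGS), re.IGNORECASE)

def escape_with_whitelist(raw: str) -> str:
    """Escape everything, then turn whitelisted bare tags back into markup."""
    escaped = html.escape(raw, quote=True)
    return _TAG_RE.sub(
        lambda m: "<%s%s>" % (m.group(1), m.group(2).lower()), escaped
    )

result = escape_with_whitelist("<b>hi</b><script>x</script><BR>")
```

Because the pattern requires `&gt;` immediately after the tag name, a tag that carries attributes (for example an `onclick` handler) stays escaped.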

To prevent SQL injection, you could replace every ' and \ character with an inoffensive equivalent, such as \' and \\, so that the following C line

#include <stdio.h>//'); Some SQL command--

would not have any harmful effect on the database.

luiscubal
+2  A: 

The same rules apply for protection: filter input, escape output.

In the case of input containing code, filtering just means checking that the string contains only printable characters, and perhaps that it stays within a length limit.
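A minimal sketch of such a filter in Python follows. The limit of 10,000 characters and the decision to allow newlines and tabs are assumptions for the example; an actual site would pick its own rules.

```python
MAX_SNIPPET_CHARS = 10_000  # hypothetical limit, chosen only for this example

def is_acceptable_snippet(text: str, max_chars: int = MAX_SNIPPET_CHARS) -> bool:
    """Accept non-empty text made of printable characters plus line breaks and tabs."""
    if not text or len(text) > max_chars:
        return False
    # str.isprintable() rejects control characters such as NUL; newlines,
    # carriage returns, and tabs are allowed explicitly because code needs them.
    return all(ch.isprintable() or ch in "\n\r\t" for ch in text)

ok = is_acceptable_snippet("int main(void) {\n\treturn 0;\n}")
```

Note that this check says nothing about quotes or angle brackets; those are handled when the text is stored and when it is displayed.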

When storing text into the database, either use query parameters, or else escape the string to ensure you don't have characters that create SQL injection vulnerabilities. Code may contain more symbols and non-alpha characters, but the ones you have to watch out for with respect to SQL injection are the same as for normal text.

Don't try to duplicate the correct escaping function. Most database libraries already contain a function that escapes every character that needs it (the exact set may be database-specific). It should also handle special issues with character sets. Just use the function provided by your library.

I don't understand why people say "use stored procedures!" Stored procs give no special protection against SQL injection. If you interpolate unescaped values into SQL strings and execute the result, this is vulnerable to SQL injection. It doesn't matter if you are doing it in application code versus in a stored proc.
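The difference between interpolating a value and binding it can be seen in a small Python/sqlite3 sketch. SQLite has no stored procedures, so this only shows the application-code side; the `users` table and both helper functions are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, is_admin INTEGER)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [("alice", 1), ("bob", 0)])

def find_interpolated(conn, name):
    # VULNERABLE: the value becomes part of the SQL text itself.
    sql = "SELECT name FROM users WHERE name = '%s'" % name
    return [row[0] for row in conn.execute(sql)]

def find_bound(conn, name):
    # The value travels separately from the SQL text as a bound parameter.
    sql = "SELECT name FROM users WHERE name = ?"
    return [row[0] for row in conn.execute(sql, (name,))]

attack = "nobody' OR '1'='1"
leaked = find_interpolated(conn, attack)  # the OR clause matches every row
safe = find_bound(conn, attack)           # no user has that literal name
```

The same string concatenation would be just as exploitable if it were performed inside a stored procedure that builds and executes dynamic SQL.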

When outputting to the web presentation, escape HTML-special characters, just as you would with any text.

Bill Karwin
The vulnerability of stored procedures may depend somewhat on platform choice. ColdFusion as an example has a cfstoredproc tag which internally handles all the necessary escaping using the libraries provided for the target database, making them safe from injection on that platform.
Isaac Dealey
Right, you can use CFSQLTYPE to filter parameters according to a SQL type, but this still doesn't guarantee against SQL injection when you interpolate a string into a dynamic SQL query, e.g. "SELECT ... ORDER BY @columnName"
Bill Karwin
Oh, I thought most databases required the @columnName in that example to be atomic (a single column name) and would raise an error otherwise. I didn't realize there were databases that would allow injection attacks through a variable like that, except maybe via sp_executeSQL or "exec @columnName"
Isaac Dealey
SQL allows parameters to be used only in place of literal values. But if you're trying to make something like a table name or column name dynamic (as many people do), you're just concatenating strings, prior to sp_executeSQL.
Bill Karwin
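Since parameters cannot stand in for a column or table name, one common way to handle the dynamic ORDER BY case from the comment above is to check the identifier against a fixed list before concatenating it. The following Python/sqlite3 sketch assumes a hypothetical `snippets` table and a hypothetical set of sortable columns.

```python
import sqlite3

# Hypothetical set of columns the application is willing to sort by.
SORTABLE_COLUMNS = {"title", "created_at"}

def build_listing_query(order_by: str) -> str:
    """Return a SELECT whose ORDER BY column comes from a fixed whitelist."""
    if order_by not in SORTABLE_COLUMNS:
        raise ValueError("unsupported sort column: %r" % order_by)
    # Safe to concatenate: the value is one of our own known identifiers.
    return "SELECT title FROM snippets ORDER BY %s" % order_by

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE snippets (title TEXT, created_at TEXT)")
conn.executemany(
    "INSERT INTO snippets VALUES (?, ?)",
    [("b", "2009-01-02"), ("a", "2009-01-03")],
)
titles = [row[0] for row in conn.execute(build_listing_query("title"))]
```

Anything outside the whitelist, including an attempt to smuggle in a second statement, is rejected before any SQL is built.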
Ahh. I guess it hadn't occurred to me that using sp_executeSQL for that is a common practice when using stored procedures for data access -- maybe because I haven't used sp's much in recent years. But you're right, using them that way does make them vulnerable.
Isaac Dealey
+2  A: 

Part of the problem here is that you want to allow certain kinds of HTML, right? Links, for example. But you need to strip out exactly those pieces of HTML that can carry XSS attacks: script tags, event handler attributes, and any href or other attribute whose value starts with "javascript:". So a complete answer to your question needs to be more sophisticated than "replace special characters", because that won't allow links.
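For the "javascript:" part specifically, a link check could be sketched as below. The list of safe schemes and the name `is_safe_href` are assumptions for the example, and the function expects an attribute value that has already been entity-decoded by whatever HTML parser is in use.

```python
import re

# Hypothetical list of URL schemes that links are allowed to use.
SAFE_SCHEMES = {"http", "https", "mailto"}

def is_safe_href(href: str) -> bool:
    """Return True for relative URLs and for URLs with a whitelisted scheme."""
    # Drop whitespace and control characters before looking at the scheme,
    # since browsers ignore some of them (e.g. "java\nscript:").
    cleaned = "".join(ch for ch in href if ch > " " and ch != "\x7f")
    # Only text before the first "/", "?" or "#" can contain a scheme.
    head = re.split(r"[/?#]", cleaned, maxsplit=1)[0]
    if ":" not in head:
        return True  # no scheme at all: a relative URL
    scheme = head.split(":", 1)[0].lower()
    return scheme in SAFE_SCHEMES

checks = {
    "https://example.com/a": is_safe_href("https://example.com/a"),
    "javascript:alert(1)": is_safe_href("javascript:alert(1)"),
}
```

Checking against a list of allowed schemes, rather than looking for "javascript:" alone, also rejects other schemes such as `data:`.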

Preventing SQL injection may be somewhat dependent upon your platform choice. My preferred web platform has a built-in syntax for parameterizing queries (called cfqueryparam) that will mostly prevent SQL injection. If you're using PHP and MySQL, there is a native mysql_real_escape_string() function that serves a similar purpose. (Strictly speaking, the PHP function escapes the string rather than creating a parameterized query, but it has worked well for me in preventing SQL injection attempts thus far; I've seen a few that were safely stored in the db.)

On the XSS protection, I used to use regular expressions to sanitize input for this purpose, but have since moved away from that method because of the difficulty of allowing things like links while still removing the dangerous code. What I've moved to as an alternative is XSLT. Again, how you execute an XSL transformation will vary depending on your platform. I wrote an article for the ColdFusion Developer's Journal a while ago about how to do this, which includes a boilerplate XSL sheet you can use and shows how to make it work with CF using the native XmlTransform() function.

The reason I've chosen to move to XSLT for this is twofold.

First, validating that the input is well-formed XML eliminates the possibility of an XSS attack using certain string-concatenation tricks.

Second, it's then easier to manipulate the XHTML packet using XSL and XPath selectors than with regular expressions, because they're designed specifically to work with a structured XML document, whereas regular expressions were designed for raw string manipulation. So it's a lot cleaner and easier, I'm less likely to make mistakes, and if I do find that I've made a mistake, it's easier to fix.
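The two steps can be approximated without XSLT. Python's standard library has no XSLT processor, so the sketch below swaps in a plain tree walk with xml.etree.ElementTree: it first requires the fragment to parse as well-formed XML, then keeps only whitelisted elements and attributes. The whitelist, the `<div>` wrapper, and the function names are all assumptions for the example. It is not the XSL sheet from the article, and named HTML entities such as `&nbsp;` would be rejected by this parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical whitelist: element name -> attributes allowed on it.
ALLOWED = {"div": set(), "p": set(), "b": set(), "i": set(),
           "pre": set(), "code": set(), "a": {"href"}}

def sanitize_xhtml(fragment: str) -> str:
    """Parse fragment as XML (raising ET.ParseError if malformed), then filter it."""
    root = ET.fromstring("<div>%s</div>" % fragment)  # step 1: well-formedness
    _clean(root)                                      # step 2: whitelist walk
    return ET.tostring(root, encoding="unicode")

def _clean(parent):
    previous = None
    for child in list(parent):
        if child.tag not in ALLOWED:
            # Keep the text that followed the removed element.
            if child.tail:
                if previous is None:
                    parent.text = (parent.text or "") + child.tail
                else:
                    previous.tail = (previous.tail or "") + child.tail
            parent.remove(child)
            continue
        for name in list(child.attrib):
            if name not in ALLOWED[child.tag]:
                del child.attrib[name]
        href = child.attrib.get("href")
        if href is not None and not href.lower().startswith(("http://", "https://")):
            del child.attrib["href"]  # drops javascript: and other schemes
        _clean(child)
        previous = child

cleaned = sanitize_xhtml(
    '<p onclick="evil()">Hi <script>alert(1)</script>there '
    '<a href="javascript:x()">link</a></p>'
)
```

Working on a parsed tree means the filter never has to guess where a tag starts or ends, which is the main advantage over regular expressions described above.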

Also, when I tested them, I found that WYSIWYG editors like CKEditor (formerly FCKeditor; the author dropped the F) preserve well-formed XML, so you shouldn't have to worry about that as a potential issue.

Isaac Dealey