I am using php, mysql with smarty and I places where users can put comments and etc. I've already escaped characters before inserting into database for SQL Injection. What else do I need to do?
The best way to prevent SQL injection is simply not to use dynamic SQL that accepts user input. Instead, pass the input in as parameters; that way it will be strongly typed and can't inject code.
In regards to SQL Injection, escaping is not enough - you should use data access libraries where possible and parameterized queries.
For XSS (cross site scripting), start with html encoding outputted data. Again, anti XSS libraries are your friend.
One current approach is to only allow a very limited number of tags in and sanitize those in the process (whitelist + cleanup).
You'll want to make sure people can't post JavaScript code or scary HTML in their comments. I suggest you disallow anything but very basic markup.
If comments are not supposed to contain any markup, doing a
echo htmlspecialchars($commentText);
should suffice, but it's very crude. Better would be to sanitize all input before even putting it in your database. The PHP strip_tags() function could get you started.
If you want to allow HTML comments, but be safe, you could give HTML Purifier a go.
XSS is mostly about the HTML-escaping(*). Any time you take a string of plain text and put it into an HTML page, whether that text is from the database, directly from user input, from a file, or from somewhere else entirely, you need to escape it.
The minimal HTML escape is to convert all the &
symbols to &
and all the <
symbols to <
. When you're putting something into an attribute value you would also need to escape the quote character being used to delimit the attribute, usually "
to "
. It does no harm to always escape both quotes ("
and the single quote apostrophe '
), and some people also escape >
to >
, though this is only necessary for one corner case in XHTML.
Any good web-oriented language should provide a function to do this for you. For example in PHP it's htmlspecialchars()
:
<p> Hello, <?php htmlspecialchars($name); ?>! </p>
and in Smarty templates it's the escape
modifier:
<p> Hello, {$name|escape:'html'}! </p>
really since HTML-escaping is what you want 95% of the time (it's relatively rare to want to allow raw HTML markup to be included), this should have been the default. Newer templating languages have learned that making HTML-escaping opt-in is a huge mistake that causes endless XSS holes, so HTML-escape by default.
You can make Smarty behave like this by changing the default modifiers to html
. (Don't use htmlall
as they suggest there unless you really know what you're doing, or it'll likely screw up all your non-ASCII characters.)
Whatever you do, don't fall into the common PHP mistake of HTML-escaping or “sanitising” for HTML on the input, before it gets processed or put in the database. This is the wrong place to be performing an output-stage encoding and will give you all sort of problems. If you want to validate your input to make sure it's what the particular application expects, then fine, but weeding out or escaping “special” characters at this stage is inappropriate.
*: Other aspects of XSS are present when (a) you actually want to allow users to post HTML, in which case you have to whittle it down to acceptable elements and attributes, which is a complicated process usually done by a library like HTML Purifier, and even then there have been holes. Alternative, simpler markup schemes may help. And (b) when you allow users to upload files, which is something very difficult to make secure.
You should not modify data that is entered by the user before putting it into the database. The modification should take place as you're outputting it to the website. You don't want to lose the original data.
As you're spitting it out to the website, you want to escape the special characters into HTML codes using something like htmlspecialchars("my output & stuff", ENT_QUOTES, 'UTF-8')
-- make sure to specify the charset you are using. This string will be translated into my output & stuff
for the browser to read.