I realize that parameterized SQL queries are the optimal way to handle user input when building queries, but I'm wondering what is wrong with taking user input, escaping any single quotes, and surrounding the whole string with single quotes. Here's the code:

sSanitizedInput = "'" & Replace(sInput, "'", "''") & "'"

Any single quote the user enters is replaced with two single quotes, which eliminates the user's ability to end the string, so anything else they may type, such as semicolons, percent signs, etc., will be part of the string and not actually executed as part of the command. We are using Microsoft SQL Server 2000, for which I believe the single quote is the only string delimiter and the only way to escape the string delimiter, so there is no way to execute anything the user types in.
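To make the claim concrete, here is a minimal Python sketch of the same one-liner. The function name `quote_literal`, the `users` table, and the sample payload are hypothetical and only serve the illustration; the original code is VB.

```python
def quote_literal(user_input: str) -> str:
    """Python rendering of the VB one-liner: double every single quote,
    then wrap the whole value in single quotes."""
    return "'" + user_input.replace("'", "''") + "'"


# A classic payload that tries to close the string and append a command.
payload = "x'; DROP TABLE users; --"
query = "SELECT * FROM users WHERE name = " + quote_literal(payload)

# The payload's quote is doubled, so the whole payload stays inside one
# string literal and the DROP is never parsed as a command.
print(query)
```

The printed query shows the doubled quote inside the literal, which is the behavior the question relies on.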

I don't see any way to launch an SQL injection attack against this, but I realize that if this were as bulletproof as it seems to me someone else would have thought of it already and it would be common practice. My question is this: what's wrong with this code? Does anybody know a way to get an SQL injection attack past this sanitization technique? Sample user input that exploits this technique would be very helpful.

Thanks in advance.

UPDATE:

Thanks to everyone for their answers; pretty much all the information I came across in my research showed up on this page somewhere, which is a sign of the intelligence and skill of the people who have taken time out of their busy days to help me out with this question.

The reason I have not yet accepted any of the answers is that I still don't know of any way to effectively launch a SQL injection attack against this code. A few people suggested that a backslash would escape one single quote and leave the other to end the string, so that the rest of the string would be executed as part of the SQL command. I realize that this method would work to inject SQL into a MySQL database, but in MS SQL 2000 the only way (that I've been able to find) to escape a single quote is with another single quote; backslashes won't do it. And unless there is a way to stop the escaping of the single quote, none of the rest of the user input will be executed, because it will all be taken as one contiguous string.

I understand that there are better ways to sanitize input but I'm really more interested in learning why the method I provided above won't work. If anyone knows of any specific way to mount a SQL injection attack against this sanitization method I would love to see it.

+1  A: 

It might work, but it seems a little hokey to me. I'd recommend verifying that each string is valid by testing it against a regular expression instead.

unforgiven3
A: 

While you might find a solution that works for strings, for numerical predicates you need to also make sure they're only passing in numbers (simple check is can it be parsed as int/double/decimal?).

It's a lot of extra work.

Joseph Daigle
+4  A: 

It's a bad idea anyway as you seem to know.

What about something like escaping the quote in string like this: \'

Your replace would result in: \''

If the backslash escapes the first quote, then the second quote has ended the string.

WW
Thanks for the response! I know that attack would work for a MySQL database, but I'm pretty sure that MS SQL Server won't accept a backslash as an escape character (I tried it). Several Google searches did not reveal any other escape characters, which really made me wonder why this wouldn't work.
Patrick
WW, you are probably thinking of source code, not run-time.
DK
+1  A: 

What ugly code all that sanitisation of user input would be! Then the clunky StringBuilder for the SQL statement. The prepared statement method results in much cleaner code, and the SQL Injection benefits are a really nice addition.

Also why reinvent the wheel?

JeeBee
+1  A: 

If that searches for a ' then turns it into a '', what's to stop the user from just entering \' ?

Cetra
Because he said it's MS SQL, not MySQL
AviD
A: 

Rather than changing a single quote to (what looks like) two single quotes, why not just change it to an apostrophe, a quote, or remove it entirely?

Either way, it's a bit of a kludge... especially when you legitimately have things (like names) which may use single quotes...

NOTE: Your method also assumes everyone working on your app always remembers to sanitize input before it hits the database, which probably isn't realistic most of the time.

Kevin Fairchild
+3  A: 

If you have parameterised queries available you should be using them at all times. All it takes is for one query to slip through the net and your DB is at risk.

Kev
+5  A: 

Input sanitation is not something you want to half-ass. Use your whole ass. Use regular expressions on text fields. TryCast your numerics to the proper numeric type, and report a validation error if it doesn't work. It is very easy to search for attack patterns in your input, such as ' --. Assume all input from the user is hostile.

tom.dietrich
+8  A: 

In a nutshell: Never do query escaping yourself. You're bound to get something wrong. Instead, use parameterized queries, or if you can't do that for some reason, use an existing library that does this for you. There's no reason to be doing it yourself.
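For comparison, a minimal sketch of a parameterized query. It uses Python's built-in `sqlite3` module purely as a stand-in for SQL Server (an assumption for the sake of a runnable example); with ADO or ADO.NET the equivalent would be Command and Parameter objects. The `users` table and the sample values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
# Values are bound through placeholders, so no escaping is done by hand.
conn.execute("INSERT INTO users VALUES (?, ?)", ("O'Brien", "secret"))

hostile = "x' OR '1'='1"

# The driver passes the value separately from the SQL text, so the quote
# characters in it are data, never syntax.
rows = conn.execute("SELECT name FROM users WHERE name = ?", (hostile,)).fetchall()
legit = conn.execute("SELECT name FROM users WHERE name = ?", ("O'Brien",)).fetchall()

print(rows)   # no user is literally named "x' OR '1'='1"
print(legit)
```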

Nick Johnson
Seconded. Strongly.
Pittsburgh DBA
+3  A: 

I've used this technique when dealing with 'advanced search' functionality, where building a query from scratch was the only viable answer. (Example: allow the user to search for products based on an unlimited set of constraints on product attributes, displaying columns and their permitted values as GUI controls to reduce the learning threshold for users.)

In itself it is safe AFAIK. As another answerer pointed out, however, you may also need to deal with backslash escaping (albeit not when passing the query to SQL Server using ADO or ADO.NET, at least -- can't vouch for all databases or technologies).

The snag is that you really have to be certain which strings contain user input (always potentially malicious), and which strings are valid SQL queries. One of the traps is if you use values from the database -- were those values originally user-supplied? If so, they must also be escaped. My answer is to try to sanitize as late as possible (but no later!), when constructing the SQL query.
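The trap with values read back from the database can be sketched as follows. This is an illustration under assumptions: `sqlite3` stands in for SQL Server, and the table, column names, and stored value are hypothetical.

```python
import sqlite3


def quote_literal(user_input: str) -> str:
    # Same quote-doubling technique as in the question.
    return "'" + user_input.replace("'", "''") + "'"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, is_admin INTEGER)")
conn.execute("INSERT INTO users VALUES ('alice', 1)")

# Step 1: the hostile value is escaped correctly on the way in.
hostile_name = "bob' OR '1'='1"
conn.execute("INSERT INTO users VALUES (" + quote_literal(hostile_name) + ", 0)")

# Step 2: reading it back returns the original text, single quotes included.
(name_from_db,) = conn.execute(
    "SELECT name FROM users WHERE is_admin = 0"
).fetchone()

# Step 3: a later query trusts the stored value and concatenates it unescaped.
unsafe_sql = "SELECT name FROM users WHERE name = '" + name_from_db + "'"
leaked = conn.execute(unsafe_sql).fetchall()

# Escaping again at query-construction time keeps the value a literal.
safe_sql = "SELECT name FROM users WHERE name = " + quote_literal(name_from_db)
matched = conn.execute(safe_sql).fetchall()

print(len(leaked), len(matched))
```

The unescaped query matches every row because the stored value closes the literal and adds its own `OR` condition; the re-escaped query matches only the one row whose name is that exact string.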

However, in most cases, parameter binding is the way to go -- it's just simpler.

Pontus Gagge
You can still use parameter substitution even if you're building your own queries.
Nick Johnson
You should build the SQL statement string from scratch, but still use parameter substitution.
JeeBee
No, NEVER build your SQL statements from scratch.
AviD
+1 as the only relevant answer so far
DK
+19  A: 

First of all, it's just bad practice. Input validation is always necessary, but it's also always iffy.
Worse yet, blacklist validation is always problematic; it's much better to explicitly and strictly define what values/formats you accept. Admittedly, this is not always possible - but to some extent it must always be done.
Some research papers on the subject:

Point is, any blacklist you do (and too-permissive whitelists) can be bypassed. The last link to my paper shows situations where even quote escaping can be bypassed.

Even if these situations do not apply to you, it's still a bad idea. Moreover, unless your app is trivially small, you're going to have to deal with maintenance, and maybe a certain amount of governance: how do you ensure that it's done right, everywhere, all the time?

The proper way to do it:

  • Whitelist validation: type, length, format or accepted values
  • If you want to blacklist, go right ahead. Quote escaping is good, but within context of the other mitigations.
  • Use Command and Parameter objects, to preparse and validate
  • Call parameterized queries only.
  • Better yet, use Stored Procedures exclusively.
  • Avoid using dynamic SQL, and don't use string concatenation to build queries.
  • If using SPs, you can also limit permissions in the database to executing the needed SPs only, and not access tables directly.
  • you can also easily verify that the entire codebase only accesses the DB through SPs...
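The first item in the list, whitelist validation by type, length, and format, might look like the sketch below. The specific rule (3 to 20 ASCII letters, digits, or underscores) and the function name are hypothetical; a real application would define its own accepted formats.

```python
import re

# Hypothetical rule: a username is 3-20 ASCII letters, digits or underscores.
USERNAME_RE = re.compile(r"[A-Za-z0-9_]{3,20}")


def validate_username(value: str) -> str:
    """Accept only values that match the whitelist; reject everything else."""
    if not USERNAME_RE.fullmatch(value):
        raise ValueError("invalid username")
    return value


print(validate_username("patrick_01"))

try:
    validate_username("x' OR '1'='1")
except ValueError as err:
    print("rejected:", err)
```

Validation of this kind complements parameterized queries rather than replacing them, in line with the list above.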
AviD
+1 For evil unicode hack in your paper.
Brian
@Avid: When used correctly, dynamic SQL and string concatenation can be used safely with parameterized queries (i.e. with `sp_executesql` instead of `EXEC`). That is, you can dynamically generate your SQL statement so long as none of the concatenated text comes from the user. This also has performance benefits; `sp_executesql` supports caching.
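A rough sketch of that idea: the statement text is generated dynamically, but every user-supplied value travels as a parameter. Python's `sqlite3` stands in for `sp_executesql` here (an assumption for runnability), and the `products` table, the `ALLOWED_COLUMNS` set, and `build_search` are hypothetical names.

```python
import sqlite3

# Column names are taken from a fixed list, never from user text.
ALLOWED_COLUMNS = {"color", "size"}


def build_search(filters: dict) -> tuple:
    """Build a WHERE clause from trusted column names and '?' placeholders."""
    clauses, params = [], []
    for column, value in filters.items():
        if column not in ALLOWED_COLUMNS:
            raise ValueError("unknown column: " + column)
        clauses.append(column + " = ?")
        params.append(value)
    sql = "SELECT name FROM products"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql, params


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (name TEXT, color TEXT, size TEXT)")
conn.execute("INSERT INTO products VALUES ('shirt', 'red', 'M')")
conn.execute("INSERT INTO products VALUES ('hat', 'blue', 'L')")

sql, params = build_search({"color": "red' OR '1'='1", "size": "M"})
print(sql)
print(conn.execute(sql, params).fetchall())
```

Only the shape of the query varies with the request; the hostile value is compared as plain data and matches nothing.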
Brian
@Brian, well duh :). But in reality, how often do you see programmers do that? Moreover, the typical scenario where dynamic SQL is "needed", *requires* the user input as part of the query (supposedly). If you could do sp_executesql, you wouldn't (usually) need the dynamic sql in the first place.
AviD
+3  A: 

Simple answer: It will work sometimes, but not all the time. You want to use white-list validation on everything you do, but I realize that's not always possible, so you're forced to go with the best guess blacklist. Likewise, you want to use parametrized stored procs in everything, but once again, that's not always possible, so you're forced to use sp_execute with parameters.

There are ways around any usable blacklist you can come up with (and some whitelists too).

A decent writeup is here: http://www.owasp.org/index.php/Top_10_2007-A2

If you need to do this as a quick fix to give you time to get a real one in place, do it. But don't think you're safe.

RaySir
+1  A: 

There are two ways to do it, no exceptions, to be safe from SQL injections: prepared statements or parameterized stored procedures.

olle
A: 

I believe there are no examples of how the "doubling the apostrophes" sanitization solution can be hacked in T-SQL.

However, it is less universal (works only for strings) and more prone to coding errors in comparison to parametrized queries.

... which I believe the single-quote is the only string delimiter ...

One correction: if QUOTED_IDENTIFIER is set to OFF you can use regular double quotes or apostrophes to indicate string constants, but that doesn't change much.

DK
There actually ARE examples to bypass this, see my answer.
AviD
AviD, your first 2 examples (from Dec 18 at 2:03) are coding errors, but the 3rd idea with unicode looks interesting. Could you give an actual example of that?
DK
DK, it's a truism that most security flaws are coding (or design) errors... The Unicode issue is also based on a coding bug, albeit a more subtle one. Best to read http://www.comsecglobal.com/FrameWork/Upload/SQL_Smuggling.pdf ...
AviD
But in a nutshell, SQL Server performs automatic homoglyph translation, when forcing unsupported characters into a different character set. For example, an SP that receives (nvarchar) but concatenates that to a (varchar) variable.
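The effect can be simulated in a few lines. This is a toy model, not SQL Server's actual conversion: the `BEST_FIT` table is a hypothetical stand-in for the character-set conversion, and the mapping of U+02BC (modifier letter apostrophe) to an ASCII quote is assumed here for illustration; real behavior depends on the collation and code page involved.

```python
def quote_literal(user_input: str) -> str:
    # Same quote-doubling technique as in the question.
    return "'" + user_input.replace("'", "''") + "'"


# Hypothetical best-fit table: only one look-alike character is modeled.
BEST_FIT = {"\u02bc": "'"}


def to_varchar(text: str) -> str:
    """Toy model of forcing Unicode text into a narrower character set."""
    return "".join(BEST_FIT.get(ch, ch) for ch in text)


payload = "x\u02bc OR 1=1 --"

# The escaper sees no ASCII quote in the payload, so nothing is doubled.
escaped = quote_literal(payload)

# After the conversion the look-alike becomes a real quote and ends the literal.
converted = to_varchar(escaped)
print(converted)
```

The order of operations is what matters: the escaping runs before the conversion, so the quote that appears afterwards was never doubled.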
AviD
Please correct your answer. You first sentence is misleading. There are always working cracks.
erenon
AviD: I don't see anything in your answer that shows examples of bypassing it. There was an example in your PDF that shows dynamic SQL without doing a `replace` on quotes, but that's not the question.
Gabe
+2  A: 

Yeah, that should work right up until someone runs SET QUOTED_IDENTIFIER OFF and uses a double quote on you.

Edit: It isn't as simple as not allowing the malicious user to turn off quoted identifiers:

The SQL Server Native Client ODBC driver and SQL Server Native Client OLE DB Provider for SQL Server automatically set QUOTED_IDENTIFIER to ON when connecting. This can be configured in ODBC data sources, in ODBC connection attributes, or OLE DB connection properties. The default for SET QUOTED_IDENTIFIER is OFF for connections from DB-Library applications.

When a stored procedure is created, the SET QUOTED_IDENTIFIER and SET ANSI_NULLS settings are captured and used for subsequent invocations of that stored procedure.

SET QUOTED_IDENTIFIER also corresponds to the QUOTED_IDENTIFIER setting of ALTER DATABASE.

SET QUOTED_IDENTIFIER is set at parse time. Setting at parse time means that if the SET statement is present in the batch or stored procedure, it takes effect, regardless of whether code execution actually reaches that point; and the SET statement takes effect before any statements are executed.

There are a lot of ways QUOTED_IDENTIFIER could be off without you necessarily knowing it. Admittedly, this isn't the smoking-gun exploit you're looking for, but it's a pretty big attack surface. Of course, if you also escaped double quotes, then we're back where we started. ;)

Mark Brackett
That could work, but again, how could they get that code to execute when all user input is surrounded by single quotes? A specific line(s) of code that would be able to inject SQL into the above code would be very helpful. Thanks!
Patrick
+1  A: 

Your defence would fail if:

  • the query is expecting a number rather than a string
  • there were any other way to represent a single quotation mark, including:
    • an escape sequence such as \039
    • a unicode character

(in the latter case, it would have to be something which were expanded only after you've done your replace)

AJ
+1  A: 

Patrick, are you adding single quotes around ALL input, even numeric input? If you have numeric input, but are not putting the single quotes around it, then you have an exposure.
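A small sketch of that exposure, with hypothetical names (`escape_only`, `parse_int`, a `users` table): when the value is concatenated into a numeric position without surrounding quotes, the payload needs no quote at all, so doubling quotes changes nothing.

```python
def escape_only(user_input: str) -> str:
    # Quote doubling without the surrounding quotes, as used for a numeric slot.
    return user_input.replace("'", "''")


user_id = "0 OR 1=1"
unsafe = "SELECT * FROM users WHERE id = " + escape_only(user_id)
print(unsafe)  # the OR clause is now part of the command


def parse_int(user_input: str) -> int:
    """Accept the value only if it parses as an integer; raises ValueError otherwise."""
    return int(user_input)


try:
    parse_int(user_id)
except ValueError:
    print("rejected: not a number")

safe = "SELECT * FROM users WHERE id = " + str(parse_int("42"))
print(safe)
```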

Rob Kraft
+9  A: 

Okay, this response will relate to the update of the question:

"If anyone knows of any specific way to mount a SQL injection attack against this sanitization method I would love to see it."

Now, besides the MySQL backslash escaping, and taking into account that we're actually talking about MSSQL, there are actually 3 possible ways of still SQL injecting your code:

sSanitizedInput = "'" & Replace(sInput, "'", "''") & "'"

Take into account that these will not all be valid at all times, and are very dependent on your actual code around it:

  1. Second-order SQL Injection - if an SQL query is rebuilt based upon data retrieved from the database after escaping, the data is concatenated unescaped and may be indirectly SQL-injected. See
  2. String truncation - (a bit more complicated) - The scenario is that you have two fields, say a username and a password, and the SQL concatenates both of them. Both fields (or just the first) have a hard limit on length. For instance, the username is limited to 20 characters. Say you have this code:
username = left(Replace(sInput, "'", "''"), 20)

Then what you get - is the username, escaped, and then trimmed to 20 characters. The problem here - I'll stick my quote in the 20th character (e.g. after 19 a's), and your escaping quote will be trimmed (in the 21st character). Then the SQL

sSQL = "select * from USERS where username = '" + username + "'  and password = '" + password + "'"

combined with the aforementioned malformed username will result in the password already being outside the quotes, and will just contain the payload directly.
  3. Unicode Smuggling - In certain situations, it is possible to pass a high-level unicode character that looks like a quote, but isn't - until it gets to the database, where suddenly it is. Since it isn't a quote when you validate it, it will go through easy... See my previous response for more details, and link to original research.
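The string truncation scenario in item 2 can be reproduced with a short Python sketch. The function names are hypothetical, the 20-character limit and the query text follow the example in item 2, and `sqlite3` is used only as a runnable stand-in for SQL Server.

```python
import sqlite3


def escape(value: str) -> str:
    return value.replace("'", "''")


def build_login_query(username_input: str, password_input: str) -> str:
    # Mirrors left(Replace(sInput, "'", "''"), 20): escape first, then trim.
    username = escape(username_input)[:20]
    password = escape(password_input)
    return ("select * from USERS where username = '" + username
            + "'  and password = '" + password + "'")


# 19 a's plus a quote: escaping yields 21 characters, and the trim to 20
# cuts off the second (escaping) quote.
username_input = "a" * 19 + "'"
password_input = " or 1=1 --"

sql = build_login_query(username_input, password_input)
print(sql)

conn = sqlite3.connect(":memory:")
conn.execute("create table USERS (username text, password text)")
conn.execute("insert into USERS values ('alice', 'pw1'), ('bob', 'pw2')")

# The lone trailing quote pairs with the quote that was meant to close the
# username, so the literal swallows "  and password = " and the password
# text is parsed as SQL.
rows = conn.execute(sql).fetchall()
print(len(rows))
```

Swapping the order (trim the raw input first, then escape) or rejecting over-length input outright avoids this particular failure, though it does nothing for the other two items.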

AviD
+3  A: 

I realize this is a long time after the question was asked, but ..

One way to launch an attack on the 'quote the argument' procedure is with string truncation. According to MSDN, in SQL Server 2000 SP4 (and SQL Server 2005 SP1), a string that is too long will be quietly truncated.

When you quote a string, the string increases in size. Every apostrophe is repeated. This can then be used to push parts of the SQL outside the buffer. So you could effectively trim away parts of a where clause.

This would probably be mostly useful in a 'user admin' page scenario where you could abuse the 'update' statement to not do all the checks it was supposed to do.

So if you decide to quote all the arguments, make sure you know what goes on with the string sizes and see to it that you don't run into truncation.

I would recommend going with parameters. Always. Just wish I could enforce that in the database. And as a side effect, you are more likely to get better cache hits because more of the statements look the same. (This was certainly true on Oracle 8)

Jørn Jensen
After posting, I decided that AviD's post covers this, and in more detail. Hopefully my post will still be of help to someone.
Jørn Jensen