views:

1527

answers:

21

We are having another discussion here at work about using parametrized sql queries in our code. We have two sides in the discussion: Me and some others that say we should always use parameters to safeguard against sql injections and the other guys that don't think it is necessary. Instead they want to replace single apostrophes with two apostrophes in all strings to avoid sql injections. Our databases are all running Sql Server 2005 or 2008 and our code base is running on .NET framework 2.0.

Let me give you a simple example in C#:

I want us to use this:

string sql = "SELECT * FROM Users WHERE Name=@name";
SqlCommand getUser = new SqlCommand(sql, connection);
getUser.Parameters.AddWithValue("@name", userName);
//... blabla - do something here, this is safe

While the other guys want to do this:

string sql = "SELECT * FROM Users WHERE Name=" + SafeDBString(name);
SqlCommand getUser = new SqlCommand(sql, connection);
//... blabla - are we safe now?

Where the SafeDBString function is defined as follows:

string SafeDBString(string inputValue) 
{
    return "'" + inputValue.Replace("'", "''") + "'";
}

Now, as long as we use SafeDBString on all string values in our queries we should be safe. Right?

There are two reasons to use the SafeDBString function. First, it is the way it has been done since the stone ages, and second, it is easier to debug the sql statements since you see the excact query that is run on the database.

So then. My question is whether it really is enough to use the SafeDBString function to avoid sql injection attacks. I have been trying to find examples of code that breaks this safety measure, but I can't find any examples of it.

Is there anybody out there that can break this? How would you do it?

EDIT: To summarize the replies so far:

  • Nobody has found a way to get around the SafeDBString on Sql Server 2005 or 2008 yet. That is good, I think?
  • Several replies pointed out that you get a performance gain when using parametrized queries. The reason is that the query plans can be reused.
  • We also agree that using parametrized queries give more readable code that is easier to maintain
  • Further it is easier to always use parameters than to use various versions of SafeDBString, string to number conversions and string to date conversions.
  • Using parameters you get automatic type conversion, something that is especially useful when we are working with dates or decimal numbers.
  • And finally: Don't try to do security yourself as JulianR wrote. The database vendors spend lots of time and money on security. There is no way we can do better and no reason we should try to do their job.

So while nobody was able to break the simple security of the SafeDBString function I got lots of other good arguments. Thanks!

+1  A: 

I'd use stored procedures or functions for everything, so the question wouldn't arise.

Where I have to put SQL into code, I use parameters, which is the only thing that makes sense. Remind the dissenters that there are hackers smarter than they are, and with better incentive to break the code that's trying to outsmart them. Using parameters, it's simply not possible, and it's not like it's difficult.

John Saunders
Things are never impossible :)
erikkallen
Ok, how to do SQL injection using parameters?
John Saunders
@Saunders: Step 1 is to find a buffer overflow bug in the parameter-handling functionality of your DB.
Brian
Found one yet? In a commercial DB that's being pounded on by hundreds of thousands of hackers daily? One made by a software company known to have very deep pockets? You'd be able to quote the lawsuit by _name_ if this were possible.
John Saunders
Of course, if the SPROC uses concatenation and EXEC (instead of sp_ExecuteSQL) you're back in trouble... (I've seen it done wrong too many times to discount it...)
Marc Gravell
+7  A: 

This answers much better than I could

spender
A nice article, but I don't see how one could use it to get around the SafeDBString function...
Rune Grimstad
+4  A: 

I have used both approaches to avoid SQL injection attacks and definitely prefer parametrized queries. When I have used concatenated queries I have used a library function to escape the variables (like mysql_real_escape_string) and wouldn't be confident I have covered everything in a proprietary implementation (as it seems you are too).

Cannonade
+1 because mysql_real_escape_string() escapes \x00, \x1a, \n \r ' and ". It also handles character set issues. The OP's coworkers naive function doesn't do any of that!
Bill Karwin
+53  A: 

And then somebody goes and uses " instead of '. Parameters are, IMO, the only safe way to go.

It also avoids a lot of i18n issues with dates/numbers; what date is 01/02/03? How much is 123,456? Do your servers (app-server and db-server) agree with each-other?

If the risk factor isn't convincing to them, how about performance? The RDBMS can re-use the query plan if you use parameters, helping performance. It can't do this with just the string.

Marc Gravell
I have tried the formatting and performance arguments, but they still aren't convinced.
Rune Grimstad
+1 for dates etc. Parameters give multiple benefits.
Richard
Actually, sql server can re-use the query plan whether you use parameters or not. I agree with the other arguments, but for most cases the performance argument for parameterized sql doesn't fly anymore.
tnyfst
@tnyfst: it can reuse the execution plan when the query string changes for every combination of parameter values? I did not think that possible.
John Saunders
The query plan will be reused if the query text is IDENTICAL to an earlier query text. So if you send the EXACT SAME query twice, it will be reused. However, if you change even just a space or a comma or something, a new query plan will have to be determined.
marc_s
@tnyfst - the point is that the query plan to fetch record "12345" can be re-used to fetch record "67890" if it uses a paramerized @id
Marc Gravell
@Marc: I'm not sure you are entirely correct. SQL Servers caching hueristics are a little weird. The parser is capable of identifying constants in the text and can convert the SQL string to one the uses parameters artificially. It can then insert into the cache the text of this new parameterised query. Subsequent similar SQL may find its parameterised version matched in the cache. However, parameterised versions aren't always used with the originaly SQL versions being cached instead, I suspect SQL has a zillion performance related reasons to choose between the two approaches.
AnthonyWJones
Indeed it is complex; but we can make the job simpler, and reduce the number of candidates...
Marc Gravell
+15  A: 

First of all, your sample for the "Replace" version is wrong. You need to put apostrophes around the text:

string sql = "SELECT * FROM Users WHERE Name='" + SafeDBString(name) & "'";
SqlCommand getUser = new SqlCommand(sql, connection);

So that's one other thing parameters do for you: you don't need to worry about whether or not a value needs to be enclosed in quotes. Of course, you could build that into the function, but then you need to add a lot of complexity to the function: how to know the difference between 'NULL' as null and 'NULL' as just a string, or between a number and a string that just happens to contain a lot of digits. It's just another source for bugs.

Another thing is performance: parameterized query plans are often cached better than concatenated plans, thus perhaps saving the server a step when running the query.

Additionally, escaping single quotes isn't good enough. Many DB products allow alternate methods for escaping characters that an attacker could take advantage of. In MySQL, for example, you can also escape a single quote with a backslash. And so the following "name" value would blow up MySQL with just the SafeDBString() function, because when you double the single quote the first one is still escaped by the backslash, leaving the 2nd one "active":

x\' OR 1=1;--


Also, JulianR brings up a good point below: NEVER try to do security work yourself. It's so easy to get security programming wrong in subtle ways that appear to work, even with thorough testing. Then time passes and a year later your find out your system was cracked six months ago and you never even knew it until just then.

Always rely as much as possible on the security libraries provided for your platform. They will be written by people who do security code for a living, much better tested than what you can manage, and serviced by the vendor if a vulnerability is found.

Joel Coehoorn
The replace function adds the apostrophes
Rune Grimstad
Then it's just one more source of bugs. How does it know the difference between NULL as a null value and NULL as a text string? Or between a number input and a string that just happens to contain digits?
Joel Coehoorn
Good point. You should only use the function for strings, and possibly dates, so you have to be careful. That is one more reason to use parameters! Yay!
Rune Grimstad
+7  A: 

With parameterised queries you get more than protection against sql injection. You also get better execution plan caching potential. If you use the sql server query profiler you can still see the 'exact sql that is run on the database' so you're not really losing anything in terms of debugging your sql statements either.

Steve Willcock
+1, Good rebuttal to the pro-SafeDBString arguments.
sheepsimulator
MySQL also logs parameterized queries with param values interpolated into them.
Bill Karwin
+4  A: 

You aren't able to easily do any type checking of the user input without using parameters.

If you use the SQLCommand and SQLParameter classes to make you're DB calls, you can still see the SQL query that's being executed. Look at the SQLCommand's CommandText property.

I'm always a litle suspect of the roll-your-own approach to preventing SQL injection when parameterized queries are so easy to use. Second, just because "it's always been done that way" doesn't mean it's the right way to do it.

Tim
+1  A: 

From the very short time I've had to investigate SQL injection problems, I can see that making a value 'safe' also means that you're shutting the door to situations where you might actually want apostrophes in your data - what about someone's name, eg O'Reilly.

That leaves parameters and stored procedures.

And yes, you should always try to implement code in the best way you know now - not just how its always been done.

quamrana
The double apostrophes will be translated by sql server into a single apostrophe, so O'Reilly would be translated into Name = 'O''Reilly'
Rune Grimstad
So is there a corresponding function to remove apostrophes when the user wants to see their data?
quamrana
A: 

Discussions of this type are a sign that some people are not doing real work. See if you can't shove a few bugs their way.

Andomar
+3  A: 

This is only safe if you're guaranteed that you're going to pass in a string.

What if you're not passing in a string at some point? What if you pass just a number?

http://www.mywebsite.com/profile/?id=7;DROP DATABASE DB

Would ultimately become:

SELECT * FROM DB WHERE Id = 7;DROP DATABASE DB
joshcomley
It's either a string or a number. A string is escaped with SafeDbString. A number is an Int32, and it can't drop databases.
Andomar
Numbers are easier to handle. You just convert the parameter to an int/float/whatever before using it in the query. The problem is when you must accept string data.
Rune Grimstad
Andomar - if you're just constructing a SQL statement by hand then it's intended "type" doesn't matter, you can SQL inject with a number very, very easily.Rune - I think this is relying far too much on the individual developer to remember all the nuances of manually solving SQL injection. If you just say "use parameters" it's very simple and they can't go wrong.
joshcomley
@josh - I agree. Parameters are much easier. Unfortunately I have coworkers that don't agree...
Rune Grimstad
@Andomar: What about NULL? Or strings that look like numbers?
Joel Coehoorn
@Rune - I feel your pain
joshcomley
+100 : It looks like when these developers are expecting a number, they will shove it straight in.
ck
+1  A: 

Here are a couple of articles that you might find helpful in convincing your co-workers.

http://www.sommarskog.se/dynamic_sql.html

http://unixwiz.net/techtips/sql-injection.html

Personally I prefer to never allow any dynamic code to touch my database, requiring all contact to be through sps (and not one which use dynamic SQl). This means nothing excpt what I have given users permission to do can be done and that internal users (except the very few with production access for admin purposes) cannot directly access my tables and create havoc, steal data or commit fraud. If you run a financial application, this is the safest way to go.

HLGEM
+2  A: 

Agree hugely on the security issues.
Another reason to use parameters is for efficiency.

Databases will always compile your query and cache it, then re-use the cached query (which is obviously faster for subsequent requests). If you use parameters then even if you use different parameters the database will re-use your cached query as it matches based on the SQL string before binding the parameters.

If however you don't bind parameters then the SQL string changes on every request (that has different parameters) and it will never match what's in your cache.

Darren Greaves
+41  A: 

I think the correct answer is:

Don't try to do security yourself. Use whatever trusted, industry standard library there is available for what you're trying to do, rather than trying to do it yourself. Whatever assumptions you make about security, might be incorrect. As secure as your own approach may look (and it looks shaky at best), there's a risk you're overlooking something and do you really want to take that chance when it comes to security?

Use parameters.

JulianR
A: 

Here are a few reasons to use parameterized queries:

  1. Security - The database access layer knows how to remove or escape items that are not allowed in data.
  2. Separation of concerns - My code is not responsible for transforming the data into a format that the database likes.
  3. No redundancy - I don't need to include an assembly or class in every project that does this database formatting/escaping; it's built in to the class library.
R. Bemrose
+2  A: 

It somewhat amuses me to see people go to great lengths when they can just use parametrized queries. I mean, they solve so many problems at once, it should be a no-brainer! I've yet to hear a sensible reason why NOT to use them. I mean, you don't output html using echo, now, do you? :P

shylent
+14  A: 

The argument is a no-win. If you do manage to find a vulnerability, your co-workers will just change the SafeDBString function to account for it and then ask you to prove that it's unsafe all over again.

Given that parametrized queries are an undisputed programming best practice, the burden of proof should be on them to state why they aren't using a method that is both safer and better performing.

If the issue is rewriting all the legacy code, the easy compromise would be to use parametrized queries in all new code, and refactor old code to use them when working on that code.

My guess is the actual issue is pride and stubbornness, and there's not much more you can do about that.

Matthew Christensen
+1 for pointing out the real issue at hand.
Robert Gowland
Definitely. This minimizes the pushback from people who will just argue to avoid having to immediately retrofit all their code.
Chris Farmer
-1 for using a non-existant word: 'performant'
Kieveli
A: 

There were few vulnerability(I can't remember which database it was) that is related to buffer overflow of the SQL statement.

What I want to say is, SQL-Injection is more then just "escape the quote", and you have no idea what will come next.

Dennis Cheung
+1  A: 

I did not see any other answsers address this side of the 'why doing it yourself is bad', but consider a SQL Truncation attack.

There is also the QUOTENAME T-SQL function that can be helpful if you can't convince them to use params. It catches a lot (all?) of the escaped qoute concerns.

JasonRShaver
+1  A: 

It can be broken, however the means depends on exact versions/patches etc.

One that has already been brought up is the overflow/truncation bug that can be exploited.

Another future means would be finding bugs similar to other databases - for example the MySQL/PHP stack suffered an escaping problem because certain UTF8 sequences could be used to manipulate the replace function - the replace function would be tricked into introducing the injection characters.

At the end of the day, the replacement security mechanism relies on expected but not intended functionality. Since the functionality was not the intended purpose of the code, there is a high probablity that some discovered quirk will break your expected functionality.

If you have a lot of legacy code, the replace method could be used as a stopgap to avoid lengthy rewriting and testing. If you are writing new code, there is no excuse.

David
+8  A: 

So I'd say:

1) Why are you trying to re-implement something that's built in? it's there, readily available, easy to use and already debugged on a global scale. If future bugs are found in it, they'll be fixed and available to everyone very quickly without you having to do anything.

2) What processes are in place to guarantee that you never miss a call to SafeDBString? Missing it in just 1 place could open up a whole host of issues. How much are you going to eyeball these things, and consider how much wasted that effort is when the accepted correct answer is so easy to reach.

3) How certain are you that you've covered off every attack vector that Microsoft(the author of the DB and the access library) knows about in your SafeDBString implementation ...

4) How easy is it to read the structure of the sql? The example uses + concatenation, parameters are very like string.Format, which is more readable.

Also, there are 2 ways of working out what was actually run - roll your own LogCommand function, a simple function with no security concerns, or even look at an sql trace to work out what the database thinks is really going on.

Our LogCommand function is simply:

    string LogCommand(SqlCommand cmd)
    {
        StringBuilder sb = new StringBuilder();
        sb.AppendLine(cmd.CommandText);
        foreach (SqlParameter param in cmd.Parameters)
        {
            sb.Append(param.ToString());
            sb.Append(" = \"");
            sb.Append(param.Value.ToString());
            sb.AppendLine("\"");
        }
        return sb.ToString();
    }

Right or wrong, it gives us the information we need without security issues.

Jim T
+1, very complete, and easy to understand!
sheepsimulator
He's probably got to deal with a bunch of old VBSCRIPT programmers who are used to doing everything, including XML and SQL, through string concatenation. These will be people who are scared by the use of an API. There's nothing much that can be done with them, at least nothing humane.
John Saunders
+1 for item #2, with the exception that there's no way to enforce real parameters either.
Joel Coehoorn
Your LogCommand is invalid. The escaped double quote is wrong, as it should be a single quote for strings. And it just plain fails on param values that are numeric. I'm not even sure if it works for boolean values. Your alternative is using SQL trace which is quite messy. I like parameterized queries but given only LogCommand() and looking at sql trace, I'd rather go with SafeDBString(), and that's the truth!
James
I'm not sure what you mean by "invalid". This is creating a string for a log file, stdout, exceptions, etc, not creating a string to be run anywhere. It's only job is to let us see which sp was run, and which parameters were passed, not reverse that nice safe sqlcommand into raw sql.
Jim T
A: 

Another important consideration is keeping track of escaped and unescaped data. There are tons and tons of applications, Web and otherwise, that don't seem to properly keep track of when data is raw-Unicode, &-encoded, formatted HTML, et cetera. It's obvious that it will become difficult to keep track of which strings are ''–encoded and which aren't.

It's also a problem when you end up changing the type of some variable — perhaps it used to be an integer, but now it's a string. Now you have a problem.

Paul Fisher