There are two fundamental things that every programmer must do when sending sql code to a database: close the connections and parameterize the queries. For some reason, most tutorials available on the internet just gloss over them or even get it just plain wrong, perhaps because it's so second nature to anyone advanced enough to write the tutorial. My goal here is to show you how to do both in a way that makes it easier to get this right every time.
The first thing to do is realize that hiding the code away in one method is not enough. We actually want to build a separate class for this. By creating a separate class, we can make our actual connection method private inside that class, so that only other methods in the class can connect to the database. This way, we force all database code in the program to run through your gatekeeper method. Get the gatekeeper method right with regards to the two issues I talked about above, and your whole program will consistently get it right, too. So here's our start:
public class DataLayer
{
private DbConnection GetConnection()
{
//This could also be a connection for OleDb, ODBC, Oracle, MySQL,
// or whatever kind of database you have.
//We could also use this place (or the constructor) to load the
// connection string from an external source, like a
// (possibly-encrypted) config file
return new SqlConnection("connection string here");
}
}
Now, whenever you need to write a database query you add a method to that class. If it is a large project, you might even break the class up into several smaller classes in a dll file. But the main thing is that the class the provides this raw database connection should be private to other code that talks to the database.
But this is just to get us started. The next thing to do is adjust it to automate closing the connection. To do this, we can add a "Query" method to the class:
private DataTable Query(string sql)
{
var result = new DataTable();
using (var connection = GetConnection())
using (var command = new SqlCommand(sql, connection)
{
connection.Open();
result.Load(command.ExecuteReader(CommandBehavior.CloseConnection));
}
return result;
}
There's already a lot in here and some of it might be unfamiliar, so let's stop for a moment and look at those items. I'll start with the var
keyword, which is simple enough. var
is just a shorthand way to declare a variable. It means that the type name for the variable declaration should be inferred by the compiler. The variable is still strongly-typed, because the type of the variable is known and fixed when the program is compiled, but you the programmer don't have to go to the trouble of typing out a long name like "SqlConnection" when a simple "var" will do. This also makes it easy to write code that might be re-used on one project that uses a Sql Server database and on another project that uses an Access database, as the "var" keyword works with both declarations.
The other keyword I want to highlight is using
. This keyword is a powerful way to declare a variable in .Net and C#. The keyword creates a scope block underneath the variable declaration. At the end of the scope block, your variable is disposed. Note that there are three important parts two this. The first is that this really only applies to unmanaged resources like database connections. Memory is still collected in the usual way. The second is that the variable is disposed even if an exception is thrown. This makes the keyword suitable to use with time-sensitive or tightly-constrained resources like database connections, without a separate try/catch block nearby. The final piece is that the keywords make use of the IDisposable pattern in .Net. You don't need to know all about IDisposable right now. Just know that database connections implement (like inheritance) IDisposable, and so will work with a using block.
You don't have to use the using
keyword in your code. But if you don't, the correct way to handle a connection looks like this:
SqlConnection connection;
try
{
connection = new SqlConnection("connection string here");
SqlCommand command = new SqlCommand("sql query here", connetion);
connection.Open();
SqlDataReader reader = command.ExecuteReader();
//do something with the data reader here
}
finally
{
connection.Close();
}
And that's even the simple version. You actually still need an additional check in the finally block to make sure your connection variable is valid. The using
keyword is a much more concise way to express this, and it makes sure you get the pattern right each time. The main thing to take away here is that if you just call connection.Close()
, with no protection to make sure the program actually reaches that line, you've failed.
There's one more important thing with this method. Notice that it's still private. This is because we still don't want code outside of the class to call this method. A good programmer will not put any sql code in the same class as his windows form or asp.net page. This is a simple case of separation of concerns. Keep presentation in one place and data access in another. So, as I mentioned early, each of your queries will go in it's own method. Let's build an example now. Here's a method for a simple query to get all your employee data:
public DataTable GetEmployeeData()
{
return Query("SELECT * FROM Employees");
}
Wow, that was easy. We're really getting somewhere. Unfortunately, we're still missing one piece of the puzzle. You see, it's pretty rare to want to return an entire table. Typically, you'll want to filter that table in some way, and maybe join it with another table. So let's alter this query to return all the data for a fictional employee named "Fred":
public DataTable GetFredsEmployeeData()
{
return Query("SELECT * FROM Employees WHERE Firstname='Fred'");
}
Still pretty easy, but that misses the spirit of what we're trying to accomplish. You don't want to build another method for every possible employee name. You want something more like this:
public DataTable GetEmployeeData(string FirstName)
{
return Query("SELECT * FROM Employees WHERE FirstName='" + FirstName + "'");
}
Uh oh. Now we have a problem. There's that pesky string concatenation, just waiting for someone to come along and enter the text ';Drop table employees;--
(or worse) into your first name textbox. The correct way to handle this is using query parameters. But this is where it gets tricky, because several paragraphs back we built a query method that only accepts a finished sql string.
A lot of people want to write a method just like that Query method. I think just about every database programmer is tempted by that pattern at a certain point in there career, and unfortunately it's just plain wrong until you add a way to accept parameters. Fortunately, there are number of different way to correct this. For example, you could pass an array of SqlParameters, or even just an array of objects. Or a set of key/value pairs to use to create the parameters.
I've spent a lot of time working through the different options, and I've narrowed down what I think is the simplest, most effective, and (more importantly) more accurate and maintainable option for C#. Unfortunately, it does require that you understand the syntax for one more advanced language feature in C#: anonymous methods. What this feature allows you to do is define a function within another function, hold on to it with a variable, pass it to other functions, and call it at your leisure. Here's how we'll modify the Query() function to take advantage of this:
private DataTable Query(string sql, Action<SqlCommand> addParameters)
{
var result = new DataTable();
using (var connection = GetConnection())
using (var command = new SqlCommand(sql, connection)
{
addParameters(command); //<--- the addParameters parameter is a function we can call
connection.Open();
result.Load(command.ExecuteReader(CommandBehavior.CloseConnection));
}
return result;
}
Note the new Action<SqlCommand>
parameter. Don't mind the < >
part. If you're not familiar with generics, you can just pretend it's part of the class name for now. Just now that the special Action type here allows you to pass one function (in this case, one that takes an SqlCommand as an argument) to another function. And here's how we'd use this in the GetEmployeeData() function:
public DataTable GetEmployeeData(string firstName)
{
return Query("SELECT * FROM Employees WHERE FirstName= @Firstname",
(cmd) =>
{
cmd.Parameters.AddWithValue("@FirstName", firstName);
});
}
Now the Query() function has two parameters instead of one, and the expression for that second parameter looks a little goofy. This uses something called a lambda expression to create our special Action<SqlCommand>
argument. You don't need to know much about lambda expressions yet. Just know that the =>
token is an operator defined in C#, just like +, -, *, /, &&, or ||. It tells the compiler that you're building a new method right there on the spot. You can even have curly-braces ( { and } ) to denote where it starts and stops.
The important part of all this is that the the Query() function now has a way to connect the firstName
argument passed to it's parent GetEmployeeData() function to the @FirstName expression in the sql string. This is done using features built into ADO.Net and your sql database engine. And it happens in a way that prevents any possibility for sql injection attacks.
I'll finish (finally!) with two short items. The first is the syntax for calling your new query method with no parameters:
public DataTable GetAllEmployees()
{
return Query("SELECT * FROM Employees", (cmd) => {});
}
While we could also provide this as an overload of the original Query() function, in my own code I prefer not to do that, as I want to communicate to other developers that should be looking to parameterize their code, and not sneak around with string concatenation.
Secondly, the code outlined in this answer is still unfinished. There are some important weaknesses yet to address. For example, using a datatable rather than a datareader forces you to load the entire result set from every query into memory all at once. There are some things we can do to avoid that. We also haven't discussed inserts, updates, deletes, or alters.
In conclusion, I'm not saying this is the only way to do it. But do remember the two things you must do: close your connections, and parameterize your queries.