views:

1592

answers:

7

I'm working on an application for work that is going to query our employee database. The end users want the ability to search based on the standard name/department criteria, but they also want the flexibility to query for all people with the first name of "James" that works in the Health Department. The one thing I want to avoid is to simply have the stored procedure take a list of parameters and generate a SQL statement to execute, since that would open doors to SQL injection at an internal level.

Can this be done?

A: 

My first thought was to write a query something like this...

SELECT EmpId, NameLast, NameMiddle, NameFirst, DepartmentName
  FROM dbo.Employee
       INNER JOIN dbo.Department ON dbo.Employee.DeptId = dbo.Department.Id
 WHERE IdCrq IS NOT NULL
       AND
       (
          @bitSearchFirstName = 0
          OR
          Employee.NameFirst = @vchFirstName
       )
       AND
       (
          @bitSearchMiddleName = 0
          OR
          Employee.NameMiddle = @vchMiddleName
       )
       AND
       (
          @bitSearchFirstName = 0
          OR
          Employee.NameLast = @vchLastName
       )
       AND
       (
          @bitSearchDepartment = 0
          OR
          Department.Id = @intDeptID
       )

...which would then have the caller provide a bit flag if they want to search a particular field and then supply the value if they are to search for it, but I don't know if this is creating a sloppy WHERE clause or if I can get away with a CASE statement in the WHERE clause.

As you can see this particular code is in T-SQL, but I'll gladly look at some PL-SQL / MySQL code as well and adapt accordingly.

Dillie-O
+5  A: 

The most efficient way to implement this type of search is with a stored procedure. The statement shown here creates a procedure that accepts the required parameters. When a parameter value is not supplied it is set to NULL.

CREATE PROCEDURE ps_Customers_SELECT_NameCityCountry
@Cus_Name varchar(30) = NULL,
@Cus_City varchar(30) = NULL,
@Cus_Country varchar(30) =NULL
AS
SELECT Cus_Name,
       Cus_City,
       Cus_Country
FROM Customers
WHERE Cus_Name = COALESCE(@Cus_Name,Cus_Name) AND
      Cus_City = COALESCE(@Cus_City,Cus_City) AND
      Cus_Country = COALESCE(@Cus_Country,Cus_Country)

Taken from this page: http://www.sqlteam.com/article/implementing-a-dynamic-where-clause

I've done it before. It works well.

BoltBait
I don't think these work well at all. You get a huge percentage of table scans, because those predicates are not SARGable.
Pittsburgh DBA
When you say it the value is set to NULL, do you mean searches for NULL in the column name or it gets ignored. The only concern I see with that is if I'm searching for folks with the last name of Schmoe, that having a first name of "Joe" is going to get excluded since the value is not NULL.
Dillie-O
Dillie-O, look up the COALESCE command to see why this works (or follow the link provided in my post). As for the performance of this type of thing... the system I implemented this on had between 1 and 2 million rows and it worked fine. It didn't seem slow at all. YMMV.
BoltBait
+1  A: 

It can be done, but usually these kitchen-sink procedures result in some poor query plans.

Having said all that, here is the tactic most commonly used for "optional" parameters. The normal approach is to treat NULL as "ommitted".

SELECT
  E.EmployeeID,
  E.LastName,
  E.FirstName
WHERE
  E.FirstName = COALESCE(@FirstName, E.FirstName) AND
  E.LastName = COALESCE(@LastName, E.LastName) AND
  E.DepartmentID = COALESCE(@DepartmentID, E.DepartmentID)

EDIT: A far better approach would be parameterized queries. Here is a blog post from one of the world's foremost authorities in this domain, Frans Bouma from LLBLGen Pro fame:

Stored Procedures vs. Dynamic Queries

Pittsburgh DBA
In the previous answer, you said that these don't work that well, do you think I just bite the bullet, do plenty of input sanitizing and create an AdHoc query, or create more specialized procedures for all the different options?
Dillie-O
Well, this is one reason why ORM (Object Relational Model) systems are coming into vogue really strongly. Most of them use dynamic query generation, but they use parameterization so that you don't get injection issues. Look at parameterized queries. That might serve you better in this case.
Pittsburgh DBA
+2  A: 

Erland Sommarskog's article Dynamic Search Conditions in T-SQL is a good reference on how to do this. Erland presents a number of strategies on how to do this without using dynamic SQL (just plain IF blocks, OR, COALESCE, etc) and even lists out the performance characteristics of each technique.

In case you have to bite the bullet and go through the Dynamic SQL path, you should also read Erland's Curse and Blessings of Dynamic SQL where he gives out some tips on how to properly write dynamic SQLs

jop
Great references! I always enjoy Erland Sommarskog's work.
Pittsburgh DBA
+4  A: 

While the COALESCE trick is neat, my preferred method is:

CREATE PROCEDURE ps_Customers_SELECT_NameCityCountry
    @Cus_Name varchar(30) = NULL
    ,@Cus_City varchar(30) = NULL
    ,@Cus_Country varchar(30) = NULL
    ,@Dept_ID int = NULL
    ,@Dept_ID_partial varchar(10) = NULL
AS
SELECT Cus_Name
       ,Cus_City
       ,Cus_Country
       ,Dept_ID
FROM Customers
WHERE (@Cus_Name IS NULL OR Cus_Name LIKE '%' + @Cus_Name + '%')
      AND (@Cus_City IS NULL OR Cus_City LIKE '%' + @Cus_City + '%')
      AND (@Cus_Country IS NULL OR Cus_Country LIKE '%' + @Cus_Country + '%')
      AND (@Dept_ID IS NULL OR Dept_ID = @DeptID)
      AND (@Dept_ID_partial IS NULL OR CONVERT(varchar, Dept_ID) LIKE '%' + @Dept_ID_partial + '%')

These kind of SPs can easily be code generated (and re-generated for table-changes).

You have a few options for handling numbers - depending if you want exact semantics or search semantics.

Cade Roux
How would this work for my Department Id field? Can I use '%' for an int or do I specify a different syntax?
Dillie-O
You have a few options for handling numbers - depending if you want exact semantics or search semantics.
Cade Roux
I just saw the update to the procedure. Thanks!!!
Dillie-O
Man, that is exactly what I was looking for. Thanks!
Electrons_Ahoy
A: 

I would stick with the NULL/COALESCE method over AdHoc Queries, and then test to make sure you don't have performance problems.

If it turns out that you have slow running queries because it's doing a table scan when you're searching on columns that are indexed, you could always supplement the generic search stored procedure with additional specific ones that allow searching on these indexed fields. For instance, you could have a special SP that does searches by CustomerID, or Last/First Name.

Aheho
I have worked with CRM applications where the search screen has 30 fields. Permutations work against you there. ORMs shine in this area.
Pittsburgh DBA
I'm not suggesting that you create a different SP for each permutation. (That would be 2^30 stored procedures). Wouldn't you agree that it makes sense to have a special case SP in the event that a user searches by CustomerID (Which I assume is unique)?
Aheho
Again, I'm only suggesting this if he find that the COALESCE method produces sub-optimal queryplans if CustomerID is present
Aheho
+2  A: 

Using the COALESCE method has a problem in that if your column has a NULL value, passing in a NULL search condition (meaning ignore the search condition) will not return the row in many databases.

For example, try the following code on SQL Server 2000:

CREATE TABLE dbo.Test_Coalesce (
    my_id INT NOT NULL IDENTITY,
    my_string VARCHAR(20) NULL )
GO
INSERT INTO dbo.Test_Coalesce (my_string) VALUES (NULL)
INSERT INTO dbo.Test_Coalesce (my_string) VALUES ('t')
INSERT INTO dbo.Test_Coalesce (my_string) VALUES ('x')
INSERT INTO dbo.Test_Coalesce (my_string) VALUES (NULL)
GO
DECLARE @my_string  VARCHAR(20)
SET @my_string = NULL
SELECT * FROM dbo.Test_Coalesce WHERE my_string = COALESCE(@my_string, my_string)
GO

You will only get back two rows because in the rows where the column my_string is NULL you are effective getting:

my_string = COALESCE(@my_string, my_string) =>
my_string = COALESCE(NULL, my_string) =>
my_string = my_string =>
NULL = NULL

But of course, NULL does not equal NULL.

I try to stick with:

SELECT
     my_id,
     my_string
FROM
     dbo.Test_Coalesce
WHERE
     (@my_string IS NULL OR my_string = @my_string)

Of course, you can adjust that to use wild cards or whatever else you want to do.

Tom H.
Nice clarification.
Pittsburgh DBA