views:

1702

answers:

5

I have a search query that I'm inheriting and attempting to optimize. I am curious to hear if anyone has any best practices and recommendations for such. The production server is still SQL Server 2000 also.

The query is an advanced customer search stored procedure that accepts 5 different search criteria parameters (i.e. first name, last name, address, phone, etc.) to search a multi-million record table. There are indexes on all joined columns and columns in the WHERE clause. In addition, the initial query dumps the records into a table variable for paging capacity.

INSERT INTO   @tempCustTable (CustomerID, FirstName, LastName, City, StateProvince, Zip, PhoneNumber)
SELECT  DISTINCT cu.CustomerID, cu.FirstName, cu.LastName, a.City,
a.StateProvince, a.Zip, p.PhoneNumber
FROM Customer cu WITH(NOLOCK)
LEFT OUTER JOIN Address a WITH(NOLOCK) ON cu.CustomerID = a.CustomerID
LEFT OUTER JOIN Phone p WITH(NOLOCK) ON cu.CustomerID = p.CustomerID
WHERE  (cu.LastName = @LastName OR cu.LastName LIKE @LastName + '%') 
AND (@FirstName IS NULL OR cu.FirstName = @FirstName OR cu.FirstName LIKE @FirstName + '%')
AND (@StateProvince = '' OR a.StateProvince LIKE @StateProvince)
AND (@City = '' OR a.City LIKE @City + '%')
AND (@Zip = '' OR a.Zip = @Zip OR a.Zip LIKE @Zip + '%')
ORDER BY cu.LastName, cu.FirstName

Does anyone have any recommendations on how I could improve the performance of the query?

A: 
  • Avoid "OR"s - they in general prevent the use of indexes
  • Never put a "%" on the left side. - same reason.
Otávio Décio
Yes, it may limit your searching but LIKE '% murders performance, and if this is a problem, time to upgrade your server, or at least the I/O to a fiber sans storage.
TravisO
JamesEggers
+2  A: 

isn't this whole line

AND (@Zip = '' OR a.Zip = @Zip OR a.Zip LIKE @Zip + '%')

the same as this

AND (a.Zip LIKE @Zip + '%')

for sure

AND (a.Zip LIKE @Zip + '%')

it is the same as

a.Zip = @Zip OR a.Zip LIKE @Zip + '%'
SQLMenace
Good question. I need to look up SQL OR again to see if it evaluates both sides regardless of return of the left side.
JamesEggers
There is one difference actually this @Zip = '' will also return NULL values in a.Zip is that what you want?
SQLMenace
The @City/@State/@Zip IS NULL or empty logic is addressed and changed prior to the query in the question so @Zip=''is valid.
JamesEggers
A: 

I would try to not have my sql code add the '%' but instead expect the parameter to already have it, this of course, after you have validated it in your application! Then don't include '=' comparisons, use LIKE all the time:

WHERE (cu.LastName LIKE @LastName)

instead of:

WHERE (cu.LastName = @LastName OR cu.LastName LIKE @LastName + '%')

Ricardo Villamil
Why have the LastName = @LastName at all? If LastName = @LastName then it will also evaluate to true for LastName LIKE LastName + '%'
Tom H.
A: 

You could build up the query with dynamic sql. That would get rid of most of your ORs and would also mean you would only need to include in the WHERE statement lines for the parameters the user did actually enter.

If you do this, be sure to use sp_executesql rather than exec so you can parameterize the dynamic sql so the query plan can be cached.

Ian1971
A: 

You can definitely clean up a lot of the redundancy in your code as SQLMenace pointed out as a start.

Another thing is, ORDER BY shouldn't be used with an INSERT..SELECT. ORDER BY is meaningless in this context. People occasionally use it to force an IDENTITY column to behave a certain way, but that's a bad habit IMO.

I don't know if this will help in your situation, but one thing that I came across recently was that in stored procedures SQL Server (I'm using 2005, but probably true for 2000 as well) will not short-circuit an OR condition in many cases. For example, when you use:

@my_parameter IS NULL OR my_column = @my_parameter

it will still evaluate the second half even if you pass in a NULL value for @my_parameter. This happened even when I set the stored procedure to recompile (and the SELECT). The trick was to force a short-circuit through the use of a CASE statement. Using that trick (and removing some redundancy) your statement would look like this:

INSERT INTO @tempCustTable
(
     CustomerID,
     FirstName,
     LastName,
     City,
     StateProvince,
     Zip,
     PhoneNumber
)
SELECT DISTINCT
     cu.CustomerID,
     cu.FirstName,
     cu.LastName,
     a.City,
     a.StateProvince,
     a.Zip,
     p.PhoneNumber
FROM Customer cu WITH(NOLOCK)
LEFT OUTER JOIN Address a WITH(NOLOCK) ON cu.CustomerID = a.CustomerID
LEFT OUTER JOIN Phone p WITH(NOLOCK) ON cu.CustomerID = p.CustomerID
WHERE
     (cu.LastName LIKE @LastName + '%') AND
     (1 =
          CASE
               WHEN @FirstName IS NULL THEN 1
               WHEN cu.FirstName LIKE @FirstName + '%' THEN 1
               ELSE 0
          END
     ) AND
     (1 =
          CASE
               WHEN @StateProvince = '' THEN 1
               WHEN a.StateProvince = @StateProvince THEN 1
               ELSE 0
          END
     ) AND
     (1 = CASE
               WHEN @City = '' THEN 1
               WHEN a.City LIKE @City + '%' THEN 1
               ELSE 0
          END
     ) AND
     (1 = CASE
               WHEN @Zip = '' THEN 1
               WHEN a.Zip LIKE @Zip + '%' THEN 1
               ELSE 0
          END
     )

It makes the query longer, and possibly a little more complex, but it may be worth it for better performance. This is particularly true if your criteria includes a subquery that could otherwise be short-circuited.

Finally... be consistent with your parameters. For @FirstName you check for a NULL value to determine if it's used or not, but for the others you are checking for empty strings. Basic coding 101 here that you need to be careful about.

Tom H.