views:

116

answers:

3

I've been reading around and found that using LIKE causes a big slowdown in queries.

A workmate recommended we use

Select Name
From mytable
a.Name IN (SELECT Name 
           FROM mytable
           WHERE Name LIKE '%' + ISNULL(@Name, N'') + '%' 
           GROUP BY Name)

in lieu of

Select Name
From mytable
a.Name LIKE '%' + ISNULL(@Name, N'') + '%'

Now I'm no SQL expert and I don't really understand the inner workings of these statements. Is this a better option worth the effort of typing a few extra characters with each like statement? Is there an even better (and easier to type) alternative?

+3  A: 

Unless IIQR is some smaller table that indexes the names somehow (and is not the original table being queried here from the start), I don't see how that longer version helps at all; it's doing the exact same thing, but just adding in an extra step of creating a set of results which is when used in an IN.

But I'd be dubious even if IIQR is a smaller 'index' table. I'd want to see more about the database in question and what the query plan ends up being for each.

LIKE can have a negative effect on query performance because it often requires a table scan - physically loading each record's relevant field and searching for the text in question. Even if the field is indexed, this is likely the case. But there may be no way around it, if what you need to do is search for partial text at any possible location inside a field.

Depending on the size of the table in question, though; it may really not matter at all.

For you, though; I would suggest that keeping it simple is best. Unless you really do know what the whole effect of complicating a query would be on performance, it can be hard to try to decide which way to do things.

Andrew Barber
+7  A: 

Simply compare the execution plans and you should see the difference.

I don't have your exact data, but I ran the following queries against a SQL Server 2005 database of mine (yes, it's nerdy):

SELECT     UnitName
FROM         Units
WHERE     (UnitName LIKE '%Space Marine%')

SELECT     UnitName
FROM         Units
WHERE     UnitName IN (
   (SELECT UnitName FROM Units 
   WHERE UnitName LIKE '%Space Marine%' GROUP BY UnitName)
)

Here were my execution plan results:

alt text

Your co-worker's suggestion adds a nested loop and a second clustered index scan to my query as you can see above. Your mileage may vary, but definitely check the execution plans to see how they compare. I can't imagine how it would be more efficient.

LittleBobbyTables
+1 for the nerdy db. XD Used for a game perhaps?
Jonn
@Jonn -- if I ever finish it, it'll be used for a game. At the rate I'm going, not likely :)
LittleBobbyTables
Hehe.. I think I know what you mean. Hope you finish it sometime.
Jonn
+6  A: 

There are a couple of performance issues to address...

Don't Access the Same Table More Than Once, If Possible

Don't use a subquery for criteria that can be done without the need for referencing additional copies of the same table. It's acceptable if you need data from a copy of the table due to using aggregate functions (MAX, MIN, etc), though analytic functions (ROW_NUMBER, RANK, etc) might be more accommodating (assuming supported).

Don't Compare What You Don't Need To

If your parameter is NULL, and that means that you want any value for the columns you are comparing against, don't include filtration criteria. Statements like these:

WHERE a.Name LIKE '%' + ISNULL(@Name, N'') + '%'

...guarantee the optimizer will have to compare values for the name column, wildcarding or not. Worse still in the case with LIKE is that wildcarding the left side of the evaluation ensures that an index can't be used if one is present on the column being searched.

A better performing approach would be:

IF @Name IS NOT NULL 
BEGIN
   SELECT ...
     FROM ...
    WHERE a.name LIKE '%' + @Name + '%'
END
ELSE 
BEGIN
   SELECT ...
     FROM ...
END

Well performing SQL is all about tailoring to exactly what you need. Which is why you should be considering dynamic SQL when you have queries with two or more independent criteria.

Use The Right Tool

The LIKE operator isn't very efficient at searching text when you're checking for the existence of a string within text data. Full Text Search (FTS) technology was designed to address the shortcomings:

IF @Name IS NOT NULL
BEGIN
   SELECT ...
     FROM ...
    WHERE CONTAINS(a.name, @Name) 
END
ELSE
BEGIN
   SELECT ...
     FROM ...
END

Always Test & Compare

I agree with LittleBobbyTables - the solution ultimately relies on checking the query/execution plan for all the alternatives because table design & data can impact optimizer decision & performance. In SQL Server, the one with the lowest subtreecost is the most efficient, but it can change over time if the table statistics and indexes aren't maintained.

OMG Ponies
There's a CONTAINS function??? Does TSQL have that? 0_o
Jonn