We have a query that runs off a fairly large table and unfortunately needs to use LIKE '%ABC%' on a couple of varchar fields so users can search on partial names, etc. This is on SQL Server 2005.

Would adding an index on these varchar fields help any in terms of select query performance when using LIKE or does it basically ignore the indexes and do a full scan in those cases?

Any other possible ways to improve performance when using LIKE?

+6  A: 

Only if you add full-text searching to those columns, and use the full-text query capabilities of SQL Server.

Otherwise, no, an index will not help.
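A minimal sketch of what that looks like - the table (dbo.Customers), key index (PK_Customers), and catalog name here are all hypothetical:

create fulltext catalog SearchCatalog
go
create fulltext index on dbo.Customers (Name)
    key index PK_Customers
    on SearchCatalog
go
--word/prefix search; note CONTAINS matches whole words or word prefixes,
--not arbitrary substrings, so it is not an exact replacement for '%ABC%'
select CustomerId, Name
from dbo.Customers
where contains(Name, '"ABC*"')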

Lasse V. Karlsen
Thanks, that is what I thought, unfortunately. I've removed some of the LIKE clauses to help speed things up a bit.
schooner
+2  A: 

LIKE '%ABC%' with a leading wildcard can never use an index seek, so it will always perform a full scan. There is no way around that.

You do have a couple of alternative approaches. First, full-text searching: it's really designed for this sort of problem, so I'd look at that first.

Alternatively, in some circumstances it might be appropriate to denormalize the data and pre-process the target fields into appropriate tokens, then add these possible search terms into a separate one-to-many search table. For example, if my data always consisted of a field containing the pattern 'AAA/BBB/CCC' and my users were searching on 'BBB', then I'd tokenize that out at insert/update time (and remove the tokens on delete). This is also one of those cases where using triggers, rather than application code, is much preferred.
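A rough sketch of the idea, with invented names (dbo.Items is assumed to have an ItemId key and a Path column holding 'AAA/BBB/CCC' strings):

create table dbo.ItemSearchTokens (
    ItemId int not null,
    Token  varchar(50) not null,
    primary key (ItemId, Token)
)
go
create trigger trg_Items_Tokenize on dbo.Items
after insert
as
begin
    set nocount on;
    --recursive CTE splits each inserted Path on '/'
    with Split as (
        select i.ItemId,
               cast(1 as int) as StartPos,
               charindex('/', i.Path + '/') as EndPos,
               i.Path
        from inserted i
        union all
        select ItemId,
               EndPos + 1,
               charindex('/', Path + '/', EndPos + 1),
               Path
        from Split
        where EndPos < len(Path)
    )
    insert dbo.ItemSearchTokens (ItemId, Token)
    select ItemId, substring(Path, StartPos, EndPos - StartPos)
    from Split;
end
go
--a search for 'BBB' is now an index seek instead of a LIKE scan:
select i.*
from dbo.Items i
join dbo.ItemSearchTokens t on t.ItemId = i.ItemId
where t.Token = 'BBB'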

I must emphasize that this is not really an optimal technique and should only be used if the data is a good match for the approach, if for some reason you do not want to use full-text search, and if the database performance on the LIKE scan really is unacceptable. It's also likely to produce maintenance headaches further down the line.

Cruachan
A: 

The only other way (other than using full-text indexing) you could improve performance is to use LIKE 'ABC%' - don't add a wildcard at both ends of your search term. In that case, an index on the column can be used for a seek.

If your requirements are such that you have to have wildcards on both ends of your search term, you're out of luck...
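For illustration (table and index names made up) - with only a trailing wildcard, the optimizer can turn the predicate into an index seek over a range:

create index ix_customers_name on dbo.Customers (Name)
go
--seekable: effectively Name >= 'ABC' and Name < 'ABD'
select Name
from dbo.Customers
where Name like 'ABC%'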

Marc

marc_s
A: 

Create statistics on that column. SQL Server 2005 has optimized in-string searches, so you might benefit from that.
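For example (object names assumed) - the statistics improve the optimizer's row-count estimates for the LIKE predicate rather than making the scan itself faster:

create statistics st_customers_name on dbo.Customers (Name)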

Mladen Prajdic
+1  A: 

You can potentially see performance improvements by adding index(es), it depends a lot on the specifics :)

How much of the total row size do your predicated columns account for? How many rows do you expect to match? Do you need to return all rows that match the predicate, or just the top 1 or top n rows?

If you are searching for values with high selectivity/uniqueness (so few rows to return), and the predicated columns are a smallish portion of the entire row size, an index could be quite useful. It will still be a scan, but your index will fit more rows per page than the source table.

Here is an example where the total row size is much greater than the column size to search across:

create table t1 (v1 varchar(100), b1 varbinary(8000))
go
--add 10k rows of filler
insert t1 values ('abc123def', cast(replicate('a', 8000) as varbinary(8000)))
go 10000
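--(note: "go 10000" repeats the preceding batch 10,000 times in SSMS/sqlcmd)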
--add 1 row to find
insert t1 values ('abc456def', cast(replicate('a', 8000) as varbinary(8000)))
go

set statistics io on 
go
select * from t1 where v1 like '%456%'
--shows 10001 logical reads

--create index that only contains the column(s) to search across
create index t1i1 on t1(v1)
go
select * from t1 where v1 like '%456%'
--shows 37 logical reads
--(or force the index with a hint; see below)

If you look at the actual execution plan you can see the engine scanned the index and did a bookmark lookup on the matching row. Or you can tell the optimizer directly to use the index, if it hadn't decided to use this plan on its own: select * from t1 with (index(t1i1)) where v1 like '%456%'

If you have a bunch of columns to search across but only a few that are highly selective, you could create multiple indexes and use a reduction approach, as sketched below. E.g. first determine a set of IDs (or whatever your PK is) from your highly selective index, then search your less selective columns with a filter against that small set of PKs.
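A sketch of that reduction approach, with hypothetical table and column names:

--step 1: collect candidate keys via the narrow index on the selective column
select Id
into #candidates
from dbo.People
where LastName like '%son%'  --assumed to match few rows

--step 2: apply the less selective predicates only to the candidate set
select p.*
from dbo.People p
join #candidates c on c.Id = p.Id
where p.Notes like '%overdue%'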

If you always need to return a large set of rows you would almost certainly be better off with a table scan.

So the possible optimizations depend a lot on the specifics of your table definition and the selectivity of your data.

HTH! -Adrian

Also consider INCLUDE indexes if you're returning a subset of the columns in a wide table.
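For example (made-up names) - if the query only returns the indexed and included columns, it can be answered from the index alone, skipping the bookmark lookups:

create index ix_orders_customer on dbo.Orders (CustomerName)
    include (OrderDate, Total)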
Rob Garrison