views:

77

answers:

3

In terms of performance, how does the like operator behaves when applied to strings with multiple % placeholders?

for example:

select A from table_A where A like 'A%' 

takes the same time to select than

select A from table_A where A like 'A%%' 

???

+1  A: 

It will treat them same. If there is an index on column A, it will use that index just as it would with a single wildcard. However, if you were to add a leading wildcard, that would force a table scan regardless of whether an index existed or not.

Thomas
+5  A: 

Your queries:

select A from table_A where A like 'A%' 

and

select A from table_A where A like 'A%%'
                                      ^ optimizer will remove second redundant %

are equivalent, the optimizer will remove the second % in the second query

just like it would remove the 1=1 from:

select A from table_A where A like 'A%%' and 1=1

However, this query is very different:

select A from table_A where A like '%A%' 

The when using 'A%' it will use the index to find everything starting with an A, like a person using a phone book would quickly look for the start of a name. However when using '%A%' it will scan the entire table looking for anything containing an A, thus slower and no index usage. Like if you had to find every name in the phone book that contained an A, that would take a while!

KM
Thanks for the answer. That was precisely what I was trying to understand. If the optimizer would get rid of the duplicated %. Could you please share any link for reading further details on the optimizer?
Jazz
check out this site: http://www.sommarskog.se/index.html. the Dynamic Search Conditions article will go into a little on how the optimizer works
KM
Great link! Thank you very much!
Jazz
A: 

For the most part the pattern that you're using will not affect the performance of the query. The key to the performance for this is the appropriate use of indexes. In your example, an index on the column will work well because it will seek values that start with 'A', then match the full pattern. There may be some more-challenging patterns around, but the performance difference is negligible between them.

There is one important condition where the wildcard character will hurt performance. And, that is when it is at the beginning of of the pattern. For, example, '%A' will gain no benefit from an index, because it indicates you want to match on any value that starts with any valid character. All rows must be evaluated to meet this criteria.

bobs
and causes and index scan, rather than an index seek.
DForck42