tags:

views:

2263

answers:

6

Would

WHERE substr(my_field,1,6) = 'search'

or

WHERE my_field LIKE 'search%'

be faster in Oracle, or would there be no difference?

+4  A: 

Of the two options provided, definitely LIKE. The substring method will have to be executed against all rows in the table. Using LIKE will allow the use of indexes.

To check my answer, just profile the results. It should be clear as day.

beach
LIKE is possibly faster, not definitely. Even if there is an index, it may or may not be used depending on the selectivity.
AlexKuznetsov
@Alex - True. But in all cases it is better strive for the best and allow the server the option to use indexes if available. Using the substring method guarantees that no indexes will be used.
beach
Agreed. But you can slap a function-based index on SUBSTR(my_field,1,6) that would be more likely to be used than a regular index on my_field for the LIKE.
Justin Cave
That would work for 'search', but what if you wanted 'find'? The function-based index on SUBSTR(my_field,1,6) wouldn't work. Would you then propose to add another index on SUBSTR(my_field,1,4)? That seems like a waste of space to me. But I suppose it all comes down to appdomain and how it is going to be used. Personally, I would play it safe and use the LIKE. But I wouldn't shy away from the specific SUBSTR() index if it vastly improved performance. Profiling and testing is really the recommended option. And then add indexes as required.
beach
true for the example given, but be aware... if the like argument has a a wildcard at the beginning, ( ... Like '%Bob') then a table scan is still going to happen.
Charles Bretana
+2  A: 

If you have an index on my_field, then LIKE may be faster. Do your own benchmarks.

AlexKuznetsov
A: 

I would profile both. But I would guess the 'LIKE' would be much faster, because it uses the binary search on the index (if the field is indexed). If you use the SUBSTR method, you will end up with a full table scan, as Oracle has to process row by row the function.

Manuel Ferreria
+4  A: 

Assuming maximum performance is the goal, I would ideally choose SUBSTR(my_field,1,6) and create a function-based index to support the query.

CREATE INDEX my_substr_idx
    ON my_table( substr( my_field,1,6 ) );

As others point out, SUBSTR(my_field,1,6) would not be able to use a regular index on MY_FIELD. The LIKE version might use the index, but the optimizer's cardinality estimates in that case are generally rather poor so it is quite likely to either not use an index when it would be helpful or to use an index when a table scan would be preferable. Indexing the actual expression will give the optimizer far more information to work with so it is much more likely to pick the index correctly. Someone smarter than I am may be able to suggest a way to use statistics on virtual columns in 11g to give the optimizer better information for the LIKE query.

If 6 is a variable (i.e. you sometimes want to search the first 6 characters and sometimes want to search a different number), you probably won't be able to come up with a function-based index to support that query. In that case, you're probably better off with the vagaries of the optimizer's decisions with the LIKE formulation.

Justin Cave
A: 

There's really two issues here:

  1. For which one will Oracle produce the more accurate cardinality and cost estimate?
  2. Which method is more flexible in terms of potential access methods?

This may vary by version, but both are pretty easy to test and that way you're sure that you have the best information for your version and your data.

Run execution plans for both queries using ...

explain plan for 
select ... from ... where my_field LIKE 'search%';

select * from table(dbms_xplan.display);

and

explain plan for 
select ... from ... where substr(my_field,1,6) = 'search';

select * from table(dbms_xplan.display);

You may see a difference in the execution plan, depending on the presence of indexes etc., but also compare the cardinality estimates with the actual result that you get from:

select count(*) from ... where my_field LIKE 'search%';

One of the two methods may be significantly more accurate than the other.

If neither of them is very accurate and this query is expected to run for a non-trivial amount of time then consider using dynamic sampling to improve the estimate, because with the wrong cardinality estimate the optimizer may chose a suboptimal access method anyway.

explain plan for 
select /*+ dynamic_sampling(4) */ ... from ... where substr(my_field,1,6) = 'search';

select * from table(dbms_xplan.display);

As far as index usage goes, both methods could use an index-based access method. The LIKE predicate is probably more index friendly and could use a range scan or a fast full index scan. The SUBSTR method can certainly use the fast full index scan, but whether the optimizer will consider a range scan is best tested on your own version -- my recollection is that it won't but who's to say that substr(my_column,1,n) won't be recognised as a special case, if not now then in the future?

David Aldridge
A: 

If you have no index than there is no difference. Because oracle is doing a full table scan and evaluates the expression for each row. You can put an index on the column to speed up both queries.

CREATE INDEX my_like_idx
ON my_table( my_field );

This index is more flexible and speeds up the query using like. It will work for any compare starting with characters and having placeholder (%) at the end. Oracle is doing a index range scan to find all matching rows.

CREATE INDEX my_substr_idx
ON my_table( substr( my_field,1,6 ) );

This index speeds up the query with substr. But the index is very special to compare only the first 6 characters.

If you query for a piece of starting in the middle. Creating a function based index will help.

WHERE substr(my_field,2,5) = 'earch'
WHERE my_field like '%earch%'
Christian13467