views:

352

answers:

6

I have a column in a non-partitioned Oracle table defined as VARCHAR2(50); the column has a standard b-tree index. I was wondering if there is an optimal way to query this column to determine whether it contains a given value. Here is the current query:

SELECT * FROM my_table m WHERE m.my_column LIKE '%'||v_value||'%';

I looked at Oracle Text, but that seems like overkill for such a small column. However, there are millions of records in this table so looking for substring matches is taking more time than I'd like. Is there a better way?

+2  A: 

No.

That query is a table scan. If v_value is an actual word, then you may very well want to look at Oracle Text or a simple inverted index scheme you roll your on your own. But as is, it's horrible.

Will Hartung
Technically, the query will involve a *FULL* scan, but not necessarily of the table; it could do an index fast full scan, if a suitable index exists. In which case the index scan may be quicker than a FTS, especially if there are a lot of NULLs in the column.
Jeffrey Kemp
+1  A: 

You have three choices:

The simplest way to redefine the problem is to say the column has to start with the search term (so lose the first %), which will then use the index.

An alternative way is to say that the search starts on word boundaries (so "est" will match "estimate" but not "test"). MySQL (MyISAM) and SQL Server have functions that will do matching like this. Not sure if Oracle does. If it doesn't you could create a lookup table of words to search instead of the column itself and you could populate that table on a trigger.

cletus
+1  A: 

You could put a function-based index on the column, using the REGEXP_LIKE function. You might need to create the fbi with a case statement to return '1' with a match, as boolean returning functions dont seem to be valid in fbi.

Here is an example.

Create the index:

CREATE INDEX regexp_like_on_myCol ON my_table (
      CASE WHEN REGEXP_LIKE(my_column, '[static exp]', 'i') 
           THEN 1
           END);

And then to use it, instead of:

SELECT * FROM my_table m WHERE m.my_column LIKE '%'||v_value||'%';

you will need to perform a query like the following:

SELECT * FROM my_table m WHERE (
      CASE WHEN REGEXP_LIKE(m.my_column, '[static exp]', 'i')
           THEN 1
           END) IS NOT NULL;

A significant shortcomming in this approach is that you will need to know your '[static exp]' at the time that you create your index. If you are looking for a performance increase while performing ad hoc queries, this might not be the solution for you.

A bonus though, as the function name indicates, is that you have the opportunity to create this index using regex, which could be a powerful tool in the end. The evaluation hit will be taken when items are added to the table, not during the search.

akf
+1  A: 

You could try INSTR:

...WHERE INSTR(m.my_column, v_value) > 0

I don't have access to Oracle to test & find out if it is faster than LIKE with wildcarding.

OMG Ponies
+2  A: 

Oracle Text covers a number of different approaches, not all of them heavyweight. As your column is quite small you could index it with a CTXCAT index.

SELECT * FROM my_table m 
WHERE catsearch(m.my_column, v_value, null) > 0
/

Unlike the other type of Text index, CTXCAT indexes are transactional, so they do not require synchronisation. Such indexes consume a lot of space, but that you have to pay some price for improved performance.

Find out more.

APC
+1  A: 

For the most generic case where you do not know in advance the string you are searching for then the best access path you can hope for is a fast full index scan. You'd have to focus on keeping the index as small as possible, which might have it's own problems of course, and could look at a compressed index if the data is not very high cardinality.

David Aldridge