views:

38

answers:

2

Here's the situation, hugely grateful for any input anyone may have:

I have a table of about 100m rows, with (among several other fields), a text field that we can call 'product'. The product field is about 200 chars long, and contains details of, yes, products.

I want to give users the ability to search through these items by entering a search string. In the simplest case, which may be enough, it would be as if I were executing "and where product like '%searchString%'" in my sql.

The good news is that although there's 100m rows, any given user is only interested in their own rows, of which there are far fewer, with the largest user maybe having 2m rows in the table.

There will be other parts to their queries, for example "where price > 100 and product like '%searchString%'".

Is doing 'like' my best bet, or are there some good tools either in Oracle (the db I'm using) or perhaps some external indexing tool. I've seen mention of Oracle Text, but don't know much about it.

The problem is a little harder than just doing a simple indexing of all the keywords appearing in the product name, because they may also search on, say, part of a SKU. So the product name may include the SKU DFR45G6TY and they just want to put in 'DFR' to pick it out and not have to enter the full SKU string. (If I have to miss out this functionality, it may still be OK..)

Can anyone point me in the right direction for how I should tackle this? Resources, tips, ideas, products, anything gratefully received!

Thanks all!

+1  A: 

I wouldn't discount Oracle Text - read up on the various configuration options, SUBSTRING_INDEX looks promising - this page should prove useful.

Will A
Thanks. This link looks good.
Bruce
+1  A: 

Oracle Text does feel like the right way to go, although for only 200 bytes in one table it may be overkill.

If the most likely thing is for them to enter the left-hand side of the product name, then a plain indexed on the field should be fine. It will still bring back all DFR% products, which may be tens of thousands, but there is nothing you can do to control that (except perhaps warning the user if they enter a 'short' search it may take some time).

If the SKU isn't at the start of the product string, then you could create an index on the SKU rather than whole product.

SUBSTR(product_name,<offset to start of SKU string>

On the other hand, if you need to handle a range of arbitrary searches, either you are going to have to dernormalise all the components of your product name string off into a another table or use Oracle Text.

JulesLt
Thanks for the great answer!
Bruce