More and more, I'm seeing searches that don't just find a substring in a specific column, but appear to search across all columns. An example is Amazon, where you can search for "Arnold" and it finds both the movie The Running Man, starring Arnold Schwarzenegger, and the Gund toy Arnold the Snoring Pig. I don't know what the term is for this type of search (Wide search? Global search?), and that bugs me. But what I really want to know is: what is the normal pattern for accomplishing this type of search in a QUICK way?

The obvious, and slow, way to do it would be to search for the substring "Arnold" in the title, "Arnold" in the author, "Arnold" in the description, etc.

The first quick solution that comes to mind is to store a mapping from each word used to describe a product to the product itself, and then search that word mapping. That could be quick, but doesn't seem very space-efficient to me.
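
To make the word-mapping idea concrete, here is a minimal sketch of what is usually called an inverted index. The sample catalog, the field names, and the helpers `tokenize`, `build_index`, and `search` are all made up for illustration. Note that it matches whole words only, not arbitrary substrings.

```python
import re
from collections import defaultdict

# Hypothetical sample catalog; the IDs and field names are illustrative only.
PRODUCTS = {
    1: {"title": "The Running Man", "actor": "Arnold Schwarzenegger"},
    2: {"title": "Arnold the Snoring Pig", "brand": "Gund"},
    3: {"title": "Gardening Basics", "author": "Jane Doe"},
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(products):
    """Map each word to the set of product IDs whose fields contain it."""
    index = defaultdict(set)
    for product_id, fields in products.items():
        for value in fields.values():
            for word in tokenize(value):
                index[word].add(product_id)
    return index


def search(index, query):
    """Return IDs of products that contain every word of the query."""
    words = tokenize(query)
    if not words:
        return set()
    # Copy the first posting set so the index itself is never mutated.
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


INDEX = build_index(PRODUCTS)
```

With this sample data, `search(INDEX, "Arnold")` finds both product 1 (via the actor field) and product 2 (via the title), using one dictionary lookup per query word instead of a scan of every column. The space cost is one entry per distinct word plus one ID per word-product pair.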

There are probably a hundred ways to accomplish this, some of which probably don't even use a database. But what is the norm?

+1  A: 

I've done this in the past by storing an XML version of items in an XML column in the table, then searching in that column instead of the others.

David Moye
That's one idea. What kind of index would you use on that column to avoid a full index scan when searching for LIKE '%Arnold%'?
Mike M. Lin
In general, I would enable full-text indexing on that table, then add that column to the full-text index and search that way.
David Moye
+1: I like this idea. So "full-text search" is the proper term for this, or "Oracle Text" in the Oracle world.
Mike M. Lin
+1  A: 

Maybe they're not storing the data the way you expect.

They could, for example, store all titles, authors, descriptions, and every other searchable field in one table with an attribute to distinguish the field's type.
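
A minimal sketch of that layout, using Python's built-in `sqlite3` module. The table name `searchable_text`, its columns, and the sample rows are assumptions made up for illustration, not a description of how any real site stores its data.

```python
import sqlite3

# Hypothetical schema: one row per searchable value, tagged with its field type.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE searchable_text (
        product_id INTEGER NOT NULL,
        field_type TEXT NOT NULL,   -- e.g. 'title', 'actor', 'brand'
        value      TEXT NOT NULL
    )
    """
)
conn.executemany(
    "INSERT INTO searchable_text VALUES (?, ?, ?)",
    [
        (1, "title", "The Running Man"),
        (1, "actor", "Arnold Schwarzenegger"),
        (2, "title", "Arnold the Snoring Pig"),
        (2, "brand", "Gund"),
    ],
)


def search_all_fields(conn, term):
    """Return (product_id, field_type) pairs whose value contains the term."""
    cursor = conn.execute(
        "SELECT DISTINCT product_id, field_type FROM searchable_text "
        "WHERE value LIKE ? ORDER BY product_id, field_type",
        (f"%{term}%",),
    )
    return cursor.fetchall()
```

Here `search_all_fields(conn, "Arnold")` covers every field with a single query and also reports which field matched. A `LIKE` pattern with a leading wildcard still cannot use an ordinary B-tree index, so this layout simplifies the query rather than speeding it up on its own.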

Beth
That's the most likely approach, in my opinion, and similar to the "first quick solution that comes to mind" in the question. Can you say for a fact that some major implementations are done this way? If so, please point us to one.
Mike M. Lin