views:

108

answers:

3

I've looked into SQL 2008's built-in Full-Text search, and also Lucene.NET.. but I don't think they'll do what I need to do. And I just want to make sure I'm building my program as efficient as possible.

So here's the dream. I want to have a single textbox on a page (like google) and allow the user to enter ANYTHING in. And based on their text, I will search 10's of tables to find what they're looking for.

Example. My database contains thousands of locations, each of which have multiple names / codes. Within each location, there is tonnes of data associated with them.

So if the user wants to display all the locations with the codes that contain "VM" ("CD-VM01", "CD-VM02", "CD-VM03", etc).. they should be able to. Or if they want to find all the locations in Toronto, they just type Toronto.. I want to make the search as easy as possible for people. (I've found that people don't like thinking)..

Plus it ends up being easier to scale to more search options if I can just search the database, and not have to add new fields to a search screen.

So if I don't use Full Text search (which I can't for partial) the only thing I can see that i'm left with is "Like" .. is that right? is that my only option?

A: 

I guess the question is, even if you were able to do this in the database, how would you handle it in the UI?

Most likely every search result from a different table will have different attributes that need to be displayed in order for the end user to understand what it is.

The Google search box only needs to search one thing - the content of web pages - and return one type of result - web page URLs and excerpts. Fundamentally you are trying to search for many different things, and so you'll most likely need to handle each case separately.

Alternatively, you could maintain a denormalized search table that contains only the search text and the common attributes you think need to be displayed with each hit. Maintain it either with a scheduled task or with triggers. You'd be able to use FTS on this as well.

Update

Some of the comments express some uncertainty over what SQL Server Full-Text Search is capable of. FTS can most definitely search for a single string anywhere within the text of a column, and can do other things as well (proximity search, free-text search, etc.) If you're just getting started then I'd recommend the TechNet pages on the subject, the documentation is very comprehensive.

In particular I'd suggest having a look at the section on Configuring Catalogs and the Getting Started page (Cole's Notes: you have to create catalogs - writing CONTAINS queries without them won't get you very far). Then take a look at the querying page. I'd be very surprised if you can't find answers to any and all of your questions there.

If you still can't get it to work, I would post a new question with the specifics of your problem - what you've tried, what you're expecting, and what's happening instead.

Aaronaught
I see what you're saying, and after thinking about it I realize I'm wasting the user's time if I search for things they're not interested in example: the user types in "0008" which to them is a partial site code, but I wouldn't know the difference and I could find serial numbers, reference numbers, etcI figure I could have a dynamic list of options they can search for, and that's fine. But what I found is that FTS wouldn't help me with a partial search.if the user wants to show All the codes that contain "VM000" in codes like (CMVM0001, CMVM0002, CMVM0003) I want to let them do that
Dan
@Dan: I confess to not being sure exactly what you mean when you say that FTS can't do a partial search as per your example. Isn't that exactly what the `CONTAINS` and `FREETEXT` keywords do?
Aaronaught
Looking at this: http://en.wikipedia.org/wiki/SQL_Server_Full_Text_Search and from my research, FTS can only do "Prefix Term Searches" and not search in the middle of words.
Dan
@Dan: Prefix searches are an *additional* feature of FTS. The default behaviour is "contains." You only get a prefix search if you specifically ask for one. FTS can definitely find tokens in the middle of words (and much more), that is the main reason for using it!
Aaronaught
So you're saying if I wanted to find "OHIO" and I just typed "HI" with the command contains(ColomnName, 'HI') it should find all the words with "HI" in it? unless I'm doing something wrong, it's not working for me. Do you know of any sites that specifically talk about doing this using FTS?
Dan
@Dan: Indeed that is what I'm saying. I updated my answer with a few links that I hope will help you. My guess is that it's a catalog configuration issue, but I'd have to know what you mean by "not working."
Aaronaught
So I saw your update.. And I looked through those pages. And I did already create a Full Text Catalog. And I've done my research on the queries. Can you take a look at this link: http://www.eggheadcafe.com/software/aspnet/29627308/fts--how-to-search-by-pa.aspx and let me know if you agree or disagree with "Daniel"'s comment: " You cannot use FTS to search for word parts in the middle or at the end of words, only the start."
Dan
I also found this link: http://social.msdn.microsoft.com/Forums/en/sqlsearch/thread/f5f34d45-5880-449d-9626-b94360eaffd4 in which "Hilary Cotter" says: " there is no way to search inside words/numbers the way you wish other than by using a like operator." Are you saying they are both incorrect? and you know of a way to do this NOT using like?
Dan
@Dan: I apologize for my misunderstanding, I should have paid more attention to the specific examples - you're right, FTS cannot search for part of a single word, the word is the smallest level of granularity available. So you'll have to use Lucene or `LIKE` for that.
Aaronaught
A: 

I believe Lucene does exactly what you're looking for. You can add an index from any external data source (including multiple database tables), then query that index and you'll get back pointers to the matching records.

The drawback is that unlike with full-text indexing, you're responsible for building and maintaining the index yourself.

You can see an example of how Lucene.NET might be used.

Josh Kodroff
hmm.. Do you know of any specific sites that talk about this directly. It's hard to find the EXACT information I'm looking for. I'd hate to start working on it, just to find out it does the same thing as FTS.
Dan
even looking at this: http://lucene.apache.org/java/2_2_0/queryparsersyntax.html it says "You cannot use a * or ? symbol as the first character of a search" which would imply that if I wanted to find "CMVM0008" I couldn't search for "VM0008"
Dan
I was surprised that there weren't oodles of examples out there when I was searching for stuff. Added link to answer.
Josh Kodroff
A: 

It appears that the easiest / quickest solution for this exact problem would be to use LIKE.

Dan