I am about to build a simple search facility for my website. A user will enter around 2-4 keywords, which will be searched for in two columns of a table in my MS SQL database: a varchar(50) column called title and a varchar(2500) column called description. There could be about 20,000-30,000 records to search at any one time.

The keywords need to return "the best matches", the kind you get on search pages like eBay that return the closest matches. The way I was thinking of doing this seems kind of naive: I thought I could read all 30,000 records of the table into an object like this:

public class SearchableObject
{
    // Properties are public so the search code can read and populate them.
    public string Title { get; set; }
    public string Description { get; set; }
    public int MatchedWords { get; set; }
}

Then create a List of that object, e.g. List<SearchableObject>, go through all 30,000 records, populate the List, find the ones that match the most times and return the top 10 using something like

 if (Description.Contains(keyword1))

But I would then also need to find out how many times the keyword occurs in the string in order to populate the MatchedWords field.
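
A minimal sketch of that in-memory idea, to make the counting step concrete. The NaiveSearch class and its CountOccurrences and TopMatches methods are hypothetical names introduced here, and the matching is a plain case-insensitive substring count, not real relevance ranking:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class SearchableObject
{
    public string Title { get; set; }
    public string Description { get; set; }
    public int MatchedWords { get; set; }
}

// Hypothetical helper class, not part of any library.
public static class NaiveSearch
{
    // Counts non-overlapping, case-insensitive occurrences of keyword in text.
    public static int CountOccurrences(string text, string keyword)
    {
        if (string.IsNullOrEmpty(text) || string.IsNullOrEmpty(keyword)) return 0;

        int count = 0;
        int index = 0;
        while ((index = text.IndexOf(keyword, index, StringComparison.OrdinalIgnoreCase)) >= 0)
        {
            count++;
            index += keyword.Length; // continue searching after this match
        }
        return count;
    }

    // Scores every record against all keywords and returns the best matches first.
    public static List<SearchableObject> TopMatches(
        IEnumerable<SearchableObject> records, IEnumerable<string> keywords, int take = 10)
    {
        var keywordList = keywords.ToList();
        var matched = new List<SearchableObject>();

        foreach (var record in records)
        {
            record.MatchedWords = keywordList.Sum(k =>
                CountOccurrences(record.Title, k) + CountOccurrences(record.Description, k));
            if (record.MatchedWords > 0) matched.Add(record);
        }

        return matched.OrderByDescending(r => r.MatchedWords).Take(take).ToList();
    }
}
```

Note that this scans every record on every search and treats "bike" inside "bikeshed" as a match, which is part of why it feels naive.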

My question is, is this the best way to do this? If not, what would be?

+7  A: 

full-text index search will do the trick.

http://msdn.microsoft.com/en-us/library/ms142547.aspx

Henry Gao
Agreed. Create a full-text index on the table and then you can query it using something like this: select * from tablename where contains(*, 'word1 OR word2 OR word3'). Note that CONTAINS needs an operator such as AND or OR between the words. There are also more advanced ways to do this that will use stemming, return a match score, etc.
Justin Gallagher
thanks u all so much!!
David
+5  A: 

You should use a full text indexing solution. MS SQL Server 7 and later has a full text indexing engine built in (here's a decent overview article). You could also consider using external products such as Lucene (available for Java and C#/.NET).

iammichael
+2  A: 

I think you only want to use C# to parse the search parameters, not to actually perform the searching and aggregation. So no, it's not really the best way. Let SQL Server do the heavy lifting of the search.
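
A rough sketch of that split, assuming a full-text index already exists on the table: C# only turns the user's input into a search condition, and SQL Server ranks the rows. The SearchTerms class, the ToContainsCondition method, the tablename table and its id key column are all hypothetical names used for illustration:

```csharp
using System;
using System.Linq;

// Hypothetical helper class, not part of any library.
public static class SearchTerms
{
    // Turns raw user input such as "red  bike" into the full-text
    // search condition "red" OR "bike".
    public static string ToContainsCondition(string userInput)
    {
        if (string.IsNullOrWhiteSpace(userInput)) return string.Empty;

        var terms = userInput
            .Split((char[])null, StringSplitOptions.RemoveEmptyEntries) // split on any whitespace
            .Select(t => t.Replace("\"", ""))                           // drop embedded double quotes
            .Where(t => t.Length > 0)
            .Select(t => "\"" + t + "\"");                              // quote each term

        return string.Join(" OR ", terms);
    }

    // Query text to run with the condition bound to the @condition parameter.
    // Assumes a table named tablename whose full-text key column is id.
    public const string Query = @"
SELECT TOP 10 t.title, t.description, ft.[RANK]
FROM CONTAINSTABLE(tablename, (title, description), @condition) AS ft
JOIN tablename AS t ON t.id = ft.[KEY]
ORDER BY ft.[RANK] DESC;";
}
```

Passing the condition as a parameter rather than concatenating it into the SQL text keeps user input out of the statement itself.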

Paul Sasik
+1  A: 

Take a look at Lucene for .NET, which will give you a full-text index of your text.

http://incubator.apache.org/lucene.net/

The .NET developers on this site may be able to tell you if there are any better alternatives.

Karl
+1  A: 

If you're working with Java or C#, I'd recommend Lucene or Lucene.NET respectively.

Bryan Menard
+1  A: 

Use a full-text search engine such as Lucene. There is also a .NET version.

Juha Syrjälä
Why use an external search engine when an internal engine exists within SQL Server?
Raj More
Lucene and SQL Server's built-in search engine have different feature sets and performance profiles for different data sets.
iammichael