I am currently working on a search application that uses Lucene.Net to index data from a database into an index file. I have a product catalog with Name, short and long description, SKU and other fields. The data is indexed using StandardAnalyzer. I am trying to add auto-suggestion for a text field, using TermEnum to get all the keyword terms and their scores from the index. But the terms returned are single words. For example, if I type co, the suggestions returned are costume, count, collection, cowboy, combination, etc. But I want the suggestions to be phrases. For example, if I search for co, the suggestions should be cowboy costume, costume for adults, combination locks, etc.

The following is the code used to get the suggestions:

public string[] GetKeywords(string strSearchExp)
{
    IndexReader rd = IndexReader.Open(mIndexLoc);
    // Position the enumerator at the first term in the "Name" field that is >= the typed prefix.
    TermEnum tenum = rd.Terms(new Term("Name", strSearchExp));
    Dictionary<string, double> KeywordList = new Dictionary<string, double>();
    do
    {
        Term term = tenum.Term();
        // Stop at the end of the enumeration, or once past the field or the prefix.
        if (term == null || term.Field() != "Name" || !term.Text().StartsWith(strSearchExp))
            break;
        // Ignore single-character terms; record each remaining term with its document frequency.
        if (term.Text().Length > 1)
            KeywordList[term.Text()] = tenum.DocFreq();
    } while (tenum.Next());
    tenum.Close();
    rd.Close();

    // Return at most 10 terms, most frequent first.
    return (from entry in KeywordList
            orderby entry.Value descending
            select entry.Key).Take(10).ToArray();
}

Can anyone please give me directions to achieve this? Thanks for looking into this.

A: 

As you said, "the terms returned are of single term". So you need to create terms that consist of phrases.

You can use the built-in ShingleFilter token filter to create your phrase terms:

http://lucene.apache.org/java/2_4_0/api/org/apache/lucene/analysis/shingle/ShingleFilter.html

You may want to use a separate field for this, as I'm not sure whether ShingleFilter actually produces single terms. You'll probably want to experiment with this.
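To make the idea of shingles concrete, here is a minimal sketch in plain C# that does not use Lucene.Net at all. `ShingleSketch.Shingles` is a hypothetical helper written only for illustration; it emits every run of 1 to `maxSize` consecutive words, which is roughly the kind of phrase term a shingle filter would add to the index. Whether the real filter also emits the single words is something to verify against your version.

```csharp
using System;
using System.Collections.Generic;

static class ShingleSketch
{
    // Hypothetical helper, not part of Lucene.Net: returns every run of
    // 1..maxSize consecutive words, lower-cased and joined by single spaces.
    public static List<string> Shingles(string text, int maxSize)
    {
        string[] words = text.ToLowerInvariant()
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        var result = new List<string>();
        for (int start = 0; start < words.Length; start++)
        {
            // Grow the window from one word up to maxSize words, staying in bounds.
            for (int size = 1; size <= maxSize && start + size <= words.Length; size++)
            {
                result.Add(string.Join(" ", words, start, size));
            }
        }
        return result;
    }
}
```

For the name "Cowboy costume for adults" with `maxSize` 2, the sketch yields cowboy, cowboy costume, costume, costume for, for, for adults, adults. If terms like these were indexed in their own field, a term enumeration starting at co could return cowboy costume and costume for as suggestions.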

KenE
Thanks Ken. I am working with Lucene.Net and am not sure what the equivalent of ShingleFilter is in .NET. Let me try to find out.
eric
A: 

You could simply index your product name in a different field using the Field.Index.NOT_ANALYZED parameter or the KeywordAnalyzer, and then run either a wildcard query or a prefix query on it.
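As a rough illustration of what prefix matching on an unanalyzed field gives you, here is a small sketch in plain C# with no Lucene.Net calls. `PrefixSketch.SuggestByPrefix` is a hypothetical helper that treats each product name as one whole value, the way a NOT_ANALYZED field would store it, and keeps the names that begin with the typed text. The case-insensitive comparison is an assumption of the sketch; an unanalyzed field is not lower-cased for you, so in practice you would likely normalize case yourself at index and query time.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class PrefixSketch
{
    // Hypothetical helper, not a Lucene.Net API: each name is one untokenized
    // value; return up to max names that begin with the prefix, in sorted order.
    public static List<string> SuggestByPrefix(IEnumerable<string> names, string prefix, int max)
    {
        return names
            .Where(n => n.StartsWith(prefix, StringComparison.OrdinalIgnoreCase))
            .OrderBy(n => n, StringComparer.OrdinalIgnoreCase)
            .Take(max)
            .ToList();
    }
}
```

Given the names "Cowboy costume", "Combination locks", "Adult costume" and "Costume for adults", the prefix co returns Combination locks, Costume for adults and Cowboy costume. "Adult costume" is not returned, because the match is anchored at the start of the whole name rather than at each word.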

Jf Beaulac