Hi, it's Ershad here. I am working with Lucene. I can now search for a whole word, but if I type only part of a word, I get no results. Can you please suggest what needs to be done?

For indexing, I am using the code below:

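// Create the index (true = overwrite any existing index in the directory)
writer = new IndexWriter(directory, new StandardAnalyzer(), true);
writer.SetUseCompoundFile(true);

// Index the page text (not stored) and keep its path as an untokenized keyword
Document doc = new Document();
doc.Add(Field.UnStored("text", parseHtml(html)));
doc.Add(Field.Keyword("path", relativePath));
writer.AddDocument(doc);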

For searching, I am using the code below:

Query query = QueryParser.Parse(this.Query,"text",new StandardAnalyzer());

// create the result DataTable
this.Results.Columns.Add("title", typeof(string));
this.Results.Columns.Add("sample", typeof(string));
this.Results.Columns.Add("path", typeof(string));

// search
Hits hits = searcher.Search(query);

this.total = hits.Length();
+2  A: 

If you refer to the Lucene Query Parser Syntax documentation, you will find that you can append an asterisk (*) to the end of your query to match all those words that begin with a particular string. For example, suppose you want to get results mentioning both "caterpillar" and "catamaran". Your search query would be "cat*".
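
To make the trailing-wildcard behaviour concrete, here is a minimal sketch in plain Java (no Lucene dependency) of what a prefix match does conceptually: it looks through a sorted set of indexed terms and keeps those that start with the given prefix. The class and method names are hypothetical, and this is not a description of Lucene's internals; it only illustrates which terms "cat*" would select.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical illustration only; not part of Lucene's API.
final class PrefixScanSketch {

    // Returns the terms that start with the given prefix, in sorted order.
    static List<String> matchPrefix(SortedSet<String> terms, String prefix) {
        List<String> matches = new ArrayList<>();
        // tailSet jumps to the first term >= prefix; all matching terms
        // sit next to each other from that point on.
        for (String term : terms.tailSet(prefix)) {
            if (!term.startsWith(prefix)) {
                break; // past the last term sharing the prefix
            }
            matches.add(term);
        }
        return matches;
    }

    public static void main(String[] args) {
        SortedSet<String> terms =
                new TreeSet<>(List.of("bobcat", "catamaran", "caterpillar", "dog"));
        // prints [catamaran, caterpillar]
        System.out.println(matchPrefix(terms, "cat"));
    }
}
```

Note that "bobcat" is not returned: a prefix match only considers the start of each term, not text in the middle or at the end of it.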

However, if you are not in direct control of the search query (for example, if the user is entering their own search queries), then you may need a little trickery on the part of the QueryParser. My experience is solely with the Java version of Lucene. Hopefully the principles are the same with Lucene.NET.

In Java, you could extend the QueryParser class and override its newTermQuery(Term) method. Traditionally, this method would return a TermQuery object. However, the child class would instead return a PrefixQuery. For example:

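import org.apache.lucene.index.Term;
import org.apache.lucene.queryParser.QueryParser; // package name as of Lucene 2.x/3.x
import org.apache.lucene.search.PrefixQuery;
import org.apache.lucene.search.Query;

public class PrefixedTermsQueryParser extends QueryParser {

    // Some constructors...

    // Build a PrefixQuery wherever the parser would normally build a TermQuery.
    @Override
    protected Query newTermQuery(Term term) {
        return new PrefixQuery(term);
    }

}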

I am not terribly sure what methods you could override in Lucene.NET, but I am sure there must be something similar. Looking at its documentation, it appears the QueryParser class has a method called GetFieldQuery. Perhaps this is the method you would have to override.

Adam Paynter
Hi Adam, thanks for the reply and for the documentation links. I will try this and get back to you.
+1  A: 

Hi Adam, this is really helpful to me, but I have one problem: this solution only returns results that start with the search query. I want results that contain the query string anywhere in the document. My code is in Java, and I do not have access to the query. Can you suggest how to search for documents that contain the query as part of an indexed term? For example, suppose I have a document with an indexed field "name" whose value is "virendra". With your previous answer I can find this document with a query like "vir" or "viren". But if the query is "ire", I want the same result, because the name contains that string. Please help me figure this out. Regards, Viren

Append an asterisk at the start of the query.
Arnis L.
And use comments, not answers, to request details. :)
Arnis L.
Hi Arnis, thanks for your quick reply. I tried that too, but it didn't work. Please see my code: [code]PrefixedTermsQueryParser parser = new PrefixedTermsQueryParser( CollectionUtils.toStrings(CardSearchField.ALL_USER_SEARCH_FIELDS),analyzer); Query luceneQuery = parser.parse(QueryParser.escape(getQuery()));[/code] If I give a query like "*ire*", it is parsed as just "ire", so nothing matches. I have tried many things, but none of them worked. Please take a look and suggest something. Regards
@virendra: Unfortunately, the Lucene query parser syntax does not allow a wildcard character as the first character of a search term (http://lucene.apache.org/java/2_3_2/queryparsersyntax.html#Wildcard%20Searches)
Adam Paynter
@virendra: FYI: It appears you chose "Add Another Answer" when you posted your original comment. In the future, you may want to click "add comment" underneath the answer you wish to comment on. I am notified when anyone comments on any of my answers. Because you created a new answer instead, I was not notified. I only discovered your "answer" because I noticed that one of my old answers had been up-voted. :)
Adam Paynter
Hi Adam, thanks for your reply. Can you suggest any other way to do it? It is a requirement for me to search for any part of a string, as in the example in my previous post. Here is a link to a site where I think Lucene search works like this, although I don't know how they implemented it: http://www.lucenebook.com/search/p:lucene?q= Here, if you enter any part of a string, it gives you search results. So I think there is some implementation detail that I am missing. Please help me with this. Regards, Virendra
@virendra: I tried searching for "borate" on that site after seeing the word "corroborate" mentioned. It did not return any results for me. Could you give me an example of a query you tried?
Adam Paynter
@adam, I searched for "luc" and got results in the lines "contributions/lucli/lib README" and "contributions/lucli/src/lucli Completer". I am confused by this. Maybe this example makes it clearer.
@virendra: In your example, their analyzer (most likely StandardAnalyzer) is cutting "contributions/lucli/lib" up into three tokens: "contributions", "lucli" and "lib". When you search for "luc", Lucene is matching "luc" as the prefix to "lucli".
Adam Paynter
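
The explanation above can be sketched in plain Java. This is a rough, hypothetical stand-in for what an analyzer does (it simply splits on non-alphanumeric characters and lower-cases the pieces; it is not StandardAnalyzer itself), followed by a prefix check against each token. The class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only; a crude approximation of analysis, not Lucene code.
final class TokenPrefixSketch {

    // Splits text on any run of non-alphanumeric characters and lower-cases the pieces.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String piece : text.split("[^A-Za-z0-9]+")) {
            if (!piece.isEmpty()) {
                tokens.add(piece.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    // True if at least one token starts with the given prefix.
    static boolean anyTokenStartsWith(List<String> tokens, String prefix) {
        for (String token : tokens) {
            if (token.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> tokens = tokenize("contributions/lucli/lib");
        System.out.println(tokens);                            // [contributions, lucli, lib]
        System.out.println(anyTokenStartsWith(tokens, "luc")); // true: prefix of "lucli"
        System.out.println(anyTokenStartsWith(tokens, "ucl")); // false: only in the middle
    }
}
```

Under these assumptions, "luc" matches because it is the start of the separate token "lucli", not because it appears in the middle of a longer word.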
Yup, this seems right. Thanks for your time, Adam. I am really thankful to you.
You're welcome! Sorry that I couldn't solve the original problem, though...
Adam Paynter