ansaurus

Question

Answer 1

+1 A:

You have to get Lucene in Action. Although it covers the original (that is, Java) Lucene implementation, it contains all the information you need: boosts, highlighters, query parsers, etc.

Anton Gogolev 2009-02-10 13:50:36

If this turns out to be the solution I think it is, I will definitely consider getting myself more Lucene resources. It looks like this is going to replace my entire search algorithm so far. And I don't mind :)

borisCallens 2009-02-10 14:17:10

Answer 2

+2 A:

We do something similar, the trick is to specify fields in your query string:

(+Tier1:ribbon^1)^4 OR (+Tier2:ribbon^1)^4 OR (+Tier3:ribbon^1) OR (+Tier4:q*ribbon*^1)^12

In the above example, the user searched for "ribbon" in our application. We have different segments of data in different fields, and the final field "Tier4" contains all the previous terms concatenated together. We also prepend the contents of that field with a "q", so that we can do what amounts to a leading wildcard:

(+Tier4:q*ribbon*^1)^12

Lastly, we use boosts with the caret (^). This ends up weighting things differently. It took a while to get boosts right, and I'm still not 100% happy with them, but they do make a big impact.
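As a rough illustration of how a query string like the one above could be assembled, here is a minimal Java sketch using only the standard library. The class name `TieredQueryBuilder`, its methods, and the hard-coded boosts are hypothetical; only the field names and the resulting syntax come from the example above. It does not escape query-syntax characters in the user's input, which a real application would need to handle.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of Lucene): builds a field-scoped, boosted query string.
final class TieredQueryBuilder {
    private final List<String> clauses = new ArrayList<>();

    // Adds "(+field:term^1)" followed by "^boost"; a clause boost of 1 is left out.
    TieredQueryBuilder field(String field, String term, int boost) {
        String clause = "(+" + field + ":" + term + "^1)";
        clauses.add(boost == 1 ? clause : clause + "^" + boost);
        return this;
    }

    // Pattern for the concatenated field, whose values are assumed to be stored with a "q" prefix.
    static String containsTerm(String term) {
        return "q*" + term + "*";
    }

    // Joins all clauses with OR.
    String build() {
        return String.join(" OR ", clauses);
    }

    // Reproduces the shape of the example query for a single search term.
    static String forUserInput(String term) {
        return new TieredQueryBuilder()
            .field("Tier1", term, 4)
            .field("Tier2", term, 4)
            .field("Tier3", term, 1)
            .field("Tier4", containsTerm(term), 12)
            .build();
    }

    public static void main(String[] args) {
        System.out.println(forUserInput("ribbon"));
    }
}
```

Keeping the boosts in one place like this makes them easier to tune, which matters given how much trial and error they tend to need.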

Bob King 2009-02-10 13:51:37

So if it says (+Tier1:ribbon^1)^4, this means: look in field Tier1 for the word ribbon and give the result a weight of 4? Do you have an easy resource on how to create query strings?

borisCallens 2009-02-10 14:18:30

It's silly that leading wild cards need a trick like the prepended character. Any idea why?

borisCallens 2009-02-10 14:23:52

We had to go to the Java documentation to get the query string information. Also, be careful with lots of terms. You may need to call .setMaxClauseCount(); otherwise an exception can be thrown.
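A likely reason the limit is hit so easily: a wildcard term is typically rewritten into one boolean clause per matching indexed term, so a single broad pattern can expand into thousands of clauses. The toy model below uses no Lucene at all; `WildcardExpansionDemo`, its term list, the regex-based matching, and the 1024 default are assumptions for illustration, not the library's implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Toy model only: counts how many indexed terms a '*' wildcard pattern would expand to.
final class WildcardExpansionDemo {
    // Assumed default clause limit; check the documentation of your Lucene version.
    static final int ASSUMED_MAX_CLAUSES = 1024;

    static int expandedClauseCount(List<String> indexedTerms, String wildcard) {
        // Quote each literal piece and join the pieces with ".*";
        // the -1 limit keeps a trailing empty piece so a final '*' is honoured.
        String[] parts = wildcard.split("\\*", -1);
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) regex.append(".*");
            regex.append(Pattern.quote(parts[i]));
        }
        Pattern pattern = Pattern.compile(regex.toString());

        int count = 0;
        for (String term : indexedTerms) {
            if (pattern.matcher(term).matches()) count++;
        }
        return count;
    }

    public static void main(String[] args) {
        List<String> terms = new ArrayList<>();
        for (int i = 0; i < 2000; i++) terms.add("qribbon" + i);
        terms.add("qlace");

        int clauses = expandedClauseCount(terms, "q*ribbon*");
        System.out.println(clauses + " clauses, over assumed limit: "
            + (clauses > ASSUMED_MAX_CLAUSES));
    }
}
```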

Bob King 2009-02-10 14:52:25

I remember reading about the wild card issue and I didn't really buy the reasons they were giving.

Bob King 2009-02-10 14:53:28

Answer 3

+1 A:

I don't think you need to maintain an "all" field.

Have a look into using a "MultiFieldQueryParser". Rather than taking a single default field to be used by the query parser, it accepts an array of field names (in addition to the index analyser).
Term boost should work as per "QueryParser" (i.e. no special action required). I should add that I've found the standard scoring seems OK for me (length of field, number of matches etc) without using boosted terms.
Lucene.Net (well, certainly the SVN 2.3 builds at the moment) includes a port of the Highlight package from the Java source. It does have a couple of quirks (not least of which is that it can be tricky to get going in the first place), but it basically works.
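To make the multi-field idea concrete without depending on Lucene itself, here is a small standard-library Java sketch that expands each search term across several fields in query-string form. `MultiFieldExpansion` and `expand` are hypothetical names, and the output only approximates the kind of query a multi-field parser builds; the real parser's output may differ.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of Lucene): expands terms across fields as a query string.
final class MultiFieldExpansion {
    // Turns each whitespace-separated term into "(f1:term f2:term ...)".
    static String expand(String userInput, String... fields) {
        List<String> groups = new ArrayList<>();
        for (String term : userInput.trim().split("\\s+")) {
            if (term.isEmpty()) continue; // blank input yields an empty query string
            List<String> perField = new ArrayList<>();
            for (String field : fields) {
                perField.add(field + ":" + term);
            }
            groups.add("(" + String.join(" ", perField) + ")");
        }
        return String.join(" ", groups);
    }

    public static void main(String[] args) {
        System.out.println(expand("red ribbon", "Tier1", "Tier2", "Tier3"));
    }
}
```

Within each group the clauses carry no operator, so with the query parser's usual OR default a term only has to match in one of the fields. No separate "all" field is involved.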

Good luck

Moleski 2009-03-01 23:54:36

I will have a look at the MultiFieldQueryParser. Thanks

borisCallens 2009-03-02 10:18:37

It seems that using the MultiFieldQueryParser creates a query where my terms have to exist in ALL the queried fields. Can I change this somehow?

borisCallens 2009-03-02 14:55:35

Since there is no PM function here, do you have any suggestions for me regarding the highlight package before I start implementing it?

borisCallens 2009-03-02 15:28:39

1) You're welcome! 2) Have you tried a simple test case to verify this? 3) The code here (http://stackoverflow.com/questions/189366/lucene-net-search-result-to-highlight-search-keywords) looks pretty good, but don't forget to rewrite the query before calling it.

Moleski 2009-03-03 00:11:53

PM? What's that stand for?

Nick 2009-03-03 17:00:59

I'm going to guess "personal message".

Moleski 2009-03-05 09:32:04


How to get more out of Lucene.net
