scoring

In Lucene, how are terms used in calculating scores, and can I override this with a CustomScoreQuery?

Has someone successfully overridden the scoring of documents in a query so that the "relevancy" of a term to the field contents can be determined through one's own function? If so, was it by implementing a CustomScoreQuery and overriding the customScore(int, float, float)? I cannot seem to find a way to build either a custom sort or a cu...

Order sets of numbers for maximum distance

You have (up to 100) distinct sets of (2-4) numbers. The order of the sets or numbers in the sets does not matter. The highest number relates to the number of sets and goes up to 30. For example: {1 2 3 4} {1 2 3 5} {1 2 3} {1 2 4 5} {6 2 4} {6 7 8 9} {6 7 9} {7 8 9} {2 4 8 9} The goal is to arrange these sets in a particular order, where tw...

Problem with Lucene scoring

I have a problem with Lucene's scoring function that I can't figure out. So far, I've been able to write this code to reproduce it. package lucenebug; import java.util.Arrays; import java.util.List; import org.apache.lucene.analysis.SimpleAnalyzer; import org.apache.lucene.document.Document; import org.apache.lucene.document.Field; im...

Different Lucene search results with different search space sizes

I have an application that uses Lucene for searching. The search space is in the thousands of entries. Searching against these thousands, I get only a few results, around 20 (which is OK and expected). However, when I reduce my search space to just those 20 entries (i.e. I indexed only those 20 entries and disregarded everything else...so that deve...

Boost Solr results based on the field that contained the hit

Hi, I was browsing the web looking for an indexing and search framework and stumbled upon Solr. A functionality that we absolutely need is to boost results based on which field contained the hit. A small example: Consider a record like this: <movie> <title>The Dark Knight</title> <alternative_title>Batman Begins 2</alternative_title...

Lucene document boosting

Hello, I am having a problem with Lucene boosting. I am trying to boost a particular document that matches the specified (firstname) field. I have posted part of the code: private static Document createDoc(String lucDescription,String primaryk,String specialString){ Document doc = new Document(); doc.add(new Field...

Algorithm for scoring user activity

I have an application where users can: write reviews about products; add comments to products; up/down vote reviews; up/down vote comments. Every up/down vote is recorded in a DB table. What I want to do now is to create a ranking of the most active users in the last 4 weeks. Of course good reviews should be weighted more than good ...

How can I merge multiple Compass Resources into one, with one score?

I am trying to integrate Compass into my platform using the JDBC ResultSetToResourceMapping. What I want to do is set it up so that I could have multiple result set mappings, tied to one Resource, that produce one result, with one merged score. I have tried to trick Compass into doing this by mapping the same id across th...

Lucene vs Solr scoring

Can someone explain (or point to a reference on) how the scoring mechanisms used by Solr and Lucene compare, in simple terms? Is there any difference between them? I am not that experienced with Solr/Lucene, but my findings suggest they are different. P.S.: I just tried a simple query like "+Contents:risk" and didn't use any filters or other stuff. ...

Frequencies of Lucene unigrams and bigrams

Hi! I am storing n-grams up to level 3 in a Lucene index. When I read the index and calculate scores for terms and n-grams, I obtain results like this:

TERM             FREQUENCY   TFIDF
minority         25          16.512926
minority report  24          16.179296
report           27          13.559037
cruise ...

How to define a boost factor to each term in each document during indexing?

I want to insert another score factor into Lucene's similarity equation. The problem is that I can't just override the Similarity class, as it is unaware of the document and terms it is computing scores for. For example, in a document with the text below: The cat is in the top of the tree, and he is going to stay there. I have an algorithm of ...