lucene

Controlling Solr score/sort

I want to filter on a property within a range, but items that do not have the property should come last in the results. My solution was to set it to -1 if the property was not set. +(property:[10000000001 TO 10000000019] property:"-1"^0.5) This doesn't work, since every document with property:-1 gets a very high score, for some reason. I...

Detailed information in Lucene/Solr results

After having performed a search in Lucene/Solr without having specified a field, how can I know in which fields of a result document the search string was found (and how often)? ...

How can I merge multiple Compass Resources into one, with one score?

I am trying to integrate Compass into my platform using the JDBC ResultSetToResourceMapping. What I want to do is set it up so that I could have multiple result set mappings, tied to one Resource, that produce one result with one merged score. I have tried to trick Compass into doing this by mapping the same id across th...

Lucene setBoost doesn't work

Hi all, our team just upgraded Lucene from 2.3 to 3.0 and we are confused about setBoost and getBoost on a document. What we want is to set a boost for each document when adding it to the index, so that when we search, the documents in the response are ordered according to the boost I set. But it seems the order is not chang...

Search for short words with SOLR

I am using SOLR along with NGramTokenizerFactory to help create search tokens for substrings of words. NGramTokenizer is configured with a minimum word length of 3. This means that I can search for e.g. "unb" and then match the word "unbelievable". However, I have a problem with short words like "I" and "in". These are not indexed by SOL...

Where can I find open source applications that use lucene.net

I am looking for any open source application that uses lucene.net. I am working on a complicated web application and would like to see how others have implemented lucene.net. ...

Compass - Lucene Full text search. Structure and Best Practice.

Hi, I have played about with the tutorial and Compass itself for a bit now. I have just started to ramp up the use of it and have found that the performance slows drastically. I am certain that this is due to my mappings and the relationships that I have between entities and was looking for suggestions about how this should be best done...

Can I store and join based on external attributes in Lucene/Solr

Is there a way to store information about documents that are stored in Lucene such that I don't have to update the entire document to update certain attributes about the documents? For instance, let's say I had a bunch of documents, and that I wanted to update a permissions list of who was allowed to see the documents on a daily, or m...

Using Lucene Highlighter along with MultiFieldQueryParser

I'm using Lucene Highlighter to highlight the matches that I have found in a Lucene index. Now, my problem is that if I have to search multiple fields of a document, and I need to display the matching text, how can I find out in which field the hit occurred? The code which I am using for the highlighter is basically the second functi...

Solr/Lucene Scorer

We are currently working on a proof-of-concept for a client using Solr and have been able to configure all the features they want except the scoring. Problem is that they want scores that make results fall in buckets: Bucket 1: exact match on category (score = 4) Bucket 2: exact match on name (score = 3) Bucket 3: partial match on cat...

Indexing different type of Entities/Objects with Solr Lucene

Let's say I want to index my shop using Solr/Lucene. I have many types of entities: Products, Product Reviews, Articles. How do I get Lucene to index those types, each with a different schema? ...

Zend_Search_Lucene vs SOLR

Hi, I have recently stumbled upon the Zend port of the Lucene project. I have a little experience with SOLR, so I would like to know what the difference is between the two, especially in terms of performance and installation. As far as I know, SOLR requires a Tomcat servlet container running on the web host in order to work; what about Zend L...

Extracting text from PDF, DOC, HTML after crawling with Heritrix

I'm looking to use Heritrix to crawl web-sites. I'm wondering what tools Heritrix users are using to extract text from crawled files prior to indexing them with Lucene. ...

A little off topic, but can anyone recommend examples where lucene is used on live websites

I know Wikipedia uses it, but I am looking for more product-based websites. Thanks. ...

SOLR - how to remove logically deleted documents?

I am implementing SOLR for a free text search for a project where the records available to be searched will need to be added and deleted on a large scale every day. Because of the scale I need to make sure that the size of the index is appropriate. On my test installation of SOLR, I index a set of 10 documents. Then I make a change in ...

Lucene: Question about score calculation with PrefixQuery

Hi, I have a problem with score calculation for a PrefixQuery. To change the score of each document, I used setBoost to change the document's boost when adding it to the index. Then I created a PrefixQuery to search, but the results did not change according to the boost. It seems setBoost doesn't work at all for a Pre...

Lucene real-time indexing?

What is the best way to achieve Lucene real-time indexing? ...

Working with Katta ( Lucene, Hadoop )

Can anyone provide me with some sample Java code showing how to go about storing the Lucene index in HDFS (Hadoop File System) using Katta? ...

Lucene 3: Where is StandardAnalyzer?

I am working on Lucene 3.x (source code). To start with, I downloaded the latest source code from SVN and the stable 3.0.2 source code from: http://www.apache.org/dyn/closer.cgi/lucene/java/ The second one has source files for package org.apache.lucene.analysis.standard, however the first one does not have any such files (not even the package). Somewher...

Use Lucene Hits to Filter DataSet Bound to ListView in WPF C#?

Alright, so I've got a ListView with many items available to it at any time (in virtualized mode). Right above the ListView is a text box that allows the user to type in any search term, and the ListView will be filtered live. The ListView is currently bound to a DataSet like this: SoundListView.DataContext = DS.Tables[0].DefaultView; The Dat...