lucene

Is it possible to regenerate a Lucene index in the background?

Hi there. Sometimes there is a need to regenerate a Lucene index, e.g. when something changes in the Compass mapping or in the way boosts are applied, or if something got corrupted for whatever reason. In my case, generating the index takes about 5 to 6 hours, and clearing the index beforehand leads to data not being complete for this interva...

Problem with Solr dynamic/copy Field.

Hi all, I have a problem. I have a dynamic field in schema.xml defined as <dynamicField name="sec_*" type="text" indexed="true" stored="false"/> and <field name="Contents" type="text" indexed="true" stored="false" multiValued="true"/>. The dynamic field is copied to the Contents field as <copyField source="sec_*" dest="Contents"/>. Now when I p...

Get all results with DisMax, like q=*:*?

Hi all, is it not possible to do something like q=*:* with DisMax? Thanks! ...

Is it possible to get the matching document and all its ancestors in one query?

To illustrate my requirements, consider the following directory structure:
C:\Dev
C:\Dev\Projects
C:\Dev\Projects\Test Project
C:\Dev\Projects\Test Project\Test.cs
C:\Dev\Projects\Foo
C:\Dev\Projects\Foo\foo.cs (containing the word test)
The basic document will have id, type, name and content fields, where type will be file or folder...

Lucene Boolean value search with boolean query

Hi, there is a field, say XYZ, whose value is either TRUE or FALSE. I am searching as follows: +Contents:risk +XYZ:TRUE. Is it legal to search like that? I tried it, but it showed me results with the FALSE value too. What was more surprising is that when I searched with +XYZ:[TRUE TO TRUE], it worked. Can someone tell me what exactly is my mistak...

Find similar documents before adding

A user fills in a multi-field form (a document) with date, time, title and description. Before the document is saved, I want to check whether similar documents are already stored in Solr, so that the user can choose whether to save this document or not. How can "find similar documents" be implemented in Solr? In Lucene there are FuzzyLikeThisQuery and MoreLikeThis, but what about Solr? P.S. I use django-haystack ...

Lucene.NET: Camel case tokenizer?

I've started playing with Lucene.NET today and I wrote a simple test method to do indexing and searching on source code files. The problem is that the standard analyzers/tokenizers treat the whole camel case source code identifier name as a single token. I'm looking for a way to split camel case identifiers like MaxWidth into three tok...

Indexing file paths or URIs in Lucene

Some of the documents I store in Lucene have fields that contain file paths or URIs. I'd like users to be able to retrieve these documents if their query terms contain a path or URI segment. For example, if the path is C:\home\user\research\whitepapers\analysis\detail.txt I'd like the user to be able to find it by querying for path...

Get total record count for a query in Zend Lucene search?

Hi, I have used the "setResultSetLimit(1000)" method to limit results to 1000 records. The good thing is that it helps to save server resources, but there is no way to get the full record count for a query. Does anyone know how to get the full hit count? Thanks ...

Ordering results by relevance using Solr search

I'm new to Solr search and trying to get a grasp on how to handle ordering of results. I'm using Ruby on Rails together with the Sunspot gem to interface with Solr. I have an Article model that has the following indexed fields: text Title, text AuthorNames, integer NumberOfReviews. I'd like to be able to perform a search on So...

Solr/Lucene is it possible to order first by relevance, and then by a second attribute?

In Solr/Lucene is it possible to order first by relevance, and then by a second attribute? As far as I can tell if I set an ordering parameter, it totally overrides relevance, and sorts by the ordering parameter(s). How can I have results sorted first by relevance, and then in the case of two entries with exactly the same relevance, gi...

Find all Lucene documents having a certain field

I want to find all documents in the index that have a certain field, regardless of the field's value. If at all possible using the query language, not the API. Is there a way? ...

Deleting document by Term from lucene

The following code does not delete the document by Term as expected:
RAMDirectory idx = new RAMDirectory();
IndexWriter writer = new IndexWriter(idx, new SnowballAnalyzer(Version.LUCENE_30, "English"), IndexWriter.MaxFieldLength.LIMITED);
...

How reliable is Lucene when counting hits/documents?

If I run a search for "+house +car" and it returns 5,343,562 hits, is that the exact number of matching documents, or is it an approximation? If it's an approximation, is there a way to make it return the exact number of documents that qualify for a search query? ...

How can I retrieve non-stored Lucene field values?

Hi! When searching, only stored fields are returned. For debugging reasons, I need to see the unstored fields, too. Is there a way via the API? Thanks! P.S.: I know Luke, but unfortunately I can't use it in my case. ...

Solr/Lucene index: return unique results

I have an index that has multiple entries for the exact same item. I specified <uniqueKey>citation</uniqueKey>, based on citation, a field that I can use to determine whether an item is unique in the index. I was wondering if there is some way to adjust the query so that it will only return unique results based on that field, or rather to del...

Keeping Sitecore Lucene Indexes Up-To-Date

I've got a Sitecore application which creates and uses a number of Lucene indexes through Sitecore's built-in API. I need to make sure that items in the index are kept up-to-date when they are published. To do this, I've created a Sitecore Hook that detects when an item is saved to the "web" database and reindexes the item. It seems ...

Debugging Solr search queries on Sunspot

How can I debug Solr search queries when using the Sunspot gem on Rails? I have some queries that are returning bizarrely high scores, and I'm trying to get to the bottom of why this is happening. It doesn't seem like any debugging information is exposed to Sunspot, so I think that I need to debug through Solr directly. Fortunately, So...

Weird query behavior, need some help debugging this

Here is the interesting part of the schema: <fieldType name="text_rev" class="solr.TextField" positionIncrementGap="100"> <analyzer type="index"> <tokenizer class="solr.WhitespaceTokenizerFactory"/> <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" /> <filt...

Lucene: What is the difference between Query and Filter?

Lucene query vs. filter? They both do similar things: a TermQuery, for example, filters by term value, and I guess a filter is there for a similar purpose. When would you use a filter and when a query? I'm just starting with Lucene today, so I'm trying to get the concepts clear ...