lucene

How to get Zend Lucene Range Search working properly (or help me debug)

I have an implementation of the Zend Search (Lucene) framework on my website that contains an index of products with prices. I am trying to allow customers to search for something while constraining the prices, e.g. search for "dog food" between $5 and $10. My search index looks like this: Keyword('name') Keyword('price') Let's sa...

What is the best Field Type/Encoding to store a number in a Zend Lucene Search Index?

Hello, How would I index a price int field in a Zend Lucene Search Index? I am currently using: $doc->addField(Zend_Search_Lucene_Field::Keyword('price', $price, 'utf-8')); Is this the correct way? Or should I be storing it specifically as a number somehow? ...
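
A minimal sketch of one common approach, assuming the index compares Keyword terms as plain strings (lexicographically) when evaluating a range: zero-pad the number to a fixed width so that string order and numeric order agree. The helper name `encode_price`, the choice of storing the price in cents, and the width of 10 are illustrative assumptions, not part of the Zend API.

```python
def encode_price(price_cents: int, width: int = 10) -> str:
    """Zero-pad a non-negative integer so string order matches numeric order.

    Hypothetical helper: the name, the cents unit and the width are assumptions.
    """
    if price_cents < 0:
        raise ValueError("negative values need a different encoding")
    text = str(price_cents)
    if len(text) > width:
        raise ValueError("value does not fit in the chosen width")
    return text.zfill(width)


prices = [5, 10, 50, 100, 999]

# Unpadded strings sort lexicographically, which is not numeric order.
raw_sorted = sorted(str(p) for p in prices)

# Padded strings sort in the same order as the numbers they encode.
padded_sorted = sorted(encode_price(p) for p in prices)
```

With this encoding, the value passed to the Keyword field and both bounds of the range would all need to go through the same helper so that they share one width.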

Help needed ordering search results

Hi, I have 3 records in a Lucene index. Record 1 contains healthcare in the title field. Record 2 contains healthcare and insurance in the description field, but not together. Record 3 contains healthcare insurance in the company name field. When a user searches for healthcare insurance, I want to show records in the following order in search results....

Help needed bubbling up relevant records with most recent date

Hi, I've got 5 records in a Lucene index. a. Record 1 contains: tax analysis. Date field value is March 2009. b. Record 2 contains: Senior tax analyst. Date field value is Aug 2009. c. Record 3 contains: Senior tax analyst. Date field value is July 2009. d. Record 4 contains: tax analyst. Date field value is Feb 2009. e. Record 5 contains: Senio...

lucene, or sql fulltext?

I want to create a search website to search docs (all kinds of formats including pdf), images, videos, and audio. I also want to be able to filter my search results based on some criteria like author name, date, etc. I'm doing this in .NET, so what's the easiest way to get up and running? SQL fulltext searching seems tempting becaus...

Zend_Search_Lucene query parsing problem

Here's the setup: I have a Lucene index and it works well with the 2,000 documents I have indexed. I have been using Luke (Lucene Index Toolbox, v.0.9.2) to debug queries, and am using ZF 1.9. The layout for my Lucene index is as follows: I = Indexed T = Tokenized S = Stored Fields: author - ITS category - ITS publication - ITS public...

PyLucene in Python 2.6 + Mac OS X Snow Leopard

Greetings, I'm trying to install PyLucene on my 32-bit Python running on Snow Leopard. I compiled JCC successfully, but I get warnings while making pylucene: ld: warning: in build/temp.macosx-10.6-i386-2.6/build/_lucene/__init__.o, file is not of required architecture ld: warning: in build/temp.macosx-10.6-i386-2.6/build/_lucene/__wrap0...

Lucene / Lucene.NET - Document.SetBoost() values???

I know it takes a float, but what are some typical values for various levels of boosting within a result? For example: if I wanted to boost a document's weighting by 10%, should I set it to 1.1? For 20%, then 1.2? What happens if I start setting boosts to values like 75.0 or 500.0? Edit: Fixed Formatting ...

Lucene stop phrases filter

I'm trying to write a filter for Lucene, similar to StopWordsFilter (thus implementing TokenFilter), but I need to remove phrases (sequence of tokens) instead of words. The "stop phrases" are represented themselves as a sequence of tokens: punctuation is not considered. I think I need to do some kind of buffering of the tokens in the t...

Help needed figuring out the reason for the "maxClauseCount is set to 1024" error

Hi, I have two sets of search indexes: TestIndex (used in our test environment) and ProdIndex (used in the PRODUCTION environment). The Lucene search query +date:[20090410184806 TO 20091007184806] works fine for the test index but gives this error message for the Prod index: "maxClauseCount is set to 1024". If I execute the following line just before execut...
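
One possible explanation, offered as an assumption rather than a diagnosis of this particular index: in older Lucene versions a range query is rewritten into a Boolean query with one clause per distinct indexed term inside the range, so an index holding more distinct timestamps in that range can exceed the clause limit while a smaller index does not. The sketch below is a plain-Python simulation with no Lucene calls; `count_range_terms` and the sample data are hypothetical.

```python
from datetime import datetime, timedelta

MAX_CLAUSE_COUNT = 1024  # the limit quoted in the error message


def count_range_terms(terms, low, high):
    """Count distinct terms that fall lexicographically inside [low, high]."""
    return len({t for t in terms if low <= t <= high})


# Hypothetical data: one document every hour for 100 days.
start = datetime(2009, 4, 11)
stamps = [start + timedelta(hours=h) for h in range(100 * 24)]

# The same dates indexed at second resolution and at day resolution.
second_terms = [s.strftime("%Y%m%d%H%M%S") for s in stamps]
day_terms = [s.strftime("%Y%m%d") for s in stamps]

second_clauses = count_range_terms(second_terms, "20090410184806", "20091007184806")
day_clauses = count_range_terms(day_terms, "20090410", "20091007")
```

In this simulation the second-resolution terms expand to more clauses than the limit allows, while the day-resolution terms stay well below it. If the assumption holds, indexing the date at a coarser resolution is one way to keep a range query under the limit.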

Lucene - Searching several terms in different fields

I have a Lucene index which populates from a database. I store/index some fields and then add a FullText field in which I index the contents of all the other fields, so I can do a general search. Now let's say I have a document with the following two fields: fld1 - "Samsung releases a new 22'' LCD screen" fld2 - "Sony Ericsson phone's b...

Is the Lucene 2.9 TokenStream API faster than the old one?

I have been looking at upgrading from 2.4 to 2.9 and noticed all the contrived code that handles attributes. Just wondering if anyone has any opinions on whether this will change, given it's a .9 and things will change when 3.0 is out. I am confused how creating attributes by reflection and stashing attributes in a map can be as performant as jus...

Lucene query - "Match exactly one of x, y, z"

I have a Lucene index that contains documents that have a "type" field. This field can have one of three values: "article", "forum" or "blog". I want the user to be able to search within these types (there is a checkbox for each document type). How do I create a Lucene query dependent on which types the user has selected? A couple of prere...

Solr DIH -- How to handle deleted documents?

I'm playing around with a Solr-powered search for my webapp, and I figured it'd be best to use the DataImportHandler to handle syncing with the app via the database. I like the elegance of just checking the last_updated_date field. Good stuff. However, I don't know how to handle deleting documents with this approach. The way I see it...

Retrieving per keyword/field match position in Lucene Solr -- possible?

Is there any way to retrieve the match field/position for each keyword for each matching document from Solr? For example, if the document has title "Retrieving per keyword/field match position in Lucene Solr -- possible?" and the query is "solr keyword", I'd like to get, in addition to the doc-id (I normally only want the doc-id, not th...

Querying lucene tokens without indexing

I am using Lucene (or more specifically Compass), to log threads in a forum and I need a way to extract the keywords behind the discussion. That said, I don't want to index every entry someone makes, but rather I'd have a list of 'keywords' that are relevant to a certain context and if the entry matches a keyword and is above a threshold...

Does Zend Lucene support MultiValued Fields?

I wanted to know if Zend Lucene supports multivalued fields. I tried passing an array to a field and it doesn't give any errors during indexing, but it's not returning any results when I search. Any help is appreciated. ...

Compass Autocomplete to only return index words

I am currently trying to configure a Compass query for autocomplete. I have it working so that the Compass query will return an object. I would like to modify it so that it will return matching words in the index, not matching results. Thanks. ...

Why is the analyzer defined globally in Zend.Search.Lucene?

I just noticed that the Zend Lucene implementation has a default analyzer that can be modified using Zend_Search_Lucene_Analysis_Analyzer::setDefault(), but I couldn't find a way to override that default when performing a query. Do I really need to reset the default analyzer if I'm working on multiple indexes, or am I missing a function? ...

How do I delete old documents from Lucene/Lucene.NET?

What is the idiomatic way to delete old documents from a Lucene Index? I have a date field (YYYYMMddhhmmss) on all of the documents, and I'd like to remove anything more than a day old (for example). Should I perform a filtered search or enumerate through the IndexReader's documents? I'm sure the question is the same regardless of whi...
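
A minimal sketch of the cutoff computation only, assuming the date field is a fixed-width, 24-hour timestamp so that comparing the strings is equivalent to comparing the dates. It makes no Lucene calls; `cutoff_stamp`, `is_expired` and the sample documents are hypothetical.

```python
from datetime import datetime, timedelta

# Assumed 24-hour reading of the YYYYMMddhhmmss layout from the question.
STAMP_FORMAT = "%Y%m%d%H%M%S"


def cutoff_stamp(now, max_age):
    """Return the oldest timestamp string that still counts as fresh."""
    return (now - max_age).strftime(STAMP_FORMAT)


def is_expired(doc_stamp, cutoff):
    """Fixed-width stamps compare correctly as plain strings."""
    return doc_stamp < cutoff


now = datetime(2009, 10, 7, 18, 48, 6)
cutoff = cutoff_stamp(now, timedelta(days=1))

# Hypothetical documents keyed by id, each with its date field value.
docs = {
    "a": "20091005120000",
    "b": "20091006184805",
    "c": "20091006184806",
    "d": "20091007090000",
}
expired = sorted(doc_id for doc_id, stamp in docs.items() if is_expired(stamp, cutoff))
```

The same cutoff string could serve as the upper bound of a range on the date field, selecting the documents to delete without enumerating the whole index.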