lucene

Using stop words with WhitespaceAnalyzer

Lucene's StandardAnalyzer removes dots from strings/acronyms when indexing them. I want Lucene to retain dots, and hence I'm using the WhitespaceAnalyzer class. I can give my list of stop words to StandardAnalyzer, but how do I give it to WhitespaceAnalyzer? Thanks for reading. ...
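The behaviour being asked for (split on whitespace only, keep dots, drop stop words) can be illustrated without any Lucene classes. The following is a minimal plain-Java sketch of that idea, not Lucene API: the class name, the method name and the stop word list are all hypothetical, chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class WhitespaceStopWordSketch {
    // Hypothetical stop word list, for illustration only.
    static final Set<String> STOP_WORDS = new HashSet<>(Arrays.asList("the", "a", "is"));

    // Split on whitespace only, so dots inside tokens such as "U.S.A." survive,
    // then drop every token whose lowercase form is a stop word.
    static List<String> tokenize(String text) {
        List<String> kept = new ArrayList<>();
        for (String token : text.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // an empty input yields one empty token; skip it
            }
            if (!STOP_WORDS.contains(token.toLowerCase(Locale.ROOT))) {
                kept.add(token);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // prints [U.S.A., country]
        System.out.println(tokenize("The U.S.A. is a country"));
    }
}
```

In Lucene itself the same effect would presumably come from a custom analyzer that chains a whitespace tokenizer with a stop word filter; the exact classes and constructors depend on the Lucene version in use.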

Closing indexreader

I've a line in my Lucene code: try { searcher.GetIndexReader(); } catch(Exception ex) { throw ex; } finally { if (searcher != null) { searcher.Close(); } } In my finally clause, when I execute searcher.Close(), will it also execute searcher.GetIndexReader().Close behind the scenes? Or do I need to explicit...

Using IndexReader IsLocked and Unlock methods

Before calling AddDocument() on IndexWriter, is it OK if I call IndexReader.IsLocked(myDirectory) and, if it returns true, then call IndexReader.Unlock(myDirectory)? Please suggest. That is: if(IndexReader.IsLocked(myDirectory)) { IndexReader.Unlock(myDirectory); } writer = new IndexWriter(myDirectory, _analyzer, true); writer.AddDocument...

How to find related items by tags in Lucene.NET

My indexed documents have a field containing a pipe-delimited set of ids: a845497737704e8ab439dd410e7f1328| 0a2d7192f75148cca89b6df58fcf2e54| 204fce58c936434598f7bd7eccf11771 (ignore line breaks) This field represents a list of tags. The list may contain 0 to n tag Ids. When users of my site view a particular document, I want to dis...
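One way to think about "related by tags" is to count how many tag IDs two documents share. The sketch below is plain Java, not Lucene.NET API, and only illustrates parsing the pipe-delimited field and scoring the overlap; the class and method names are hypothetical.

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class RelatedByTagsSketch {
    // Parse a pipe-delimited tag field into a set of IDs,
    // trimming whitespace and ignoring empty entries.
    static Set<String> parseTags(String field) {
        Set<String> tags = new LinkedHashSet<>();
        for (String id : field.split("\\|")) {
            String trimmed = id.trim();
            if (!trimmed.isEmpty()) {
                tags.add(trimmed);
            }
        }
        return tags;
    }

    // Number of tag IDs that appear in both fields; a higher count
    // is treated here as "more related".
    static int sharedTagCount(String fieldA, String fieldB) {
        Set<String> shared = parseTags(fieldA);
        shared.retainAll(parseTags(fieldB));
        return shared.size();
    }

    public static void main(String[] args) {
        String current = "a845497737704e8ab439dd410e7f1328| 0a2d7192f75148cca89b6df58fcf2e54";
        String other = "0a2d7192f75148cca89b6df58fcf2e54| 204fce58c936434598f7bd7eccf11771";
        System.out.println(sharedTagCount(current, other)); // prints 1
    }
}
```

Inside an index, a comparable ranking would likely rely on each tag ID being indexed as its own token, so that a query made of the current document's tag IDs scores documents by how many they share.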

Zend_Search_Lucene on Leopard: problem

Leopard 10.5.6, MacBook, Zend 1.6, Apache 2, PHP 5.2.5. I cannot seem to do indexing using the Zend_Search_Lucene API. Building or opening indices generates the following exception message: string(30) "Wrong segments.gen file format" However, the indices/segments files were copied via scp from a working version of my site and I've chmoded them all...

Nested prohibit/require operators in Lucene search queries

I am using Lucene for Java, and need to figure out what the engine does when I execute some obscure queries. Take the following query: +(foo -bar) If I use QueryParser to parse the input, I get a BooleanQuery object that looks like this: org.apache.lucene.search.BooleanQuery: org.apache.lucene.search.BooleanClause(required=true,...
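As a way to reason about +(foo -bar), the sketch below models the matching rule in plain Java rather than calling Lucene. It assumes the usual boolean query semantics: inside the group, foo is optional and bar is prohibited, and a group with no required clause needs at least one optional clause to match. The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class NestedBooleanSketch {
    // Inner group (foo -bar): "foo" is optional, "bar" is prohibited.
    // With no required clause in the group, at least one optional clause
    // must match, so "foo" has to be present and "bar" absent.
    static boolean matchesInner(Set<String> docTerms) {
        boolean optionalHit = docTerms.contains("foo");
        boolean prohibitedHit = docTerms.contains("bar");
        return optionalHit && !prohibitedHit;
    }

    // The outer "+" marks the whole group as required, so the document
    // matches exactly when the inner group matches.
    static boolean matches(Set<String> docTerms) {
        return matchesInner(docTerms);
    }

    public static void main(String[] args) {
        System.out.println(matches(new HashSet<>(Arrays.asList("foo", "baz")))); // prints true
        System.out.println(matches(new HashSet<>(Arrays.asList("foo", "bar")))); // prints false
    }
}
```

Under these assumptions the query selects the same documents as +foo -bar, although scoring may differ.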

Exact phrase search using Lucene.net

I am having trouble searching for an exact phrase using Lucene.NET 2.0.0.4. For example, I am searching for "scope attribute sets the variable" (including quotes) but receive no matches, even though I have confirmed 100% that the phrase exists. Can anyone suggest where I am going wrong? Is this even supported with Lucene.NET? As usual the API...

Is there any recommended IndexSearcher method?

I'm using the Lucene search API in a web-based application. Which method of Lucene's IndexSearcher class is recommended? Is any method faster than the others? 1. IndexSearcher(Directory directory) 2. IndexSearcher(IndexReader r) 3. IndexSearcher(String path) Thanks for reading. ...

Reusing IndexSearcher

Hi, I am using Lucene in a web-based application and want to reuse the same instance of IndexSearcher for all the incoming requests. Does this logic (using C#) make sense? Please suggest. DateTime lastWriteTime = System.IO.Directory.GetLastWriteTime(myIndexFolderPath); if (HttpRuntime.Cache["myIndexSearcher"] == null) //Cache is empty {...

Can anyone suggest some good tutorials for Lucene?

Hello friends, can anyone suggest some good tutorials on Lucene? I was reading Lucene in Action, but it seems to cover an old version of Lucene. Most of the methods are deprecated. Where to start? I am googling around a bit. Thanks, Kapil ...

Does Solr facet on empty strings?

Last time I made a Solr index, it started indexing and faceting on empty strings too. This never happened before. Is it the right behaviour? Should I filter empty strings in the DIH? Thanks. ...

Terracotta and Hibernate Search

Does anyone have experience with using Terracotta with Hibernate Search to satisfy application Queries? If so: What magnitude of "object updates" can it handle? (How's the performance) What kind of performance do the Queries have? Is it possible to use Terracotta Hibernate Search without even having a backing Database to sat...

Best practices for searching for alternate forms of a word with Lucene

I have a site which is searchable using Lucene. I've noticed from logs that users sometimes don't find what they're looking for because they enter a singular term, but only the plural version of that term is used on the site. I would like the search to find uses of other forms of a word as well. This is a problem that I'm sure has bee...

Which field had my search text in Lucene when using a MultiFieldQueryParser?

I'm using Lucene.Net's MultiFieldQueryParser to search multiple fields in my documents. I want to find out which field the text was found in. For example, my search might look like this: var parser = new MultiFieldQueryParser(new string[] {"question","answer"}, analyzer); var query = parser.Parse(searchphrase); for(int idx=0; idx<hits.Len...

Problem using same instance of indexSearcher for multiple requests

Hi, I am using the Lucene API in a .NET web application. I want to use the same instance of IndexSearcher for all the requests. Hence I am storing the IndexSearcher instance in the HTTP cache. Here is my code for the same: if (HttpRuntime.Cache["IndexSearcher"] == null) { searcher = new IndexSearcher(jobIndexFolderP...

Asynchronous Search

I am currently working on building a proof of concept search solution for my company using Lucene and Hibernate Search. I have built individual components which work fine. I am now looking at creating a single API that would allow a user to get search results back from different sources (domain + data). What I would like to achieve is so...

No Hits Found Using Zend Lucene Search

So I've been working on a crawler script to index all the pages on my site using Zend Lucene search. I've been able to get the script to work, but for some reason it will not find the other links on the pages. The problem seems to be when the script hits the find method: $hits = $index->find('url:'.$targets[$i]); When I execute the sc...

Using JBoss Cache as directory for Apache Lucene

Has anyone tried to store a Lucene index in JBoss Cache? Are there any good implementations of Lucene Directory for it? I found sources only for this one but I can't find any documentation or testimonials on it. Basically what I would like to do is to store a Lucene index in JBoss Cache and manipulate it with an application written with GridGa...

How do you search zipcodes using Zend Lucene?

I have a very simple company index built with Zend Lucene, using this to create the index: // store company primary key to identify it in the search results $doc->addField(Zend_Search_Lucene_Field::Keyword('pk', $this->getId())); // index company fields $doc->addField(Zend_Search_Lucene_Field::Unstored('zipcode', $this->getZipcode(), 'utf-8'));...

Indexing data in Hibernate Search

I have just started integrating Hibernate Search with my Hibernate application. The data is indexed using the Hibernate Session every time I start the server. FullTextSession fullTextSession = Search.getFullTextSession(session); Transaction tx = fullTextSession.beginTransaction(); List books = session.createQuery("from Book as book").list(); for...