lucene-index

Error while copying Lucene index

I have an ASP.NET web application that uses the Lucene API for search. Here is the problem scenario. Events: (1) A user invokes a Lucene search query through the web application. (2) A separate Windows service copies the search index folder to another folder. When event 2 occurs after event 1, I get an error ...

Lucene: unstored fields.

Hello, I am wondering whether there is a way to read an unstored but indexed field in a Lucene index. I need this because I have an index and I'm going to iterate over all documents in it to apply some analysis, and I need to update those documents later. To update, I first need to delete and then re-insert the docume...

Why does 'delete document' in lucene 2.4 not work?

Hi, I want to delete a document in Lucene 2.4 with Java. My code is: Directory directory = FSDirectory.getDirectory("c:/index"); IndexReader indexReader = IndexReader.open(directory); System.out.println("num="+indexReader.maxDoc()); indexReader.deleteDocuments(new Term("name","1")); System.out.println("num="+indexReader.maxDoc()...

zend lucene problem with the word "mortgage"

I'm using the Porter stemmer to stem words, and here's a problem I'm running into: The word "mortgage" is correctly stemmed to "mortgag". The word "mortgagee" is (arguably incorrectly) stemmed to "mortgage". There are approximately 100 documents with the word "mortgage". There is 1 document with the word "mortgagee". When I build an index without put...

use compass-lucene as caching technique

Are there example scenarios, other than search, for which I could use Compass? Let's say we have a page that lists the top 10 most viewed articles. How can I use Compass to show this kind of result? Is there any demo/sample project on this to refer to? Jira would definitely be a good example, but its source code is not available. I want to know how to ma...

Will Lucene produce smaller file sizes if you index short one-character field names?

Does Lucene produce smaller file sizes if you index short, one-character field names? For instance, will using "d" instead of "description" produce significantly smaller index files on disk? Or does it map to shorter internal IDs anyway? :) ...

Lucene - querying with long strings

I have an index with a field "Affiliation". Some example values are: "Stanford University School of Medicine, Palo Alto, CA USA", "Institute of Neurobiology, School of Medicine, Stanford University, Palo Alto, CA", "School of Medicine, Harvard University, Boston MA", "Brigham & Women's, Harvard University School of Medicine, Boston, M...

OutOfMemoryError: Java heap space error when starting Solr

Hi, I started indexing DB articles with Solr, but after adding about 58 million articles (about 113 GB on disk), I get the error message below in the Tomcat log. Note 1: I already set the initial memory pool to 256 MB and the max memory pool to 1400 MB for the Tomcat server. Note 2: I can post or search articles, but must wait over 3 minutes for a response. ...

Exception when indexing text documents with Lucene, using SnowballAnalyzer for cleaning up

Hello! I am indexing documents with Lucene and am trying to apply the SnowballAnalyzer to remove punctuation and stopwords from the text. I keep getting the following error: IllegalAccessError: tried to access method org.apache.lucene.analysis.Tokenizer.<init>(Ljava/io/Reader;)V from class org.apache.lucene.analysis.snowball.Snowba...

Reading from a compressed Lucene index

I created a Lucene index and compressed the index directory with bz2 or zip. I do not want to uncompress it. Is there any API call that can read the index from this zipped directory and thus allow searching and other functionality? That is, can the Lucene IndexReader read the index from a compressed file? I saw that Lucene IndexReader ...

Refining Solr searches, getting exact matches?

Afternoon chaps. Right, I'm constructing a fairly complex (to me anyway) search system for a website using Solr, although this question is quite simple, I think... I have two search criteria, location and type. I want to return results that are exact matches on type (letter for letter, no exceptions) and "like" matches on location. My current sea...

Multiple or single index in Lucene?

I have to index different kinds of data (text documents, forum messages, user profile data, etc.) that should be searched together (i.e., a single search would return results of the different kinds of data). What are the advantages and disadvantages of having multiple indexes, one for each type of data? And the advantages and disadvantage...

"No enclosing instance" error while getting top term frequencies for a document from a Lucene index

Hello! I am trying to get the most frequently occurring terms for each document in a Lucene index. I am trying to set a threshold on the number of top occurring terms that I care about, maybe 20. However, I am getting the "no enclosing instance of type DisplayTermVectors is accessible" error when calling Comparator... So to this function I p...

Updating a Lucene index

What is the best way to update an existing Lucene index? I don't just have to add/delete documents from it, but rather update the existing documents. ...

Using Lucene to Query File properties in Windows

Hi all, I am planning to use Apache Lucene in one of my projects. I want to index files based on their file properties (I won’t be indexing the data), and I want Lucene to query the index so that I can quickly find a list of files based on those properties. E.g.: give me all the files with access time greater than 10/10/2005 and access t...

Denormalizing relational data for lucene/solr

I have an architectural question about using Apache Solr/Lucene. I'm building a Solr index for searching a CV database. Basically, every CV on there will have some fields like rate of pay, address, and title; these fields are straightforward. The areas I need advice on are skills and job history. For skills, someone might add an entry l...

Query types within Lucene

Lucene NOOB alert! I consider myself to be a human of at least reasonable intelligence; however, I am having enormous problems mentally grokking the query types within Lucene. In my particular instance I need to search a single string field in my document that is of only moderate length (avg around 50 chars). I want the user to be able...

Does it make sense to use Hadoop for import operations and Solr to provide a web interface?

I'm looking at the need to import a lot of data in real time into a Lucene index. This will consist of files of various formats (Doc, Docx, Pdf, etc.). The data will be imported as batches of compressed files, so they will need to be decompressed and indexed into an individual file, and somehow relate to the file batch as a whole. I'm ...

Lucene.Net BooleanClause issue

I'm having an issue with Lucene.Net and a BooleanQuery. This is my code: BooleanQuery query = new BooleanQuery(); String[] types = searchTypes.Split(','); foreach (string t in types) query.Add(new TermQuery(new Term("document type", t.ToLower())), BooleanClause.Occur.SHOULD); This should basically be an OR statement going ...

eXist XML database - full-text index using Lucene

Hi, I'm using Lucene to create an index in my eXist XML database. Now I need to get all index entries (I want to create tags). Should I use an XPath query, or do I need to write something in Java? ...