lucene

In a Lucene / Lucene.net search, how do I count the number of hits per document?

When searching a bunch of documents, I can easily find the number of documents which match my search criteria: Hits hits = Searcher.Search(query); int DocumentCount = hits.Length(); How do I determine the total number of hits within the documents? For example, let's say I search for "congress" and I get 2 documents back. How can I get...

Concurrency in Lucene.NET

I want to use Lucene.NET for full-text search shared between two apps: one is an ASP.NET MVC application and the other is a console application. Both applications are supposed to search and update the index. How should the concurrency be handled? I found a tutorial on ifdefined.com where a similar use case is discussed. My concern is ...

Counting sentences: Database (like h2) vs. Lucene vs. ?

Hi all, I am doing some linguistic research that depends on being able to query a corpus of 100 million sentences. The information I need from that corpus is along these lines: how many sentences had "john" as the first word, "went" as the second word and "hospital" as the fifth word, and so on. So I just need the count and don't need to actually retri...
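One way to picture the kind of count described above is a positional index that maps each (position, word) pair to the set of sentence ids with that word at that position, so that a query becomes a set intersection. The following is only a minimal Python sketch under that assumption; `build_positional_index` and `count_matches` are hypothetical names, the three-sentence corpus is made up for illustration, and nothing here reflects how H2 or Lucene store data internally.

```python
from collections import defaultdict


def build_positional_index(sentences):
    """Map (position, word) -> set of sentence ids. Positions are 1-based."""
    index = defaultdict(set)
    for sentence_id, sentence in enumerate(sentences):
        # Naive tokenization (lowercase + whitespace split) is an assumption.
        for position, word in enumerate(sentence.lower().split(), start=1):
            index[(position, word)].add(sentence_id)
    return index


def count_matches(index, constraints):
    """Count sentences that satisfy every {position: word} constraint."""
    if not constraints:
        raise ValueError("at least one constraint is required")
    postings = [
        index.get((position, word.lower()), set())
        for position, word in constraints.items()
    ]
    # Start from the smallest posting set to keep intermediate results small.
    postings.sort(key=len)
    result = set(postings[0])
    for posting in postings[1:]:
        result &= posting
    return len(result)


# Made-up corpus, for illustration only.
corpus = [
    "John went to the hospital yesterday",
    "John went home after the game",
    "Mary went to the hospital today",
]
index = build_positional_index(corpus)
print(count_matches(index, {1: "john", 2: "went", 5: "hospital"}))
```

Only the count is computed; the sentences themselves are never retrieved, which matches the stated need. Whether an in-memory structure like this scales to 100 million sentences is a separate question that the sketch does not address.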

ElasticSearch, Sphinx, Lucene, Solr, Xapian. Which fits which usage?

I'm currently looking at other search methods rather than having a huge SQL query. I saw ElasticSearch recently and played with Whoosh (a Python implementation of a search engine). Can you explain why you chose, or would choose, any of these projects? ...

Hibernate Search - searching in given scope.

Hi, let's say I have the following classes (only the most important things included): public class Client { /* Some Properties */ } public class ClientDocumentAssociation { @ManyToOne private Client client; /* Some Properties */ } @Indexed public class Document { @OneToOne private ClientDocumentAssociation clientAsso...

Combining Lucene's WildcardQuery with FuzzyQuery

Using Lucene.Net 2.4.0 is there some kind of built-in support for joining the results of two different queries that target the same index, similar to the support for targeting two or more indexes with a single query? I'm looking for ways to support both trailing wildcard and fuzzy searches without forcing users to choose one or the oth...

Indexing PDF files with Symfony using Lucene

I am a Symfony developer and my web server is Linux. I already use the sfLucene plugin. What is the simplest way of indexing PDF files for search on a Linux PHP server? (1) XPDF, installed like this; (2) Apache Tika via the SOLR sfLucene plugin branch; (3) a 3rd option? Thanks! ...

How would one use Lucene to help implement search on a site like StackOverflow?

I've asked a similar question on Meta StackOverflow, but that deals specifically with whether or not Lucene.NET is used on StackOverflow. The purpose of the question here is more of a hypothetical, as to what approaches one would take if they were to use Lucene.NET as a basis for in-site search and other factors in a site like StackOverfl...

Lucene.Net search "Local results" based on geographic location

Good afternoon, I'm looking for some info so that users can find local results in their Lucene.Net searches. I would index the latitude / longitude in the document, and query Lucene based on the user's latitude / longitude and a 20 (or 30, 40...) mile range. Any help would be appreciated. ...
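The radius test described above can be illustrated independently of Lucene.Net with a great-circle distance check. This is a minimal Python sketch, assuming each document carries hypothetical `lat` and `lon` fields in decimal degrees; `haversine_miles` and `within_radius` are illustrative names rather than part of any Lucene API, and a real index would normally narrow the candidates (for example with a bounding-box range query) before computing exact distances.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # approximate mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    )
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(min(1.0, a)))


def within_radius(docs, user_lat, user_lon, radius_miles):
    """Keep only documents whose stored point lies within radius_miles."""
    return [
        doc
        for doc in docs
        if haversine_miles(user_lat, user_lon, doc["lat"], doc["lon"]) <= radius_miles
    ]


# Made-up documents, for illustration only.
docs = [
    {"id": "near", "lat": 40.0, "lon": -75.0},
    {"id": "far", "lat": 41.0, "lon": -75.0},
]
print([doc["id"] for doc in within_radius(docs, 40.0, -75.0, 20)])
```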

Hibernate Search Paging + FullTextSearch + Criteria

I am trying to do a search with some criteria FullTextQuery fullTextQuery = fullTextSession.createFullTextQuery(finalQuery, KnowledgeBaseSolution.class).setCriteriaQuery(criteria); and then page it //Gives me around 700 results result.setResultCount(fullTextQuery.getResultSize()); //Some pages are empty fullTextQuery.setFirstResult(...

HTTP ERROR: 500 Severe errors in solr configuration.

Hi, I am trying to import data from MySQL following this link: http://www.cabotsolutions.com/blog/200905/using-solr-lucene-for-full-text-search-with-mysql/ I am getting the following error: HTTP ERROR: 500 Severe errors in solr configuration. Check your log files for more detailed information on what may be wrong. If you want solr to c...

Is there an open-search solution for Python?

Lucene-like would be preferred. Thanks ...

Better search results using Lucene

I've got a database with a lot of books in it. I've got fields like title, descriptions, authors etc. I'm indexing title with a boost of 100f and description with a boost of 0.1f, both fields tokenized and stemmed. I'm searching with a single input field that searches in all available fields using a BooleanQuery joined with BooleanCla...

Is it possible to iterate through documents stored in Lucene Index?

I have some documents stored in a Lucene index with a docId field. I want to get all docIds stored in the index. There is one complication: the number of documents is about 300,000, so I would prefer to get these docIds in chunks of 500. Is it possible to do so? ...
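The chunking part of this request is independent of how the ids are read from the index. As a minimal Python sketch, assuming the docIds are available as some iterable (the `range` below is only a stand-in, and `in_chunks` is a hypothetical helper rather than a Lucene call), fixed-size batches can be produced lazily like this:

```python
from itertools import islice


def in_chunks(iterable, size=500):
    """Yield successive lists of at most `size` items from `iterable`."""
    if size <= 0:
        raise ValueError("size must be positive")
    iterator = iter(iterable)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


# Stand-in for the docIds read from the index (assumption for illustration).
all_doc_ids = range(1200)
for chunk in in_chunks(all_doc_ids, 500):
    print(len(chunk), chunk[0], chunk[-1])
```

Because the helper consumes the source lazily, only one chunk is held in memory at a time.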

Solr/Lucene: Indexing facet values

For example, say I have the following facet: Colors: Red (7825), Orange (2343), Green (843), Blue (5412). In my database, colors would be a table and each color would have a primary key and a name/value. When indexing with Solr/Lucene, in all of the examples I've seen, the value is indexed and not the primary key. So if I filter by the ...

Spelling correction for data normalization in Java

I am looking for a Java library to do some initial spell checking / data normalization on user generated text content, imagine the interests entered in a Facebook profile. This text will be tokenized at some point (before or after spell correction, whatever works better) and some of it used as keys to search for (exact match). It would ...

Zend_Search_Lucene failing to return documents

Hi, I am struggling with a bug when using Zend_Search_Lucene. I search 2 indexes. One holds parsed HTML pages/text: I use the Zend_Search_Lucene_Document_Html::loadHTML() function to read the contents and add them to that index. For the other index I manually create a lu...

Memory leak during repeated lucene query searches?

Hi all, Basically, I simply want to do many searches on a given lucene index. Therefore, I made a class Data with final 'analyzer', 'reader', 'searcher' and 'parser' fields, (all properly initialized in the constructor). The class also provides a 'search' method to search the index. This is all shown in the code below. The problem is...

Lucene BooleanQuery problem

Hi, I am searching in Lucene with an "equals" operator implemented like: return new TermQuery(new Term(getName(), getValue())); for a value like: customerID:YADA-UT-08ec5de9-8813-4361-be88-55695ddfaa00 This is working. BUT, if I use an "in" operator implemented with a BooleanQuery like: final BooleanQuery booleanQuery = new BooleanQ...

Asp.MVC and nHibernate and Lucene question

Hi, I have an Asp.Net MVC app and I am looking into implementing a search engine that will search for individuals. I would like to use Nhibernate Search & Lucene.Net as this will keep the index in sync when an individual is inserted or updated, resulting in changes being visible when a user runs a search. The issue I have is what if m...