In Lucene if you had an multiple indexes that covered only one partition each. Why does the same search on different indexes return results with different scores? The results from the different servers match exactly. i.e If I searched for:
Name - John Smith
DOB - 11/11/1934
Partition 0 would return a score of 0.345
Partition 1 would ...
I've had this long term issue in not quite understanding how to implement a decent Lucene sort or ranking. Say I have a list of cities and their populations. If someone searches "new" or "london" I want the list of prefix matches ordered by population, and I have that working with a prefix search and an sort by field reversed, where the...
I've had an app doing prefix searches for a while. Recently the index size was increased and it turned out that some prefixes were too darned numerous for lucene to handle. It kept throwing me a Too Many Clauses error, which was very frustrating as I kept looking at my JARs and confirming that none of the included code actually used a b...
I use solr to search for documents and when trying to search for documents using this query "id:*", I get this query parser exception telling that it cannot parse the query with * or ? as the first character.
HTTP Status 400 - org.apache.lucene.queryParser.ParseException: Cannot parse 'id:*': '*' or '?' not allowed as first character i...
I want to use Lucene (in particular, Lucene.NET) to search for email address domains.
E.g. I want to search for "@gmail.com" to find all emails sent to a gmail address.
Running a Lucene query for "*@gmail.com" results in an error, asterisks cannot be at the start of queries. Running a query for "@gmail.com" doesn't return any matches, ...
Assume that I have a field called price for the documents in Solr and I have that field faceted. I want to get the facets as ranges of values (eg: 0-100, 100-500, 500-1000, etc). How to do it?
I can specify the ranges beforehand, but I also want to know whether it is possible to calculate the ranges (say for 5 values) automatically base...
Was looking to get peoples thoughts on keeping a Lucene index up to date as changes are made to the domain model objects of an application.
The application in question is a Java/J2EE based web app that uses Hibernate. The way I currently have things working is that the Hibernate mapped model objects all implement a common "Indexable"...
What is the best full text search alternative to ms sql? (which works with ms sql)
I'm looking for something similar to Lucene and Lucene.NET but without the .NET and Java requirements. I would also like to find a solution that is usable in commercial applications.
...
We're currently using Lucene 2.1.0 for our site search and we've hit a difficult problem: one of our index fields is being ignored during a targeted search. Here is the code for adding the field to a document in our index:
// Add market_local to index
contactDocument.add(
new Field(
"market_local"
, StringUtils.objec...
Has someone successfully overridden the scoring of documents in a query so that the "relevancy" of a term to the field contents can be determined through one's own function? If so, was it by implementing a CustomScoreQuery and overriding the customScore(int, float, float)? I cannot seem to find a way to build either a custom sort or a cu...
Lucene has quite poor support for Russian language.
RussianAnalyzer (part of lucene-contrib) is of very low quality.
RussianStemmer module for Snowball is even worse. It does not recognize Russian text in Unicode strings, apparently assuming that some bizarre mix of Unicode and KOI8-R must be used instead.
Do you know any better solut...
Is there a known math formula that I can use to estimate the size of a new Lucene index? I know how many fields I want to have indexed, and the size of each field. And, I know how many items will be indexed. So, once these are processed by Lucene, how does it translate into bytes?
...
I've found how to sort query results by a given field in a Lucene.Net index instead of by score; all it takes is a field that is indexed but not tokenized. However, what I haven't been able to figure out is how to sort that field while ignoring stop words such as "a" and "the", so that the following book titles, for example, would sort i...
I've been using the (Java) Highlighter for Lucene (in the Sandbox package) for some time. However, this isn't really very accurate when it comes to matching the correct terms in search results - it works well for simple queries, for example searching for two separate words will highlight both code fragments in the results.
However, it d...
I am looking into mechanisms for better search capabilities against our database. It is currently a huge bottleneck (causing long-lasting queries that are hurting our database performance).
My boss wanted me to look into Solr, but on closer inspection, it seems we actually want some kind of DB integration mechanism with Lucene itself.
...
I'm using lucene in my project.
Here is my question:
should I use lucene to replace the whole search module which has been implemented with sql using a large number of like statement and accurate search by id or sth,
or should I just use lucene in fuzzy search(i mean full text search)?
...
We run full re-indexes every 7 days (i.e. creating the index from scratch) on our Lucene index and incremental indexes every 2 hours or so. Our index has around 700,000 documents and a full index takes around 17 hours (which isn't a problem).
When we do incremental indexes, we only index content that has changed in the past two hours, s...
I'm looking for a way to do query auto-completion/suggestions in Lucene. I've Googled around a bit and played around a bit, but all of the examples I've seen seem to be setting up filters in Solr. We don't use Solr and aren't planning to move to using Solr in the near future, and Solr is obviously just wrapping around Lucene anyway, so I...
What's your preferred method of providing a search facility on a website? Currently I prefer to use Lucene.net over Indexing Service / SQL Server full-text search (as there's nothing to set up server-side), but what other ways are being used out there?
...
I am trying to use Lucene Java 2.3.2 to implement search on a catalog of products. Apart from the regular fields for a product, there is field called 'Category'. A product can fall in multiple categories. Currently, I use FilteredQuery to search for the same search term with every Category to get the number of results per category.
This...