Good day,
Suppose I have documents with the following fields:
Person_name - Birthday
Jordan - 2009-06-15
Marc - 2009-01-01
Marcos - 2009-01-01
Marcissh_something_something - 2009-06-15
Marcos - 2009-12-31
When I search for Person_name:Marc*, I get the following scores (the scores here are hypothetical):
Person_name - ...
Background
I am assuming the following code is completely thread safe:
// Called from a servlet when a user action results in the index needing to be updated
public static void rebuildIndex() {
FSDirectory dir = new NIOFSDirectory(new File(Configuration.getAttachmentFolder()), null);
IndexWriter w = new IndexWriter(dir, analyzer, Index...
Does anybody use Katta with Java? Are any samples available?
...
I have an application that uses Lucene for searching. The search space is in the thousands of entries. Searching against these thousands, I get only a few results, around 20 (which is OK and expected).
However, when I reduce my search space to just those 20 entries (i.e. I indexed only those 20 entries and disregarded everything else...so that deve...
I want to learn Solr. Can anyone recommend some good tutorials or links?
Is Solr available for .NET too?
...
I am using the Java-based Nutch web-search software. In order to prevent duplicate (URL) results from being returned in my search query results, I am trying to remove (a.k.a. normalize) the expressions of 'jsessionid' from the URLs being indexed when running the Nutch crawler to index my intranet. However, my modifications to $NUTCH_HOME/...
Does Apache's Solr search engine provide approximate string matching, e.g. via the Levenshtein algorithm?
I'm looking for a way to find customers by last name, but I cannot guarantee that the names are spelled correctly. How can I configure Solr so that it finds the person
"Levenshtein" even if I search for "Levenstein"?
...
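To make the "Levenshtein" vs. "Levenstein" case concrete, here is a minimal plain-Java sketch of the edit distance the question refers to. It does not use Solr or Lucene; the class name `EditDistance`, the helper `fuzzyMatches`, and the `maxEdits` threshold are hypothetical names chosen for illustration. The Lucene query syntax that Solr builds on also has a fuzzy operator (a trailing `~` on a term) based on the same distance, which may be worth trying before writing anything by hand.

```java
import java.util.Locale;

// Hypothetical helper, not part of Solr or Lucene.
final class EditDistance {
    private EditDistance() {}

    /** Levenshtein distance computed with two rolling rows of the DP table. */
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        // Distance from the empty prefix of a to each prefix of b.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insertion = curr[j - 1] + 1;
                int deletion = prev[j] + 1;
                int substitution = prev[j - 1] + cost;
                curr[j] = Math.min(Math.min(insertion, deletion), substitution);
            }
            // Swap rows so that prev always holds the last completed row.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** Case-insensitive match that tolerates up to maxEdits single-character edits. */
    static boolean fuzzyMatches(String query, String candidate, int maxEdits) {
        return levenshtein(query.toLowerCase(Locale.ROOT),
                           candidate.toLowerCase(Locale.ROOT)) <= maxEdits;
    }
}
```

With this sketch, `EditDistance.fuzzyMatches("Levenstein", "Levenshtein", 1)` is true, because the two names differ by a single missing letter.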
I am using Lucene to index and search a small number of large documents. Using the demo from the Lucene site I have indexed the documents and am able to search them. However, the search results only point to the file containing the document, which is not particularly helpful when the documents are very large.
I am wondering if ...
Hi all,
I am having trouble understanding how Zend Search Lucene indexes and searches integers in ranges.
In the following example, I would expect the output to be 1, however it is always 2 (both results). Any hints would be much appreciated.
<?php
require_once 'Zend/Loader/Autoloader.php';
$loader = Zend_Load...
I'm writing a phonebook search that will query multiple remote sources, but I'm wondering how best to approach this task.
The easiest way to do this is to take the query, start a thread per remote source query (limiting max results to, say, 10), wait for the results from all threads, and aggregate the list into a total of 10 entr...
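The thread-per-source approach described above can be sketched with the JDK's `ExecutorService`. This is only an illustration under stated assumptions: each remote source is modeled as a `Callable<List<String>>`, the class name `FanOutSearch` and the `timeoutMillis` parameter are hypothetical, and results are merged in source order because the excerpt does not say how the final 10 entries should be ranked.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: queries every source in parallel and merges the results.
final class FanOutSearch {
    private FanOutSearch() {}

    static List<String> search(List<Callable<List<String>>> sources,
                               int maxResults,
                               long timeoutMillis) {
        List<String> merged = new ArrayList<>();
        if (sources.isEmpty()) {
            return merged;
        }
        // One thread per remote source, as in the question.
        ExecutorService pool = Executors.newFixedThreadPool(sources.size());
        try {
            // invokeAll blocks until every task finishes or the timeout expires;
            // tasks that did not finish in time come back cancelled.
            List<Future<List<String>>> futures =
                    pool.invokeAll(sources, timeoutMillis, TimeUnit.MILLISECONDS);
            for (Future<List<String>> future : futures) {
                if (future.isCancelled()) {
                    continue; // source was too slow, skip it
                }
                try {
                    for (String entry : future.get()) {
                        if (merged.size() == maxResults) {
                            return merged;
                        }
                        merged.add(entry);
                    }
                } catch (ExecutionException e) {
                    // This source failed; keep the results from the others.
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        } finally {
            pool.shutdownNow();
        }
        return merged;
    }
}
```

The timeout means one slow or unreachable source cannot hold up the whole search, and a source that throws is simply skipped. A real phonebook search would probably also want deduplication and some ranking across sources before cutting the list to 10.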
I have a social network set up, and I want to search its entries via an API. The social network's database is MySQL. I want the search to return results in the following order: results that match the query AND are friends of the user performing the search should be prioritized over results that simply match the query.
So can this...
In my project we use Lucene 2.4.1 for full-text search. This is a J2EE project, and the IndexSearcher is created once. In the background, the index is refreshed every couple of minutes (when the content changes). Users can search the index through a search mechanism on the page.
The problem is, the results returned by Lucene seem to be cached so...
Hi,
I'm using the Searchable plugin for Grails (which provides an API for Compass, which is itself an API over Lucene). I have an Order class that I would like to search, but I don't want to search all the instances of Order, just a subset of them. Something like this:
// This is a Hibernate/GORM call
List<Order> searchableOrders = Cus...
Hi all,
We have set up a Solr index containing 36 million documents (~1K-2K each), and we try to query a maximum of 100 documents matching a single simple keyword. This runs quickly, as we had hoped.
However, if we now add "&sort=createDate+desc" to the query (thus asking for the top 100 'new' documents matching the query) it run...
Hi,
In my Grails app, I'm using the Searchable plugin for searching/indexing. I want to write a Compass/Lucene query that involves multiple domain classes. Within that query, when I want to refer to the id of a class, I can't simply use 'id' because all classes have an 'id' property. Currently, I work around this problem by adding the fo...
Hi,
I'm trying to analyze content of a Drupal database for collective intelligence purposes.
So far I've been able to work out a simple example that tokenizes the various contents (mainly forum posts) and counts tokens after removing stop words.
The StandardTokenizer supplied with Lucene should be able to tokenize hostnames and emails b...
I am trying to get SpellChecker set up using Lucene.NET. It all works fine except in situations similar to the following:
I have text containing "satellite" in the index, which I analyze using Snowball.
I then create a SpellChecker index and get suggestions from it. The suggestion returned when I pass in "Satalite" is "satellit".
I...
Does anyone know how to make Lucene.NET 2.3.2 run in a medium trust environment? GoDaddy doesn't like it the way it is.
...
I'm having problems getting a simple URL to tokenize properly so that it can be searched as expected.
I'm indexing "http://news.bbc.co.uk/sport1/hi/football/internationals/8196322.stm" with the StandardAnalyzer, and it tokenizes the string as follows (debug output):
(http,0,4,type=<ALPHANUM>)
(news.bbc.co.uk,7,21,type=<HOST>)
(...
Hi
I have built an index in Lucene. Without specifying a query, I just want to get a score (cosine similarity or another distance?) between two documents in the index.
For example, I am getting the documents with ids 2 and 4 from a previously opened IndexReader ir:
Document d1 = ir.document(2);
Document d2 = ir.document(4);
How can I ge...
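As a point of reference for the cosine similarity mentioned above, here is a minimal plain-Java sketch over raw term counts. It is not Lucene's scoring and does not read from the index: the class name `CosineSimilarity` and the naive split-on-non-word-characters tokenizer are assumptions for illustration, and in practice the term counts would have to come from whatever the index stores for each document.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: cosine similarity between two bags of words.
final class CosineSimilarity {
    private CosineSimilarity() {}

    /** Counts lower-cased tokens, splitting on runs of non-word characters. */
    static Map<String, Integer> termFrequencies(String text) {
        Map<String, Integer> tf = new HashMap<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!token.isEmpty()) {
                tf.merge(token, 1, Integer::sum);
            }
        }
        return tf;
    }

    /** cos(a, b) = (a . b) / (|a| * |b|); returns 0 if either vector is empty. */
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0.0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            Integer other = b.get(e.getKey());
            if (other != null) {
                dot += (double) e.getValue() * other;
            }
        }
        double normA = norm(a);
        double normB = norm(b);
        return (normA == 0.0 || normB == 0.0) ? 0.0 : dot / (normA * normB);
    }

    private static double norm(Map<String, Integer> v) {
        double sumOfSquares = 0.0;
        for (int count : v.values()) {
            sumOfSquares += (double) count * count;
        }
        return Math.sqrt(sumOfSquares);
    }
}
```

Given the text of the two documents as strings, the score would be `CosineSimilarity.cosine(CosineSimilarity.termFrequencies(text1), CosineSimilarity.termFrequencies(text2))`. Raw counts favor frequent words, so weighting the counts (for example by inverse document frequency) is a common refinement.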