lucene

Lucene crawler (it needs to build lucene index)

Hi, I am looking for an Apache Lucene web crawler, written in Java if possible, or in any other language. The crawler must use Lucene and create valid Lucene index and document files, which is why Nutch, for example, is ruled out. Does anybody know whether such a web crawler exists and, if so, where I can find it? T...

Need to know pros and cons of using RAMDirectory

Hi, I need to improve the performance of my Lucene search query. Can I use RAMDirectory? Does it optimize performance? Is there any index size limit for it? I would appreciate it if someone could list the pros and cons of using a RAMDirectory. Thanks. ...

Lucene.Net TermQuery wildcard search

Hi, I have a Lucene index on which I am trying to do a wildcard search. The index contains a value like '234Test2343', and I am trying to search for it like %Test%. My Lucene code looks like: string catalogNumber="test"; Term searchTerm = new Term("FIELD", "*"+catalogNumber+"*"); Query query = new TermQuery(searchTerm); I don't get the results ...

Handling + as a special character in Lucene search

Hi, How do I make sure Lucene gives me back relevant search results when my input string contains terms like c++? Lucene seems to ignore the ++ characters. Code details: When I execute this line, I get a blank search query. queryField = multiFieldQueryParser.Parse(inpKeywords); keywordsQuery.Add(queryField, BooleanClause.Occur.SHOULD); ...
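One part of this problem can be sketched in isolation: `+` is an operator in the query parser syntax, so a term like c++ has to be backslash-escaped before it is parsed. The helper below is a minimal, hypothetical stand-in (the class name `QueryEscaper` and its character list are assumptions for illustration); Lucene's own `QueryParser.escape` does the same job and should be preferred where available. Note that escaping only gets the characters past the parser. An analyzer such as StandardAnalyzer may still strip the punctuation afterwards, so the field may also need an analyzer that keeps it.

```java
// Hypothetical helper, not part of Lucene: backslash-escapes characters
// that the classic query parser syntax treats as operators.
final class QueryEscaper {
    // Assumed set of special characters, based on the classic query syntax.
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&";

    static String escape(String input) {
        StringBuilder out = new StringBuilder(input.length() * 2);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                out.append('\\'); // prefix each special character with a backslash
            }
            out.append(c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("c++")); // prints c\+\+
    }
}
```

The escaped string would then be passed to the parser in place of the raw user input.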

Search filters with Lucene.NET

I'm using Lucene.Net to create a website to search books, articles, etc, stored as PDFs. I need to be able to filter my search results based on author name, for example. Can this be done with just Lucene? Or do I need a DB to store the filter fields for each document? Also, what's the best way to index my documents? I'll have about 5...

Resolving Lucene Index error

Hi, Why do I get an error like this in Lucene, and how do I resolve it? Could not find file 'C:\Indexes_z3_1.del'. Thanks. ...

Lucene Index and Query Design Question - Searching People

I have recently started working with Lucene (specifically, Lucene.Net) and have successfully created several indices and have no problem with any of them. Having previously worked with Endeca, I find that Lucene is lightweight, powerful, and has a much lower learning curve (due mostly to a concise API). However, I have one specif...

What Lucene analyzer can be used to handle Japanese text?

Good day, Which Lucene analyzer can be used to handle Japanese text properly? It should be able to handle Kanji, Hiragana, Katakana, Romaji, and any combination of them. Thanks, Franz ...

Linq to Lucene error: "Classes must define at least one field as a default search field"

I have the following attributes applied to my linq to sql class: [Document(MetadataType = typeof(SomeObjectMetadata))] public partial class SomeObject { } And this is the metadata code: public class SomeObjectMetadata { [Field(FieldIndex.Tokenized, FieldStore.Yes, IsKey = true)] private object ProductId { get; set; } ...

Inflectional forms of verbs using DBSight Lucene?

I know DBSight allows synonyms and stop words for searching, but does it take care of inflectional forms of a verb too? E.g. for 'swim' it should find swim, swims, swimming, swam, and swum. Link on DBSight Wiki: http://wiki.dbsight.com/index.php?title=User%5Fdictionary ...

Searching on date ranges with Lucene in Java?

Is it possible to search on date ranges using Lucene in Java? How do I build Lucene search queries based on date fields and date ranges? For example: between specified dates; prior to a specified date; after a specified date; within the last 24 hours; within the past week; within the past month. [Edit] I'm using Lucene 2.4.1 and my syste...
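A common approach with older Lucene versions is to index each date as a fixed-width, lexicographically sortable string (Lucene's `DateTools` produces keys of this kind) and then query it with an ordinary range. The sketch below only builds the key and the query string using the JDK; the class name `DateRangeQueries` and the field name `modified` in the usage note are hypothetical, and it assumes the field was indexed untokenized with the same yyyyMMddHHmmss UTC format.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical helper: builds range-query strings over sortable date keys.
final class DateRangeQueries {
    // Fixed-width UTC timestamp, so string order matches chronological order.
    private static final DateTimeFormatter SORTABLE =
        DateTimeFormatter.ofPattern("yyyyMMddHHmmss").withZone(ZoneOffset.UTC);

    static String key(Instant t) {
        return SORTABLE.format(t);
    }

    // Inclusive range in query parser syntax: field:[from TO to]
    static String between(String field, Instant from, Instant to) {
        return field + ":[" + key(from) + " TO " + key(to) + "]";
    }

    // "Within the last 24 hours / week / month" is a range ending at 'now'.
    static String withinLast(String field, Duration window, Instant now) {
        return between(field, now.minus(window), now);
    }
}
```

For example, `DateRangeQueries.withinLast("modified", Duration.ofHours(24), Instant.now())` would yield a query string covering the last 24 hours, which can be handed to a query parser or translated into an equivalent range query object.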

Boosting Multi-Value Fields

I have a set of documents containing scored items that I'd like to index. Our data structure looks like: Document (ID, Text, List<RelatedScore>); RelatedScore (ID, Score). My first thought was to add each RelatedScore as a multi-value field using the Boost property of the Field to modify the value of the particular score when searc...

Using Zend Lucene to search Office 2003 or older files

I know there are already objects supporting Office 2007 files, but is there any native support for Office 2003 or earlier? ...

Lucene.NET: Best way to process keyword snippets from document text

I'm using Lucene.Net to implement a search website (to search PDFs). Once the keyword is entered, I display the results, and when one of the result items is clicked, I want to take the user to a "details" page, where I want to display snippets from that PDF document everywhere the keyword is found. So my question is, what's the best way...

Lucene trouble with indexing recurring events

Hi, I'm trying to come up with a way to query dates in Lucene. Basically I have an event that has a start date and an end date and can also occur regularly. The way I tried to go about it was to create an index field in Lucene that would list all the possible dates separated by a comma (or empty space would be enough, really) and then apply ra...

Adding fuzziness to a Lucene query

Is there a simple way to add a fuzziness level to a user-entered search query in Lucene? I'd like to avoid having to parse their entered text if possible. At present, if they enter green boxes, I use a multifield query parser with boosts, which easily generates the following, for example: +(title:green^10 title:boxes^10) +(category:green...

Symfony and Lucene

SOLVED: See my answer below. Question left unchanged for anyone else who has trouble with this. I'd like to use Lucene (or anything else that could be used with symfony for searching, really); however, I can't get the sfLucene plugin to work (it says there are no tasks in the namespace "lucene" when I do ./symfony lucene:initialize). What ...

Problem with Lucene scoring

I have a problem with Lucene's scoring function that I can't figure out. So far, I've been able to write this code to reproduce it. package lucenebug; import java.util.Arrays; import java.util.List; import org.apache.lucene.analysis.SimpleAnalyzer; import org.apache.lucene.document.Document; import org.apache.lucene.document.Field; im...

Indexing services for websites: Lucene, and other options

Hey guys, just looking for some search and indexing services for our sites, and wondered if you could recommend anything? Our requirements: the service can index either via HTTP or via direct access to our database; it's gotta be really simple to use and set up; provide a simple API so we can get the results programmatically an...

How far behind the original is Lucene.Net?

I've noticed that Lucene recently released v2.9 (on 25th September this year, 2009), whereas Lucene.Net appears to be at v2.0 (released back in 2007). Does v2.0 of Lucene.Net correspond to the features found in v2.0 of the original Apache Lucene? Are the improvements made in Apache Lucene since 2007 significant enough to warrant consi...