lucene

How to do partial word searches in Lucene.NET?

I have a relatively small index containing around 4,000 locations. Among other things, I'm using it to populate an autocomplete field on a search form. My index contains documents with a Location field containing values like "Ohio", "Dayton, Ohio", "Dublin, Ohio", and "Columbus, Ohio". I want to be able to type in "ohi" and have all of these re...

Mahout Lucene document clustering how-to?

I've read that I can create Mahout vectors from a Lucene index, which can then be used with the Mahout clustering algorithms. http://cwiki.apache.org/confluence/display/MAHOUT/Creating+Vectors+from+Text I would like to apply the K-means clustering algorithm to the documents in my Lucene index, but it is not clear how I can apply this algorit...

How to call a Zend Lucene search function?

I inherited a Zend project devoid of comments and I didn't get to talk to the previous developer. Since I have no Zend experience I'm having some issues :) I'd like to print out some variables inside a function that indexes items from the site using Zend_Search_Lucene because I think something is going wrong here. From what I've read...

PyLucene Eclipse plugin

Is there a PyLucene Eclipse plugin, or am I missing something? I want it for autocomplete. Is the import structure the same as in Java Lucene ...

How do you boost term relevance in SQL Server Full Text Search like you can in Lucene?

I'm doing a typical full text search with containstable using 'ISABOUT(term1,term2,term3)', and although it supports term weighting, that's not what I need. I need the ability to boost the relevancy of terms contained in certain portions of text. For example, it is customary for meta tags or the page title to be weighted differently than bod...

Get similarity score between two document termfreqvectors

Hi, I would like to extract the similarity score between two document termfreqvectors. I found that if I submit the first one as a query and look for the second in the result set, I do not get the precise score that Lucene gives for these two vectors. Any help? ...

Lucene .NET 2.3.2 Security Exception - Medium trust Issues

I'm only partially able to get Lucene .NET to work on GoDaddy. It throws a security exception on this line: Hits hits = searcher.Search(query, filter); Here are the details of this exception: Description: The application attempted to perform an operation not allowed by the security policy. To grant this application the required perm...

Cross Referencing Databases on Fuzzy Data

I am currently working on a project where I have to match up a large quantity of user-generated names with a separate list of the same names in a canonical format. The problem is that the user-generated names contain numerous misspellings, abbreviations, as well as simply invalid data, making it hard to do a cross-reference with the canon...

Filters in Lucene

Friends, I am new to Lucene full-text search. I have developed a page with full-text search, and it has worked fine so far. But now I want to add an extra condition, like a WHERE clause. How do I do it? The requirement given to me is that I have to list the proposals created by the logged-in user. I have to add this condition in the back end without the user's knowledg...

Best Practices for implementing a Lucene Search in Java

Each document in my Lucene index is similar to a post on Stack Overflow, and I am trying to search through the index (which contains millions of documents). Each user should be able to search only through posts from the user's own company. I have no control over how the data is indexed and I only need to implement a simple search (tha...

Grails searchable plugin

Hi, In my Grails app, I have the following domain class that is indexed by the Searchable plugin: class Foo implements Serializable { BookletCategory bookletCategory Date lastUpdated static hasMany = [details: BookletRootDetail] static searchable = { bookletCategory component: true id name: 'bookletRoo...

Lucene searching by numeric values

I'm building a Java Lucene-based search system that, on addition, adds a certain number of meta-fields, one of which is a sourceId field, which denotes where the entry came from. I'm now trying to retrieve all documents from a particular source, but the index doesn't appear to be able to find them. However, if I search for a wildcard va...

How to import MySQL tables into Solr

I can never understand how Solr works. The documentation talks about schema files all the way through, but how do I import content from the database into it painlessly? I have tried to figure it out by reading their tutorials, but they just mess with my head. It's written for the Einsteins out there, because apparently there are a lot of people who als...

Cheat sheets for Lucene/Solr?

Is there any cheat sheet out there for Lucene/Solr query parameters, schema.xml elements (all the analyzers, tokenizers, etc.)? Or somewhere else I can find ALL query parameters? I can't find any with Google. ...

How can I index HTML documents?

I am using Lucene.NET to do full-text searching. Until now I have been indexing PDF docs, but now I have a few webpages that I need to index. What's the best/easiest way to index HTML documents to add to my Lucene index? I am using .NET/C# ...

Get search word hits (number of occurrences) per document in Lucene

Hi, can anyone suggest the best way to get the hits (number of occurrences) of a word per document in Lucene? ...

How to programmatically add an index column for NHibernate Search (Lucene.net) without using FieldAttribute

I'm trying to find out how to programmatically (i.e. without using the FieldAttribute) add an index column for NHibernate Search (Lucene.net). I'm having inheritance issues due to the fact that the FieldAttribute is not automatically inherited. The following code illustrates what I want to do. class A { [Field(Index.Tokenized)] ...

Zend Lucene problem with the word "mortgage"

I'm using Porter Stemmer to stem the words, and here's a problem I'm running into: The word "mortgage" is correctly stemmed to "mortgag". The word "mortgagee" is (arguably incorrectly) stemmed to "mortgage". There are approximately 100 documents with the word "mortgage". There is 1 document with the word "mortgagee". When I build an index without put...

Adding custom Analyzers to Luke

Luke, the wonderful Lucene index viewer, is now hosted on Google Code. By default, it supports several Lucene Analyzers out of the box. However, I would like to use it to view an index I built using my own custom Analyzer; let's call it MyAnalyzer. Can you please tell me how to add MyAnalyzer to Luke, along with the default an...

Lucene Query WITHOUT Operators

I am trying to use Lucene to search for names in a database. However, some of the names contain words like "NOT" and "OR" and even "-" minus symbols. I still want the different tokens inside the names to be broken up using an Analyzer and searched upon as a boolean combination of terms, but I do not want Lucene to interpret any of the "N...