lucene

Semantic analysis using Solr

I'm considering adding semantic analysis to my Solr installation, but I don't exactly know where to start. Basically, I'd like Solr to be able to find "similar" words (taken from the body of the indexed documents). For example, if I search for "music", I should be able to query the semantic engine and obtain "rock", "pop", etc. (o...

Lucene foreign chars problem

I'm having some serious issues using Zend_Lucene and foreign characters like åäö. These issues appear both when the index is created and when it's queried. I've tried both iso-8859-1 and utf-8. ISO-8859-1 The query that doesn't work looks like "+_area:skåne". With Zend_Lucene I'm getting no matches, but if I run this query in Luke I ge...

Are there any technologies that help develop website search?

Hi guys, PROBLEM: I need to write advanced search functionality for a website. All the data is stored in MySQL and I'm using Zend Framework on top. I know that I can write a script that takes the search page and builds an SQL query out of it, but this becomes extremely slow if there are a lot of hits. Then I would have to get down to t...

Good open source Java shopping cart frameworks that can be extended to use Lucene and PDFBox

Hi, I'm looking for a Java open source shopping cart framework. I have had a look at KonaKart and OFBiz, but I'm looking for other examples for comparison. I will need the search on the cart to use Lucene to search through the PDFs so that keywords from the documents can return results. So I will need to be able to "hook" into the ...

Lucene "Or Queries"

Hi, I am new to Lucene. I am trying to write a search like this: content="some text" and (id="A" or id="B" or id="c"). I am really lost with that; could you help me? Thank you. ...

Lucene query syntax

Hi, I want to write a Lucene query which is the equivalent of the following SQL where age = 25 and name in ("tom", "dick", "harry") The best I've come up with so far is: (age:25 name:tom) OR (age:25 name:dick) OR (age:25 name:harry) Is there a more succinct way to write this? Thanks, Don ...
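One possible shorter form, assuming the classic Lucene query parser syntax (boolean operators plus parenthesised groups), is to state the `age` clause once and group the `name` alternatives. The sketch below only builds the query string; the helper names `in_clause` and `and_query` are hypothetical and not part of any Lucene API.

```python
def in_clause(field, values):
    """Build a grouped OR clause, e.g. (name:tom OR name:dick)."""
    return "(" + " OR ".join(f"{field}:{value}" for value in values) + ")"


def and_query(*clauses):
    """Join clauses so that every one of them is required."""
    return " AND ".join(clauses)


# SQL: where age = 25 and name in ("tom", "dick", "harry")
query = and_query("age:25", in_clause("name", ["tom", "dick", "harry"]))
print(query)  # age:25 AND (name:tom OR name:dick OR name:harry)
```

The same grouping pattern would cover a required phrase combined with a set of alternative ids.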

Storing words with apostrophe in Lucene index

Hi, I have a company field in a Lucene index. One of the company names indexed is: Moody's. When the user types any of the following keywords, I want this company to come up in the search results: 1. Moo 2. Mood 3. Moodys 4. Moody's. How should I store this in the Lucene index, and what type of Lucene query should I use to get this behaviour? Thanks. ...

How to use Lucene across multiple websites

I've got four websites that are edited via one CMS (hanging off one of the sites) like this: www.domain1.com www.domain2.com www.domain3.com www.domain4.com www.domain4.com/cms I'll be using Lucene to index the textual content (from database and uploaded documents) of all four sites. The index will have to be available to both the CMS...

How do I add an EdgeNGramTokenFilter to a Compass Query?

I am building some auto-complete functionality using Compass, and I need to add an EdgeNGramTokenFilter to the Compass query, but I cannot see how to add it. Is this possible? ...

Lucene Query Syntax

Hi, I'm trying to use Lucene to query a domain that has the following structure Student 1-------* Attendance *---------1 Course The data in the domain is summarised below Course.name Attendance.mandatory Student.name ------------------------------------------------- cooking N Bob art Y ...

Best practices for implementing a Lucene search in asp.net eCommerce site

I've been tasked with setting up a search service on an eCommerce site. Currently, it uses full-text indexing on SQL Server, which isn't ideal, as it's slow and not all that flexible. How would you suggest I approach changing this over to Lucene? By that, I mean, how would I initially load all the data into the indexes, and how would I...

How to get the next term out of a Lucene index?

I'm starting from a Lucene index which someone else created. I'd like to find all of the words that follow a given word. I've extracted the term (org.apache.lucene.index.Term) of interest from the index, and I can find the documents which contain that term: segmentTermDocs = segmentReader.termDocs(term); while (segmentTermDocs.next) {...

How can I order the list in Lucene Search according to the number of hits?

Hi all, I am using Lucene Search to get the articles that match the search text. Is there any way to get them in ascending order of the number of hits in the article? Example: If my search text is stack and in the first article there are two occurrences of the word stack and in the second article there are three occurrences of stack the...

How to get a list of all search keywords in Lucene?

I need a list of all the search keywords (terms) that are indexed in a Lucene index. I googled for it, but I didn't find a solution. Is it possible to get a list of all search terms? ...

Question regarding Lucene scoring

I have a question regarding Lucene scoring. I have two documents in the index: one contains "my name" and the other contains "my first name". When I search for the keyword "my name", the second document is listed above the first one. What I want is that if a document contains the exact keyword I typed, it should be listed first, then the o...

Lucene Error While Reading binary block : java.io.EOFException

Hi, I am getting a java.io.EOFException while reading a binary block from a Lucene index. I am storing a Java object as a byte array in a Lucene index field and reading it when a hit occurs. Here is the stack trace: Caused by: java.io.EOFException at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2281) at java.io....

Relevancy of Lucene search results

Hi, I have the following 3 documents in a Lucene index. As MBA you will play an integral role in implementing the strategy of the business and will have the responsibilities of the statutory accounts, compliance, audit including banking relationships, tax, treasury & cash management As M.B.A. you will play an integral role in implementing the...

Lucene .NET result subsets

I am using Lucene.NET. Let's say I want to return only 50 results starting at result 100; how might I go about that? I've searched the docs but am not finding anything. Is there something I'm missing? ...
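One common approach, sketched here under the assumption that the search call returns hits in rank order and can be capped at a top-N count, is to request `start + count` hits and skip the first `start`. The code is a library-independent illustration in Python; `fetch_window` and `page_of_hits` are hypothetical helpers, not Lucene.NET calls.

```python
def fetch_window(start, count):
    """Number of top-ranked hits to request so the wanted page is covered."""
    return start + count


def page_of_hits(ranked_hits, start, count):
    """Return `count` hits beginning at zero-based position `start`."""
    return ranked_hits[start:start + count]


# Stand-in for the ranked hits a search capped at fetch_window(100, 50) might return.
ranked_hits = [f"doc-{i}" for i in range(fetch_window(100, 50))]
page = page_of_hits(ranked_hits, 100, 50)
print(len(page), page[0], page[-1])  # 50 doc-100 doc-149
```

The cost of this scheme grows with the offset, since every hit before the requested page still has to be collected and then discarded.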

Zend_Search_Lucene Help

EDIT: I have managed to solve the problem by using: +"lorem ipsum" +type:photo +"lorem ipsum" +type:video Another problem though is that the index is returning correct results but with wrong id (id is a primary key). More specifically, id fields returned are 1 less than real ids (id - 1) in the database which I use to build the index...

Synonyms using Lucene

Hi, what is the best way to handle synonyms (phrases) using Lucene? Especially when I need to execute queries like: a OR b OR c NOT d. How about adding a new field called "synonyms" to each document while indexing? This field's value would have a list of all synonyms. It would be added to a document only when that document has any of ...