lucene

[LUCENE.NET] How to use a field from Index to delete an entry?

I'm developing a desktop search engine in VB 9 using Lucene.NET. I wish to delete the existing entry and create a new one whenever a file is updated. The index stores the complete file path and the last modified date: doc.Add(New Field("path", filepath, Field.Store.YES, Field.Index.UN_TOKENIZED)) doc.Add(New Field("modified", New FileInfo(filepath).LastWri...

How do I sort Lucene results by field value using a HitCollector?

I'm using the following code to execute a query in Lucene.Net: var collector = new GroupingHitCollector(searcher.GetIndexReader()); searcher.Search(myQuery, collector); resultsCount = collector.Hits.Count; How do I sort these search results based on a field? Update: Thanks for your answer. I had tried using TopFieldDocCollector but ...

SQL Server 2008 Full Text Search (FTS) versus Lucene.NET

I know there have been questions in the past about SQL Server 2005 versus Lucene.NET, but since 2008 came out with a lot of changes, I was wondering if anyone can give me pros/cons (or link to an article). ...

Does anyone know where to find decent documentation on the web describing the Lucene index format IN DETAIL?

I am mainly curious as to the inner workings of the engine itself. I couldn't find anything about the index format itself (i.e., in detail, as though you were going to build your own compatible implementation) and how it works. I have poked through the code, but it's a little large to swallow for what must be described somewhere since there ar...

How do I store the Lucene index in a database?

This is my sample code: MysqlDataSource dataSource = new MysqlDataSource(); dataSource.setUser("root"); dataSource.setPassword("ncl"); dataSource.setDatabaseName("userdb"); dataSource.setEmulateLocators(true); //This is important because we are dealing with a blob type data field try{ JdbcDirectory jdbcDir = new JdbcDirectory(d...

Is SQL Server's Full Text Search the right tool for searching phrases, not documents?

I have 30 million distinct phrases, not documents, ranging from one word to a 10-word sentence, and I need to support word/phrase searching. Basically, what where contains(phrase, "'book' or 'stack overflow'") offers. I have an instance of SQL Server 2005 (32-bit, 4 processors, 4 GB) going against several full-text catalogs and performance is awful for...

What is the proper dependency entry in pom.xml to use the Snowball analyzer with Lucene 2.4.0?

I'm trying to swap in the SnowballAnalyzer for StandardAnalyzer on my Maven 2 project. I'm currently using <dependency> <groupId>org.apache.lucene</groupId> <artifactId>lucene-contrib</artifactId> <version>2.4.0</version> <scope>compile</scope> </dependency> but I keep getting the following erro...

Python file indexing and searching

I have a large set of files (HDF) that I need to make searchable. For Java I would use Lucene for this, as it's a file and document indexing engine. I don't know what the Python equivalent would be, though. Can anyone recommend which library I should use for indexing a large collection of files for fast search? Or is the preferred way ...

Stop words in "All of the words" feature

Hi, I'm working on an "all of these words" feature using Lucene. I'm using StandardAnalyzer without any stop words. When the user types in words such as "the", "and", etc., Lucene does not return any results. If I remove the stop words from the input, then Lucene gives search results. I am using BooleanQuery with BooleanClause.Occur.MUST cla...

Fluent NHibernate + Lucene Search (NHibernate.Search)

I'm using Fluent NHibernate and I would like to implement NHibernate.Search with Lucene but I can't find any examples on how to do that with Fluent NHibernate. It appears there are two steps. (According to Castle) Set the Hibernate properties in the configuration: hibernate.search.default.directory_provider hibernate.search.default.i...

Boost factor in MultiFieldQueryParser

Hi, can I boost different fields in MultiFieldQueryParser with different factors? Also, what is the maximum boost factor value I can assign to a field? Thanks a ton! Ed ...

Best full text search for mysql?

We're currently running MySQL on a LAMP stack and have been looking at implementing a more thorough, full-text search on our site. We've looked at MySQL's own freetext search, but it doesn't seem to cope well with large databases, which makes it far too slow for our needs. Our main requirements are: speed returning results simple upd...

Lucene boost: I need to make it work better

I'm using Lucene to index components with names and types. Some components are more important and thus get a bigger boost. However, I cannot get my boost to work properly. I still see some components appear later (with a worse score), even though they have a higher boost. Note that the indexing is done on one field only and I've set the boos...

Tips/recommendations for using Lucene

I'm working on a job portal using ASP.NET 3.5. I've used Lucene for the job and resume search functionality. I would like to know any tips/recommendations regarding Lucene performance optimization, scalability, etc. Thanks a ton! ...

Showing search documents count under each category

I need to show the total document count for each category in my search results, for example: Rock (1010), Blues (5030), Pop (2209), and so on. I was reading somewhere that using the TopFieldDocCollector class is more efficient than the HitCollector class. Given my requirement, how do I use the TopFieldDocCollector class? Or is there any other approach in Lucene? ...

Do you recommend SQL Server for storing and indexing files (PDF, Office, etc.)?

I need to store and index files, like PDF and Office files. Currently I'm using SQL Server 2008 to perform this task using Full Text Search with IFilters. My question is: Is this the "best" way? Should I switch, for instance, to Lucene for indexing? Thanks ...

Which are the best alternatives to Lucene?

-edit- The question does not say it all. :) It may run on Unix and it will be used for email searching (Dovecot, Postfix and maildir). Lucene is not a problem; I'm just analyzing some alternatives. ...

Make Lucene treat all terms in a field as a single term.

In my Lucene documents I have a field "company" where the company name is tokenized. I need the tokenization for a certain part of my application. But for this query, I need to be able to create a PrefixQuery over the whole company field. Example: My Brand my brand brahmin farm brahmin farm Regularly querying for "bra" would ret...

Better way to get all fieldnames from a Lucene index?

Currently I get all the fieldnames as follows: //Get all fieldnames string expectedFieldName = string.Empty; IndexReader r = IndexReader.Open(lucenePath); TermEnum te = r.Terms(); List<string> terms = new List<string>(); while (te.Next()) { terms.Add(te.Term().Field()); } terms = terms.Distinct<string>().ToList(); But it would be ...

PHP: Checking if a directory contains a Zend_Search_Lucene index

I am looking for a reliable way to check to see if a directory contains a Zend_Search_Lucene index. Currently, the only way I have managed to work this out is to check the contents of an exception returned to me using the following code: <?php try { $newIndex = Zend_Search_Lucene::open( $luceneDir ); } catch ( Zend_Search_Lucene_Exc...