In one of our commercial applications (Win32, written in Delphi) we'd like to implement full-text search. The application stores user data in a binary format that is not directly recognizable as text.

Ideally, I'd like to find either an in-process solution (a DLL would be OK) or, preferably, a local server that I could access via TCP. The API should allow me to submit textual information to the server (along with metadata identifying the binary blob it came from) and, of course, to run a full-text search with at least minimal support for logical operators and substring searching. Unicode support is required.

I found an extensive list of search engines on Stack Overflow (What are some Search Servers out there?), but I don't really understand which of those engines could satisfy my needs. I thought I'd ask The Collective for an opinion before I spend a day or two testing each of them.

Any suggestions?

A: 

I'd recommend having a look at SQLite -- full-text search is included in the latest version.
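To make the idea concrete, here is a minimal sketch of how text plus a blob identifier could be indexed and searched with SQLite full-text search. It is written in Python rather than Delphi purely for brevity, and it assumes a SQLite build that includes the FTS5 module (an assumption; the thread itself only mentions FTS3). The table name `docs`, the columns `blob_id` and `body`, and the sample rows are all hypothetical.

```python
import sqlite3


def build_index(rows):
    """Create an in-memory FTS5 index from (blob_id, text) pairs."""
    conn = sqlite3.connect(":memory:")
    # blob_id is stored alongside the text but not tokenized (UNINDEXED);
    # body is the extracted text that gets indexed.
    conn.execute("CREATE VIRTUAL TABLE docs USING fts5(blob_id UNINDEXED, body)")
    conn.executemany("INSERT INTO docs (blob_id, body) VALUES (?, ?)", rows)
    conn.commit()
    return conn


def search(conn, query):
    """Return the blob ids whose text matches an FTS5 query string."""
    cur = conn.execute(
        "SELECT blob_id FROM docs WHERE docs MATCH ? ORDER BY rank", (query,)
    )
    return [row[0] for row in cur]


# Hypothetical sample data: text extracted from three binary blobs.
conn = build_index([
    ("blob-1", "Invoice for Müller GmbH, paid in full"),
    ("blob-2", "Invoice for Smith Ltd, payment overdue"),
    ("blob-3", "Meeting notes about the search feature"),
])

both_terms = search(conn, "invoice AND overdue")    # logical AND
without_term = search(conn, "invoice NOT overdue")  # logical NOT
non_ascii = search(conn, "müller")                  # non-ASCII query term
by_prefix = search(conn, "pay*")                    # prefix query
```

Note that in this sketch the index matches whole tokens and token prefixes, not arbitrary substrings, so the substring requirement would need separate consideration.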

Allen
A: 

I suppose the answer depends on your db. For example, SQL Server has full-text search and also English Language Queries, if ever needed.

Thomas Wagner
A: 

Take a look at using PostgreSQL and tsearch.

A: 

Try using PostgreSQL with tsearch.

A: 

There are a number of options on the market, both fully fledged commercial products and open source variants. Your choice of search provider depends heavily on the customers you are targeting.

Microsoft has a free Express version of their Search Server. As far as I know the Express edition is limited to running the Application Tier on one server.

There is also the Apache Lucene project, which is open source. It has a nice API that's easy to use and a large community of users. The original project is written in Java, but there are also other implementations, such as NLucene for .NET, which I have used personally.

Kimoz
A: 

Sphinx is probably the most efficient and scalable option, while SQLite FTS3 is the most straightforward.

Seun Osewa
A: 

While not in-process, Solr is very fast (based on Lucene) and easily accessible from any platform (HTTP).
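As a rough illustration of what "accessible over HTTP" means in practice, the sketch below only builds the URL for a query against Solr's select handler; it does not contact a server. The host and port, the core name `userdata`, and the field name `body` are assumptions made up for this example, and Python is used just to keep it short.

```python
from urllib.parse import urlencode


def solr_select_url(base_url, core, query, rows=10):
    """Build a GET URL for Solr's /select handler with a JSON response."""
    params = {"q": query, "rows": rows, "wt": "json"}
    return f"{base_url}/solr/{core}/select?{urlencode(params)}"


# Hypothetical core and field names; the boolean operator is part of the query.
url = solr_select_url(
    "http://localhost:8983", "userdata", "body:(invoice AND overdue)"
)
```

Any client that can issue an HTTP GET, including a Delphi application, could fetch such a URL and parse the JSON response.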

Mauricio Scheffer