Hi all..

I know that searching in itself is a huge field, which also seems to be my problem. I just don't know where to start.

Here's the scenario: I have a lot of shops, which employ a lot of people and have a lot of different products for sale. I need to create a single-text-field search option for my site. This text field should search through the metadata of the shops (address, name, etc.), the metadata of the products (name, type, etc.), and the employees.

I reckon that this search option will be widely used on the site, so it needs to perform very well (no lookups in each column, string.Contains(), etc.).

Currently I'm thinking along the lines of having tags for each entity in the database, which point to the shop / employee / product, but I'm unsure whether this kind of data duplication would be easy to maintain. Another option would be to cache popular queries; however, I don't think any queries will be much more popular than others.

I'm sure I'm not the first to have this kind of problem. Are there any common strategies, third-party libs, third-party sites, or just hints for me for doing this?

Thanks in advance :)

+3  A: 

Have you looked at Lucene? It's open-source and the Java version is mature. It could greatly help you index and search your organized data. I believe there are versions for .NET, C (with Perl/Ruby bindings), and Python.

Regarding documentation: for the Java version, at least, there are some books out there, including this one.

I have had good experiences using it to index relational data stored in a database so that it can be searched in an ad hoc manner. However, it can also crawl and index file-based data.
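
To make the idea concrete, here is a minimal sketch of the kind of inverted index a full-text engine builds over relational rows. It is not Lucene's API; the `InvertedIndex` class, the `tokenize` helper, and the sample shop / product / employee values are all made up for illustration, and a real engine adds stemming, ranking, and on-disk storage on top of this.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


class InvertedIndex:
    """Toy inverted index (hypothetical, for illustration only)."""

    def __init__(self):
        # term -> set of (entity_type, entity_id) keys whose fields contain it
        self.postings = {}

    def add(self, entity_type, entity_id, *fields):
        """Index every term of every field under one entity key."""
        for field in fields:
            for term in tokenize(field):
                self.postings.setdefault(term, set()).add((entity_type, entity_id))

    def search(self, query):
        """Return the keys of entities that contain every term in the query."""
        terms = tokenize(query)
        if not terms:
            return set()
        # Start from a copy so callers cannot mutate the stored postings.
        result = set(self.postings.get(terms[0], ()))
        for term in terms[1:]:
            result &= self.postings.get(term, set())
        return result


index = InvertedIndex()
index.add("shop", 1, "Corner Bakery", "12 Main Street")
index.add("product", 7, "Rye bread", "bakery goods")
index.add("employee", 3, "Anna Baker")

print(index.search("bakery"))       # matches the shop and the product
print(index.search("main street"))  # matches only the shop
```

A query becomes a few dictionary lookups and set intersections instead of a scan over every column, which is why this structure copes with a heavily used search box. The cost is that the index has to be updated whenever a shop, product, or employee row changes.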

teabot
Looks really interesting :O Documentation seems a bit poor though, but I'll definitely look into it :)
cwap
Lucene is pretty powerful, but yes, the documentation is a bit shoddy. There should be a pretty big community following for it though, as it's used in many applications.
Brian
+1  A: 

You are really looking for a full-text search engine. Lucene, as recommended by teabot, is a great tool for this.

A project based on Lucene (and for sale, of course) is DBSight, which provides tools and interfaces for administering the look/format of search results, as well as setting up the indexes in Lucene based on the results of database queries you define yourself.
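
As a rough illustration of "indexes based on the results of database queries" (this is not DBSight's or Lucene's actual interface), the sketch below runs one query per entity type and flattens each row into a searchable document. The table names, column names, `INDEX_QUERIES`, and `load_documents` are all assumptions invented for the example, using Python's built-in `sqlite3` module.

```python
import sqlite3

# Hypothetical schema: each query must return (id, searchable text) rows.
INDEX_QUERIES = {
    "shop": "SELECT id, name || ' ' || address FROM shops",
    "product": "SELECT id, name || ' ' || product_type FROM products",
    "employee": "SELECT id, name FROM employees",
}


def load_documents(conn, queries=INDEX_QUERIES):
    """Run each query and yield (entity_type, entity_id, text) documents."""
    for entity_type, sql in queries.items():
        for entity_id, text in conn.execute(sql):
            yield entity_type, entity_id, text


# Small in-memory database so the sketch can be run as-is.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE shops (id INTEGER PRIMARY KEY, name TEXT, address TEXT);
    CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, product_type TEXT);
    CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO shops VALUES (1, 'Corner Bakery', '12 Main Street');
    INSERT INTO products VALUES (7, 'Rye bread', 'bakery goods');
    INSERT INTO employees VALUES (3, 'Anna Baker');
""")

for document in load_documents(conn):
    print(document)
```

Each yielded tuple is what you would hand to the search engine's indexer. Keeping the queries in one place means the searchable fields can change without touching the indexing code.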

I feel like a shill now. But I've tried it and it works quite well.

Jeff Meatball Yang