Hi all,

We have an eMall application built mainly around a MySQL master table of ~500k rows (with detail tables storing non-searchable fields, and other related tables with shop info etc.).

Today, users can search on specific structured product data (e.g. brand, category, price, specific shop, etc.).

We would also like to support keyword search in combination with the structured data.

We also want to improve the performance of our application and are considering our infrastructure options to achieve both the functional requirement of keyword search and the technical requirement of improved speed:

Lucene, Sphinx, etc. to index all products? A NoSQL DB (Mongo, Couch, etc.) used as an intermediate cache layer in front of MySQL? A NoSQL DB to replace MySQL?

A combination of the above?

In the case of Lucene and Sphinx, how flexible are they in terms of combining structured criteria with a text search? Or would we need to first run a text search and then filter the results with a second structured query on MySQL?

Any hints or lessons learned from your own experiences would be more than welcome!

thanks in advance

+3  A: 

I suggest you use Solr. It is based on Lucene and supports keyword search, and you can use facets and filters for your structured product data. 500K items is a size Solr can handle rather easily. It can be considered a NoSQL DB, and it is easier to use than pure Lucene. You can go over the relevant considerations in Full Text Search Engine versus DBMS.
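To give a rough idea of how keyword search and structured filters combine in one Solr request, here is a minimal sketch that only builds the request URL. The `q`, `fq`, `facet` and `facet.field` parameters are standard Solr query parameters; the core name `products`, the field names and the helper `build_solr_query` are assumptions made up for illustration, not anything from the question.

```python
from urllib.parse import urlencode


def build_solr_query(base_url, keywords, filters, facet_fields=(), rows=20):
    """Build a Solr /select URL mixing a keyword query with structured filters."""
    params = [("q", keywords), ("rows", str(rows)), ("wt", "json")]
    # Each fq (filter query) restricts the result set without changing
    # the relevance score computed for the keyword query in q.
    for field, value in filters.items():
        params.append(("fq", f"{field}:{value}"))
    if facet_fields:
        # Ask Solr to return per-value counts for the structured fields.
        params.append(("facet", "true"))
        for field in facet_fields:
            params.append(("facet.field", field))
    return f"{base_url}/select?{urlencode(params)}"


# Hypothetical core and field names, for illustration only.
url = build_solr_query(
    "http://localhost:8983/solr/products",
    "wireless headphones",
    {"brand": "Sony", "price": "[50 TO 200]"},
    facet_fields=["category", "shop_id"],
)
print(url)
```

The point is that the text search and the structured criteria go out in a single request, so no second query against MySQL is needed just for filtering. Filter values containing spaces or special characters would need quoting or escaping, which this sketch leaves out.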

Yuval F
See this link to learn more about Solr's capabilities: http://www.ibm.com/developerworks/java/library/j-solr1/ and this one for the Lucene query language: http://lucene.apache.org/java/3_0_2/queryparsersyntax.html
Skarab
+1 for Solr. You can get all that you want done and then some. My only suggestion: keep the moving parts to a minimum. Don't add NoSQL-ish things just because they are the "in" thing. FWIW, Solr should more than suffice.
Mikos
Many thanks for your response. I guess we will have to experiment with Solr and Sphinx. We will give Sphinx a try first, since it seems easier to integrate with what we already have, but Solr might also be a suitable solution.
webgr
+1  A: 

I have been using Sphinx for full-text search with requirements similar to yours (searching on free text and structured attributes), with a few GB of data and 5M rows in MySQL. I am very satisfied with the performance and reliability (not a single outage so far).

The advantage of Sphinx is that it is designed to work with MySQL, so it is really easy to set up. Normally you can have the whole system ready in less than an hour, so why not give it a try?
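As a rough illustration of mixing free text with structured attributes, here is a minimal sketch that builds a SphinxQL query (Sphinx's SQL-like dialect, which `searchd` can serve over the MySQL wire protocol). The index name `products_idx`, the attribute names and the helper `build_sphinxql` are hypothetical, and it assumes a MySQL client library that substitutes `%s` placeholders on the client side.

```python
def build_sphinxql(index, keywords, filters, price_range=None, limit=20):
    """Return (sql, params) for a SphinxQL query using %s placeholders."""
    # Identifiers cannot be bound as parameters, so accept only simple names.
    for name in [index, *filters]:
        if not name.replace("_", "").isalnum():
            raise ValueError(f"unsafe identifier: {name!r}")
    # MATCH() carries the full-text part; the rest filter on attributes.
    clauses = ["MATCH(%s)"]
    params = [keywords]
    for attr, value in filters.items():
        clauses.append(f"{attr} = %s")
        params.append(value)
    if price_range is not None:
        # "price" is an assumed numeric attribute of the index.
        clauses.append("price BETWEEN %s AND %s")
        params.extend(price_range)
    sql = f"SELECT id FROM {index} WHERE {' AND '.join(clauses)} LIMIT {int(limit)}"
    return sql, params


# Hypothetical index and attribute names, for illustration only.
sql, params = build_sphinxql(
    "products_idx",
    "wireless headphones",
    {"brand_id": 3, "shop_id": 17},
    price_range=(50, 200),
)
print(sql)
print(params)
```

The returned ids can then be used to fetch the full rows from MySQL by primary key, so the text search and the structured filtering happen in one Sphinx query rather than in two passes.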

tszming