With stackoverflow.com in perspective (a team of 2-3 engineers building a website project intended to scale), does it make sense to spend effort early in development building a search based on Lucene/Autonomy… as opposed to a database-based full-text search?

Pros/Cons:
With a mature Lucene implementation like Nutch or Autonomy, the cost of moving to Lucene (which is inevitable) at a later stage is negligible.
At large volumes, adding additional index servers (say with Nutch) to maintain the growing search index is relatively easy.
With a Lucene implementation I'll most likely need an additional server to maintain the in-memory index (much earlier in the process of scaling).

+3  A: 

Database full-text search performance varies from database to database, but it's by far the easiest option to set up. So start with that, and move to Lucene or Sphinx if it proves to be too slow.

Seun Osewa
If your DB's full-text search is good enough, use it (unless you have an exotic requirement such as DB independence).
alex
A: 

You should keep it isolated though - don't start throwing SELECTs all over your code if you know you will replace them with a search engine query. Wrap your DB's full-text search in a thin abstraction layer that makes sure you don't use database capabilities where you shouldn't.
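As a rough illustration of such a layer, here is a minimal sketch. Everything in it is an assumption made for the example: the names `SearchBackend`, `SqliteFullTextSearch`, `find_posts` and the `posts_fts` table are hypothetical, and SQLite's FTS5 extension merely stands in for whatever full-text feature your database offers (it also has to be compiled into your SQLite build).

```python
import sqlite3
from typing import List, Protocol


class SearchBackend(Protocol):
    """The only search interface the rest of the application should touch."""

    def index(self, doc_id: int, text: str) -> None: ...

    def search(self, query: str, limit: int = 10) -> List[int]: ...


class SqliteFullTextSearch:
    """Database-backed implementation (SQLite FTS5 used as a stand-in)."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn
        self._conn.execute(
            "CREATE VIRTUAL TABLE IF NOT EXISTS posts_fts USING fts5(body)"
        )

    def index(self, doc_id: int, text: str) -> None:
        # Delete-then-insert so that re-indexing a document replaces its text.
        self._conn.execute("DELETE FROM posts_fts WHERE rowid = ?", (doc_id,))
        self._conn.execute(
            "INSERT INTO posts_fts(rowid, body) VALUES (?, ?)", (doc_id, text)
        )

    def search(self, query: str, limit: int = 10) -> List[int]:
        # The query string is handed to FTS5 as-is; real code would need to
        # sanitise user input, since FTS5 has its own query syntax.
        rows = self._conn.execute(
            "SELECT rowid FROM posts_fts WHERE posts_fts MATCH ? "
            "ORDER BY rank LIMIT ?",
            (query, limit),
        )
        return [row[0] for row in rows]


def find_posts(backend: SearchBackend, terms: str) -> List[int]:
    """Application code depends on the interface, never on SQL."""
    return backend.search(terms, limit=20)
```

If the database search later proves too slow, a Lucene- or Sphinx-backed class exposing the same two methods could be passed to `find_posts` without touching the callers.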

I second the accepted answer though - premature optimization here is definitely evil.

ripper234