We currently run SQL Server 2005 in production and use its full-text search for site search on an e-commerce site with a million-product database. I've optimized it as much as possible (I think) and we're still seeing search times of five seconds.

(We don't need site crawling or PDF (etc.) document indexing features... JUST "Google" speed for site search.)

I was going to buy dtSearch, but now I realize I can just use Lucene.Net and save the $2,500 for a two-server license.

I read on a post that Lucene.Net is not good for website searches.

Has anyone else used Lucene.Net from ASP.NET? Does it take a lot of memory?

Any problems?

Any comments?

A: 

We've been using Lucene for ages and it's worked really well for us. We do have databases with > 1M entries and Lucene queries return in a couple of milliseconds.

For us, we have a slight disadvantage in that new entries can be added to the database at any time, and switching between indexing and querying can be relatively slow (so the first search after updating the index takes maybe 400ms instead of the usual 5ms). But for a product website where you can do batch updates, you should be golden.

The other drawback of Lucene is that the index files can only be accessed by one process at a time. If you have multiple web servers, that means you need to run Lucene in a separate process. For us, we just have a service running on our database cluster (so it has failover if one fails) which our web servers connect to via a simple sockets interface to perform querying.
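
A minimal sketch of the kind of setup described above, where web servers talk to a separate search process over a plain socket. This is not the answerer's actual service: `ToyIndex`, `SearchHandler`, `start_search_service`, `query_service`, and the one-line-in, one-line-out protocol are all hypothetical, and the tiny in-memory index only stands in for a real Lucene index.

```python
import socket
import socketserver
import threading
from collections import defaultdict


class ToyIndex:
    """Hypothetical stand-in for a Lucene index: a tiny in-memory inverted index."""

    def __init__(self):
        self._postings = defaultdict(set)  # term -> set of document ids

    def add(self, doc_id, text):
        for term in text.lower().split():
            self._postings[term].add(doc_id)

    def search(self, query):
        # AND semantics: return ids of documents that contain every query term.
        terms = query.lower().split()
        if not terms:
            return []
        hits = set.intersection(*(self._postings.get(t, set()) for t in terms))
        return sorted(hits)


class SearchHandler(socketserver.StreamRequestHandler):
    """Assumed protocol: one query per line in, one line of space-separated ids out."""

    def handle(self):
        for raw in self.rfile:
            query = raw.decode("utf-8").strip()
            ids = self.server.index.search(query)
            self.wfile.write((" ".join(map(str, ids)) + "\n").encode("utf-8"))


def start_search_service(index, host="127.0.0.1", port=0):
    """Start the service on a background thread; port=0 lets the OS pick a free port."""
    server = socketserver.ThreadingTCPServer((host, port), SearchHandler)
    server.daemon_threads = True
    server.index = index  # the single process that owns the index
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def query_service(address, query):
    """What a web server would do: connect, send one query, read one reply line."""
    with socket.create_connection(address) as conn:
        conn.sendall((query + "\n").encode("utf-8"))
        with conn.makefile("r", encoding="utf-8") as reader:
            reply = reader.readline()
    return [int(doc_id) for doc_id in reply.split()]
```

Each web server would call something like `query_service(("search-host", 9000), "running shoes")` (host and port made up here) and map the returned ids back to product rows in the database.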

Dean Harding
Oh, one more thing. We actually use the Java Lucene, even though we're a .NET shop as well. We tried Lucene.Net, but it was several versions behind the Java version at the time. It's mostly caught up now, though, I believe.
Dean Harding
"The other drawback of Lucene is that the index files can only be accessed by one process at a time." This isn't true: you can have many applications / processes / threads reading at any time. You can even write data and still query it. But there should only be one writer at a time.
Andrew Smith
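
The rule in the comment above (many concurrent readers, one writer at a time) can be illustrated with a toy index. This is only a sketch of the access pattern, not how Lucene implements it internally; `SnapshotIndex` and its methods are hypothetical names. Readers work from an immutable snapshot without locking, while a lock serializes writers, who publish a new snapshot when a batch is done.

```python
import threading


class SnapshotIndex:
    """Hypothetical toy index: lock-free readers, writers serialized by a lock."""

    def __init__(self):
        # term -> frozenset of doc ids; a published snapshot is never mutated in place
        self._snapshot = {}
        # Enforces "only one writer at a time"
        self._write_lock = threading.Lock()

    def search(self, term):
        # Readers take no lock: they read whatever snapshot is currently published.
        return sorted(self._snapshot.get(term.lower(), frozenset()))

    def add_batch(self, docs):
        # docs is an iterable of (doc_id, text) pairs.
        with self._write_lock:
            updated = dict(self._snapshot)  # copy, so readers keep the old view
            for doc_id, text in docs:
                for term in text.lower().split():
                    updated[term] = updated.get(term, frozenset()) | {doc_id}
            self._snapshot = updated  # publish the new view in a single assignment
```

Searches that start while `add_batch` is running simply see the previous snapshot, which matches the "you can even write data and still query it" point. Copying the whole dictionary on every batch is only reasonable for a toy.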
A: 

Another option is Solr, which is based on Lucene, so it's also very fast but easier to set up and use. However, it runs as a separate Java process.

Mauricio Scheffer