views:

368

answers:

5

In my project, all searching and listing of content depends on Lucene. I am not facing any performance issues so far, but the project is still in development and a long way from production.

I want to find any performance issues before the project grows large. Is such heavy use of Lucene feasible?

A: 

What would you define as excessive?

If your application has a solid design, and the performance is good, I wouldn't worry too much about it.

Perhaps you could get a data dump to test the performance in a live scenario.
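One way to run such a test is to replay a list of representative queries against the index and look at the latency distribution rather than a single number. The sketch below is only an illustration under assumptions: `search` is a hypothetical callable standing in for whatever function wraps your Lucene search, and `queries` is a sample you would take from the data dump or from logs.

```python
import statistics
import time


def measure_latency(search, queries, repeats=5):
    """Run every query `repeats` times and return all latencies in ms.

    `search` is a hypothetical callable wrapping your real search code.
    """
    latencies_ms = []
    for query in queries:
        for _ in range(repeats):
            start = time.perf_counter()
            search(query)
            latencies_ms.append((time.perf_counter() - start) * 1000.0)
    return latencies_ms


def summarize(latencies_ms):
    """Return count, median, nearest-rank 95th percentile, and maximum."""
    ordered = sorted(latencies_ms)
    n = len(ordered)
    # Nearest-rank percentile with integer arithmetic: ceil(0.95 * n) - 1
    p95_index = max(0, (95 * n + 99) // 100 - 1)
    return {
        "count": n,
        "median_ms": statistics.median(ordered),
        "p95_ms": ordered[p95_index],
        "max_ms": ordered[-1],
    }
```

Repeating the run as the test index grows shows whether the median and the tail latency stay flat or start to climb.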

Conrad
+4  A: 

As an example, I have about 3 GB of text in a Lucene index, and it functions very quickly (millisecond response times on searches, filters, and sorts). This index contains about 300,000 documents.

Hope that gave some context to your concerns. This is in a production environment.

Jeff Meatball Yang
+2  A: 

Lucene is very mature and has very good performance for what it was designed to do. However, it is not an RDBMS. The amount of fine-tuning you can do to improve performance is more limited than with a database engine.

You shouldn't rely only on Lucene if:

  • You need frequent updates
  • You need to do joined queries
  • You need sophisticated backup solutions

I would say that if your project is large enough to hire a DBA, you should use one...

Performance-wise, I am seeing acceptable performance on a 400GB index across 10 servers. A single (4GB, 2CPU) server can handle 40GB of Lucene index, but no more. YMMV.
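The arithmetic behind those figures is simple enough to script when planning capacity. This is a back-of-the-envelope sketch only: the 40GB-per-server default is just the number quoted above for that particular hardware, and `servers_needed` is a made-up helper name, so measure your own limit before relying on it.

```python
import math


def servers_needed(index_gb, per_server_gb=40):
    """Estimate how many servers an index of `index_gb` gigabytes needs.

    The per-server default is the figure quoted in this answer, not a
    general Lucene limit.
    """
    if per_server_gb <= 0:
        raise ValueError("per_server_gb must be positive")
    return math.ceil(index_gb / per_server_gb)
```

With the numbers above, `servers_needed(400)` gives 10, matching the setup described.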

itsadok
What do you consider "acceptable performance"?
Avi
+1  A: 

By excessive, do you mean extensive/exclusive?

Lucene's performance is generally very good. I recently ran some performance tests for Lucene on my desktop, a quad-core machine at 2.4 GHz.

I ran various search queries against a disk index composed of 10 million documents, and the slowest query (MatchAllDocs) returned results within 1500 ms. Search queries with two or more search terms returned in around 100 ms.

There are tons of performance tweaks you can do for Lucene, and they can significantly increase your search speed.

Cambium
A: 

We use Lucene to enable type-ahead searching. This means that for every letter typed, it hits the Lucene index to get the results. Multiply that by tens of textboxes on multiple interfaces and tens of employees typing, and we still have no complaints and extremely fast response times. (Actually, it works faster than any other type-ahead solution we tried.)
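As a rough illustration of why a lookup per keystroke can stay cheap, prefix matching over a sorted list of terms is a binary search followed by a short scan. The sketch below is plain Python, not Lucene's API, and `prefix_matches` is a made-up helper; Lucene's own term dictionary and prefix queries are considerably more sophisticated.

```python
import bisect


def prefix_matches(sorted_terms, prefix, limit=10):
    """Return up to `limit` terms from `sorted_terms` starting with `prefix`.

    `sorted_terms` must already be sorted; this is a conceptual stand-in
    for a search index's term dictionary.
    """
    # Binary search for the first term that is >= prefix.
    start = bisect.bisect_left(sorted_terms, prefix)
    matches = []
    for term in sorted_terms[start:]:
        # Terms sharing a prefix are contiguous in sorted order.
        if not term.startswith(prefix) or len(matches) >= limit:
            break
        matches.append(term)
    return matches
```

Capping the result count with `limit` keeps each keystroke's work small even when a one-letter prefix matches many terms.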

Yasir Laghari