Consider the following assumptions:
- I have a Java 5.0 web application for which I'm considering Lucene 3.0 for full-text search.
- There will be more than 1000K Lucene documents, each with about 100 words on average.
- New documents must be searchable immediately after they are created (real-time search).
- Lucene documents have a frequently updated integer field named quality.
Where can I find code examples (simple but as complete as possible) of near-real-time search with Lucene 3.0?
Is it possible to obtain query results sorted by a document field (quality) that may be updated frequently for already indexed documents? Does updating such a field force a rebuild of the Lucene index? If so, how does that rebuild perform, and how can it be done efficiently? I need examples or documentation of a complete solution.
If, however, index rebuilding is not required in this case, how can I sort search results efficiently? Some queries may return a lot of documents (>50K), so I consider it inefficient to fetch them unsorted from Lucene, sort them by the quality field, and then split the sorted list into pages for pagination.
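To make the approach I would like to avoid concrete, here is a minimal sketch in plain Java (no Lucene calls). The `NaivePagination` and `Hit` names are hypothetical and used only for illustration, and I am assuming that results should be ordered with the highest quality first. Every page request sorts the complete hit list before a single page is cut out of it:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class NaivePagination {

    /** Hypothetical stand-in for a search hit: a document id plus its current quality. */
    public static class Hit {
        public final int docId;
        public final int quality;

        public Hit(int docId, int quality) {
            this.docId = docId;
            this.quality = quality;
        }
    }

    /**
     * Sorts ALL hits by quality (descending, an assumption) and returns one page.
     * pageIndex is zero-based; an out-of-range page yields an empty list.
     */
    public static List<Hit> page(List<Hit> unsortedHits, int pageIndex, int pageSize) {
        // Copy so the caller's list is left untouched.
        List<Hit> sorted = new ArrayList<Hit>(unsortedHits);

        // O(n log n) on the full result set, repeated for every page request.
        Collections.sort(sorted, new Comparator<Hit>() {
            public int compare(Hit a, Hit b) {
                return a.quality < b.quality ? 1 : (a.quality > b.quality ? -1 : 0);
            }
        });

        // Clamp the window to the available range.
        int from = Math.min(pageIndex * pageSize, sorted.size());
        int to = Math.min(from + pageSize, sorted.size());
        return new ArrayList<Hit>(sorted.subList(from, to));
    }
}
```

With more than 50K hits, this does the full sort for each page, which is the cost I am hoping Lucene can avoid by doing the sorting itself.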
Is Lucene 3.0 my best choice within Java, or should I consider other frameworks or solutions? Perhaps the full-text search provided by the database server itself (I'm using PostgreSQL 8.3)?