What is the best way to split an existing Lucene index into two halves, i.e. so that each half contains half of the documents in the original index?
+1
A:
A fairly robust mechanism is to use a checksum of the document, modulo the number of indexes, to decide which index it will go into.
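As a minimal sketch of that idea, assuming each document has some stable string key (the `docKey` parameter and the `ShardChooser` class are hypothetical names, not part of Lucene), the target index could be picked with a CRC32 checksum from the Java standard library. Note that this routes documents at indexing time, and the split is only approximately even, since it depends on how the checksums happen to distribute.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Hypothetical helper: chooses a target index for a document by checksum.
class ShardChooser {

    // Returns a value in [0, numIndexes) derived from the CRC32 of docKey.
    static int shardFor(String docKey, int numIndexes) {
        if (numIndexes <= 0) {
            throw new IllegalArgumentException("numIndexes must be positive");
        }
        CRC32 crc = new CRC32();
        crc.update(docKey.getBytes(StandardCharsets.UTF_8));
        // getValue() is an unsigned 32-bit value held in a long, so it is never negative.
        return (int) (crc.getValue() % numIndexes);
    }

    public static void main(String[] args) {
        String[] keys = {"doc-1", "doc-2", "doc-3", "doc-4"};
        for (String key : keys) {
            System.out.println(key + " -> index " + shardFor(key, 2));
        }
    }
}
```

The same key always maps to the same index, so updates and deletes for a document can be sent to the index that holds it.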
Marcelo Cantos
2010-05-19 13:42:56
+3
A:
The easiest way to split an existing index (without reindexing all the documents) is to:
- Make another copy of the existing index (e.g. cp -r myindex mycopy)
- Open the first index, and delete half the documents (doc IDs 0 up to, but not including, maxDoc / 2)
- Open the second index, and delete the other half (doc IDs maxDoc / 2 up to, but not including, maxDoc)
- Optimize both indices, so that the deleted documents are actually removed from disk
This is probably not the most efficient way, but it requires very little code.
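The range arithmetic in those steps can be sketched as follows. This is only an illustration in plain Java: `IndexSplitPlan` and `rangeToDelete` are hypothetical names, and the actual deletion would go through whatever delete-by-doc-ID call your Lucene version provides. Ranges are half-open, so the two halves never overlap and together cover every doc ID.

```java
// Hypothetical helper: computes which doc IDs to delete from each copy of the index.
class IndexSplitPlan {

    // Returns {from, to}: the half-open doc ID range [from, to) to delete
    // from the given copy (0 = original index, 1 = the copied index).
    static int[] rangeToDelete(int copy, int maxDoc) {
        if (copy != 0 && copy != 1) {
            throw new IllegalArgumentException("copy must be 0 or 1");
        }
        if (maxDoc < 0) {
            throw new IllegalArgumentException("maxDoc must not be negative");
        }
        int mid = maxDoc / 2; // integer division; with an odd maxDoc the halves differ by one
        return copy == 0 ? new int[] {0, mid} : new int[] {mid, maxDoc};
    }

    public static void main(String[] args) {
        int maxDoc = 11; // assumed value for illustration
        for (int copy = 0; copy < 2; copy++) {
            int[] r = rangeToDelete(copy, maxDoc);
            int kept = maxDoc - (r[1] - r[0]);
            System.out.println("copy " + copy + ": delete doc IDs [" + r[0] + ", " + r[1]
                    + "), keeping " + kept + " doc IDs");
        }
    }
}
```

Keep in mind that maxDoc also counts documents that were deleted earlier but not yet merged away, so the two halves hold exactly equal numbers of live documents only if the original index has no pending deletions.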
bajafresh4life
2010-05-20 13:51:54