I've been using Nutch for a while, and only recently found out about this option.
How is its performance, and what is the file size limit it can support?
Also, how can I delete or update an entry in the index instead of re-indexing every time something is modified?
Zend_Search_Lucene is a pure PHP implementation of the Apache Lucene index format. Starting from ZF 1.6, the supported Lucene index format versions are 1.4 - 2.3. For more information on Lucene, visit http://lucene.apache.org/java/docs/.
As for size limits, the index is limited to 2GB on 32-bit platforms and, as far as I know, is not limited on 64-bit platforms.
Performance depends largely on how you build your indexes. Make sure to check the section of the manual that deals with performance.
Also, Luke (a diagnostic tool for Lucene indexes) comes in really handy for performance optimization and troubleshooting.
P.S. With regard to updating, the Lucene index file format doesn't support updating a document in place. To update a document, remove it from the index and add the new version. This is true for the Java implementation as well.
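The delete-then-re-add approach could look roughly like the sketch below. It assumes each document was indexed with a unique keyword field; the field names `doc_key` and `contents`, the function name `updateDocument`, and the index path are made up for illustration, so adjust them to match how your index is actually built.

```php
<?php
require_once 'Zend/Search/Lucene.php';

/**
 * Replace a document by deleting the old copy and adding a new one.
 * 'doc_key' and 'contents' are hypothetical field names.
 */
function updateDocument(Zend_Search_Lucene_Interface $index, $docKey, $newText)
{
    // Look up the internal ids of every document carrying this key.
    $term = new Zend_Search_Lucene_Index_Term($docKey, 'doc_key');
    foreach ($index->termDocs($term) as $id) {
        // Marks the document as deleted; it no longer shows up in searches.
        $index->delete($id);
    }

    // Build the new version of the document and add it.
    $doc = new Zend_Search_Lucene_Document();
    // Keyword fields are stored and indexed but not tokenized,
    // which makes them suitable as unique identifiers.
    $doc->addField(Zend_Search_Lucene_Field::keyword('doc_key', $docKey));
    // UnStored fields are tokenized and indexed but not stored in the index.
    $doc->addField(Zend_Search_Lucene_Field::unStored('contents', $newText));
    $index->addDocument($doc);

    // Write the pending changes to disk.
    $index->commit();
}

// Example usage; the path is a placeholder.
$index = Zend_Search_Lucene::open('/path/to/index');
updateDocument($index, '/articles/42', 'Updated article text ...');
```

This only touches the one document that changed, so there is no need to rebuild the whole index. Deleted documents are only marked as deleted; as far as I know, the space is reclaimed when segments are merged, for example by calling `$index->optimize()`.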