Afternoon chaps,

After my adventures with Zend_Search_Lucene, and discovering it isn't all it's cracked up to be when indexing large datasets, I've turned to Solr (thanks to Bill Karwin for that :) ).

I've got Solr indexing the db far, far quicker now, taking just over 8 minutes to index a table of just over 1.7 million rows, which I'm very pleased with.

However, when I try to search the index with the Zend port, I run into the following error:

Fatal error: Uncaught exception 'Zend_Search_Lucene_Exception' with message 'Unsupported segments file format' in /var/www/Zend/Search/Lucene.php:407
Stack trace:
#0 /var/www/Zend/Search/Lucene.php(555): Zend_Search_Lucene->_readSegmentsFile()
#1 /var/www/z_search.php(12): Zend_Search_Lucene->__construct('tmp/feeds_index')
#2 {main}
thrown in /var/www/Zend/Search/Lucene.php on line 407

I've had a search around but can't find anything about this problem; everyone else just seems to be able to get it to work.

Any help as always much appreciated :)

Thanks,

Tom

+1  A: 

Never used Zend before, but I've used Lucene/Solr.

Are you using the same version of Lucene for both the Solr indexing and the Zend port? Check to see what Lucene jar file is being used for each. If they're different, then Solr might be producing a Lucene index that isn't compatible with the Zend port.

bajafresh4life
Chances are that Solr's index version is more advanced than Zend's. You may want to consider going an extra step: using Solr for search as well, and communicating with it from PHP over an HTTP interface that returns XML or JSON.
Yuval F
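For anyone weighing up the HTTP route suggested above, here is a minimal sketch of what talking to Solr over HTTP with JSON might look like. It is written in Python purely for illustration; the same request can be made from PHP. The host, port and `/solr/select` path are assumptions based on a default local install, and the `title` field in the usage note is hypothetical, so adjust them to your own setup and schema.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: a default local Solr install; adjust host, port and path to match yours.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def build_query_url(query, rows=10, base_url=SOLR_SELECT_URL):
    """Build a Solr select URL that asks for a JSON response (wt=json)."""
    params = urlencode({"q": query, "rows": rows, "wt": "json"})
    return f"{base_url}?{params}"


def parse_response(body):
    """Return (total hit count, list of documents) from a Solr JSON response body."""
    response = json.loads(body)["response"]
    return response["numFound"], response["docs"]


def search(query, rows=10):
    """Run the query against Solr over HTTP and return (numFound, docs)."""
    with urlopen(build_query_url(query, rows)) as resp:
        return parse_response(resp.read().decode("utf-8"))
```

Calling `search("title:solr", rows=5)` would then return the hit count and the matching documents, provided a Solr instance is actually listening at that address. Because the index never leaves Solr, the index format mismatch with the Zend port stops mattering.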
We had considered that; the only problem is that we're unsure whether we can run Jetty/Tomcat on our live server. The plan was to index the db locally, then upload the index every x days. I'll look into the Lucene versions for both Zend and Solr, and make sure they're singing from the same hymn sheet.
thebluefox
+2  A: 

I confirmed on my machine that a Lucene index created through Solr cannot be read by Zend_Search_Lucene.

Zend_Search_Lucene throws that exception when it detects a Lucene index format that it doesn't support. Looking at the code, Zend currently supports formats pre-2.1, 2.1, and 2.3.

Solr creates an index in the FORMAT_HAS_PROX format, which, as far as I can tell, is used by Lucene 2.9 and higher.
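If you want to check which format a given index was written in, the sketch below may help. It is an illustration rather than the test described above, and it is written in Python rather than PHP. It assumes the Lucene 2.x-era on-disk layout, where the index directory holds one or more `segments_N` files (N being a base-36 generation number) and the file begins with a 4-byte big-endian format marker. The function names are made up for this example.

```python
import os
import struct


def latest_segments_file(index_dir):
    """Return the path of the segments_N file with the highest generation.

    N is the generation number written in base 36.
    """
    candidates = [
        name for name in os.listdir(index_dir)
        if name.startswith("segments_")
    ]
    if not candidates:
        raise FileNotFoundError(f"no segments_N file found in {index_dir}")
    newest = max(candidates, key=lambda name: int(name[len("segments_"):], 36))
    return os.path.join(index_dir, newest)


def read_segments_format(index_dir):
    """Read the leading 4-byte big-endian signed integer of the segments file."""
    with open(latest_segments_file(index_dir), "rb") as fh:
        (format_marker,) = struct.unpack(">i", fh.read(4))
    return format_marker
```

Running `read_segments_format("tmp/feeds_index")` against an index built by Solr and against one built by Zend_Search_Lucene should show two different markers if the formats differ. In the Lucene 2.x file format the marker is a negative number that decreases as the format gets newer, so the Solr-built index would be expected to report the more negative value.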

Bill Karwin
Ahhh, rubbish. I presume there's no workaround, then? I'm guessing there'll be no updated Zend code out anytime soon either. Looks like I'll be pushing for that Jetty/Tomcat server.
thebluefox
I think once you get it running, you're bound to be happier with it. The only other suggestion I have is to try to see if you can force Solr to create the Lucene index in 2.3 format. But I don't know how one could do that.
Bill Karwin