Hello,

I am using Sphinx as the search engine on my website. It works perfectly and I have no complaints about it. The only thing it lacks is that it does not let me run queries longer than 15 words. I know that in reality people rarely search with more than 3-4 words, but I want to use it to find duplicate content.

I was wondering whether there is an alternative to Sphinx. I want to deal with duplicate content.

My main articles table is InnoDB, but I also cache the articles in a MyISAM table for full-text searching. When I search for an article, a single search takes ages. It is not a problem with the query; I think MySQL's full-text search is simply not up to the job.

Thanks Jason

+1  A: 

Apache Solr is an alternative. It's based on Apache's Lucene project...

You might want to check out Lucene as well.

And since you're using MySQL, check its full-text search: MySQL Full Text Searching

sobedai
@stereofrog you are right. I had the old PHP API, which is why it was not letting me use the full query. Thanks.
Jason
A: 

Check Zend_Search_Lucene as well: http://framework.zend.com/manual/en/zend.search.lucene.html

It's slower than Sphinx, though.

Richard Knop
A: 

Perhaps not helpful, but could you simply add a unique index to the MySQL field to prevent insertion of duplicates?
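
A minimal sketch of that idea, with several assumptions: it uses Python's built-in sqlite3 as a stand-in for MySQL, the table and column names (articles, body, body_hash) are made up for illustration, and the unique index sits on a hash of the normalized text rather than on the text itself, since MySQL needs a prefix length to index long TEXT columns. It only catches exact duplicates after case and whitespace normalization, not reworded copies.

```python
import hashlib
import sqlite3


def content_hash(text: str) -> str:
    # Normalize case and whitespace so trivial formatting changes hash the same.
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def insert_article(conn: sqlite3.Connection, body: str) -> bool:
    """Insert an article; return False if an identical one already exists."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO articles (body, body_hash) VALUES (?, ?)",
                (body, content_hash(body)),
            )
        return True
    except sqlite3.IntegrityError:
        # The unique index rejected a duplicate hash.
        return False


# Hypothetical schema, in-memory database for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE articles ("
    "id INTEGER PRIMARY KEY, body TEXT NOT NULL, body_hash TEXT NOT NULL)"
)
conn.execute("CREATE UNIQUE INDEX idx_articles_body_hash ON articles (body_hash)")
```

Calling insert_article(conn, "Hello  World") and then insert_article(conn, "hello world") would store the first and reject the second, because both normalize to the same text.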

I have not come across any query length limitations in the Sphinx version I'm using (0.9.9), but maybe I have not tried hard enough.

BMA
I am trying to find plagiarized content, so adding a unique field is not a good option. I can now run full queries in Sphinx, but now it keeps crashing :(
Jason