+3  Q: 

Sphinx PHP search

I'm doing a Sphinx search but turning up some really weird results. Any help is appreciated.

So for example if I type "50", I get:

  • 50 Cent
  • 50 Lions
  • 50 Foot Wave, etc.

This is great, but when I search "50 Ce", I get:

  • Ryczące Dwudziestki
  • Spisek
  • Bernhard Gal
  • Cowabunga Go-Go

And other crazy results. Also when I search for "50 Cent", the correct result is at the top, but then random results below. Any ideas why?

PHP code:

$query = $_GET['query'];

if (!empty($query))
{
 $sphinx->SetMatchMode(SPH_MATCH_ALL);
 $sphinx->AddQuery($query, 'artists');
 $sphinx->AddQuery($query, 'variations');

 $sphinx->SetFilter('name', array(3));

 $sphinx->SetLimits(0, 10);

 $result = $sphinx->RunQueries();

 echo '<pre>';

 switch ($result)
 {
  case false:
   echo 'Query failed: ' . $sphinx->GetLastError() . "\n";
   break;
  default:
   if ($sphinx->GetLastWarning())
   {
    echo 'WARNING: ' . $sphinx->GetLastWarning() . "\n";
   }

   if (is_array($result[0]['matches']) && count($result[0]['matches']))
   {
    foreach ($result[0]['matches'] as $value => $info)
    {
     $artist = artistDetails($value);
     echo $artist['name'] . "\n";
    }
   }
 }
}

Sphinx Index and Source:

source artists
{
 type     = mysql

 sql_host    = localhost
 sql_user    = user
 sql_pass    = pass
 sql_db     = db
 sql_port    = 3300

 sql_query    = \
  SELECT \
    id, name \
  FROM artists;

 #UNIX_TIMESTAMP(time)
 #sql_attr_uint   = group_id
 #sql_attr_timestamp  = time

 sql_query_info   = SELECT id,name FROM artists WHERE id=$id
}

index artists
{
 source     = artists
 path     = /var/sphinx/artists
 docinfo     = extern
 charset_type   = utf-8
}
+5  A: 

You need to use the min_prefix_len index config option to tell Sphinx to index and match on partial words. By default Sphinx indexes whole words only, so a query term like "Ce" will not match "Cent". You'll probably also need to set enable_star to 1.

http://www.sphinxsearch.com/docs/current.html#conf-min-prefix-len

index artists
{
 source     = artists
 path     = /var/sphinx/artists
 docinfo     = extern
 charset_type   = utf-8
 min_prefix_len   = 2
 enable_star   = 1
}

After enabling prefix indexing and rebuilding the index, you'll be able to search for things like "50 Ce*" to get partial-word matches. If you want partial-word matches without requiring your users to add the * themselves, you'll probably have to modify the search string programmatically before passing it to Sphinx.
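
One way to add the * programmatically is sketched below. This is only an illustration, not part of the Sphinx API: the helper name addPrefixWildcards and its $minPrefixLen parameter are made up for this example, and it assumes min_prefix_len = 2 and enable_star = 1 as in the config above. It treats anything that is not a letter or digit as a word separator, which may or may not match your charset_table.

```php
<?php
// Hypothetical helper (not a Sphinx function): appends "*" to each search
// term so that Sphinx treats the term as a prefix.
function addPrefixWildcards($query, $minPrefixLen = 2)
{
    // Replace everything except letters and digits with a space, so that
    // user-typed operators or punctuation do not reach Sphinx.
    $clean = preg_replace('/[^\p{L}\p{N}]+/u', ' ', $query);
    if ($clean === null) {
        return ''; // invalid UTF-8 input
    }

    // Split into terms, dropping empty pieces.
    $terms = preg_split('/\s+/u', trim($clean), -1, PREG_SPLIT_NO_EMPTY);

    $out = array();
    foreach ($terms as $term) {
        // Count characters (not bytes) without needing the mbstring extension.
        $length = preg_match_all('/./us', $term);

        // Terms shorter than min_prefix_len are not prefix-indexed,
        // so leave them as plain keywords.
        $out[] = ($length >= $minPrefixLen) ? $term . '*' : $term;
    }

    return implode(' ', $out);
}
```

With this in place, the search string would be rewritten before it is queued, for example $sphinx->AddQuery(addPrefixWildcards($query), 'artists'); so that a user typing 50 Ce sends 50* Ce* to Sphinx.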

Ty W
Words can't describe my thanks :)
James
This is fantastic. I'm not sure if this is only true of newer versions of Sphinx or not, but you may not need to set `enable_star = 1` at all. I didn't modify from the default and my queries worked just the same. Also, for most use cases (not this one) I think having a `min_prefix_len` less than 4 may be unnecessary. If anyone can comment on performance here, I'd be most grateful.
Josh Smith
Ignore what I said about `min_prefix_len`. For search autocompletion, I'm finding it does wonders to have it set to 2.
Josh Smith