I am building a search app using Django and Sphinx. The setup works, but my searches return irrelevant results. Here is what I do:

# this is in my trial_data Model
search     = SphinxSearch(
                index    = 'trial_data trial_datastemmed',
                weights  = {'name': 100,},
                mode     = 'SPH_MATCH_ALL',
                rankmode = 'SPH_RANK_BM25',
                )

When I search, I get this (from my trial data):

from trial.models import *
res = trial_data.search.query('godfather')
for i in res: print i
3 Godfathers (7.000000)
Bonanno: A Godfather's Story (1999) (6.400000)
Disco Godfather (4.300000)
Godfather (6.100000)
Godfather: The Legend Continues (0.000000)
Herschell Gordon Lewis: The Godfather of Gore (2010) (6.900000)
Mafia: Farewell to the Godfather (0.000000)
Mumbai Godfather (2.600000)
Russian Godfathers (2005) (7.000000)
Stan Tracey: The Godfather of British Jazz (2003) (6.200000)
The Black Godfather (3.500000)
The Burglar's Godfather (0.000000)
The Fairy Godfather (0.000000)
The Fairy Godfather (0.000000)
The Godfather (9.200000)
The Godfather (1991) (6.400000)

The problem is that the most relevant result for "godfather" is shown at the 19th position, and all the top results are junk. How can I order or sort my results using django-sphinx?

Or rather, what can I do to make the results more relevant with this setup?

NOTE: I am using Python 2.6.x + Django 1.2.x + Sphinx 0.99 + django-sphinx 2.3.3 + MySQL.

Also, the data is custom-made and has only about 100 rows, with a single searchable field, name. There is one more field, rating (the number you see in brackets), which is an attribute (non-searchable).

A: 

As far as I can tell, there are two ways of going about this.

Firstly, there are the sort modes SPH_SORT_RELEVANCE, SPH_SORT_ATTR_DESC, SPH_SORT_ATTR_ASC, SPH_SORT_TIME_SEGMENTS and SPH_SORT_EXTENDED. I assume the keyword in the SphinxSearch constructor would be sortmode, but I couldn't find the docs.

search     = SphinxSearch(
                index    = 'trial_data trial_datastemmed',
                weights  = {'name': 100,},
                mode     = 'SPH_MATCH_ALL',
                rankmode = 'SPH_RANK_BM25',
                sortmode = 'SPH_SORT_RELEVANCE', # this was added
                )

Secondly, you can specify the sort order at query time:

res = trial_data.search.query('godfather').order_by('@relevance')

Both of these answers are guesses from looking at http://djangosnippets.org/snippets/231/. Let us know if it worked for you.
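
If neither option changes the ordering, a fallback that does not depend on django-sphinx is to re-sort the matches in Python, which is cheap for about 100 rows. The listing in the question looks alphabetical rather than ranked, so sorting on the rating attribute would at least bring "The Godfather (9.2)" to the top. This is only a sketch: the `results` list is hand-copied sample data standing in for the queryset, and `rerank` is a hypothetical helper, not part of django-sphinx.

```python
# Sample (name, rating) pairs copied from the question's output.
# In the real app these would come from the search result objects.
results = [
    ("3 Godfathers", 7.0),
    ("Disco Godfather", 4.3),
    ("Godfather", 6.1),
    ("Mumbai Godfather", 2.6),
    ("The Godfather", 9.2),
    ("The Godfather (1991)", 6.4),
]


def rerank(rows):
    """Return rows sorted by rating, highest first; ties fall back to name."""
    return sorted(rows, key=lambda row: (-row[1], row[0]))


ranked = rerank(results)
for name, rating in ranked:
    print("%s (%f)" % (name, rating))
```

This pulls the whole result set into memory, so it only makes sense for small indexes; for larger ones the sorting would need to happen inside Sphinx.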

Steven Rumbalski
@steven the `.order_by('@weight')` has no effect here since I am searching only one field, `name`. `@weight` makes sense if you are searching across more than one field. Coming back to this: I have tried it, and it has no effect on the `result_set`.
MovieYoda
@movieyoda I edited my answer to change '@weight' to '@relevance' based on http://www.davidcramer.net/code/79/in-depth-django-sphinx-tutorial.html.
Steven Rumbalski