Hi all,

I am writing a fairly simple Django application where users can enter string queries. The application will then search through the database for this string.

Entry.objects.filter(headline__contains=query)

This query is pretty straightforward, but not really helpful to someone who isn't 100% sure what they are looking for. So I expanded the search.

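# Note: django.utils.stopwords is not available in current Django releases.
from django.utils.stopwords import strip_stopwords

results = Entry.objects.filter(headline__contains=query)
if not results:
    # No exact hit: drop stopwords and match any of the remaining terms.
    query = strip_stopwords(query)
    for q in query.split():
        # QuerySets are combined with |, not +=
        results = results | Entry.objects.filter(headline__contains=q)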

I would like to add some additional functionality to this: searching for misspelled words, plurals, common homophones (words that sound the same but are spelled differently), etc. I was just wondering if any of these things are built into Django's query language. It isn't important enough for me to write a huge algorithm for; I am really just looking for something built in.

Thanks in advance for all the answers.

+4  A: 

You could try using Python's difflib module.

>>> from difflib import get_close_matches
>>> get_close_matches('appel', ['ape', 'apple', 'peach', 'puppy'])
['apple', 'ape']
>>> import keyword
>>> get_close_matches('wheel', keyword.kwlist)
['while']
>>> get_close_matches('apple', keyword.kwlist)
[]
>>> get_close_matches('accept', keyword.kwlist)
['except']

The problem is that difflib needs a list of words built from the database, and building it can be expensive. You could cache the list of words and rebuild it only once in a while.
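A minimal sketch of that idea, assuming the word list is built from the headlines and kept around between searches. The helper names `build_vocabulary` and `correct_query` are made up for illustration, and a plain list stands in for the database rows:

```python
import re
from difflib import get_close_matches


def build_vocabulary(headlines):
    """Collect the distinct lowercase words that appear in the headlines."""
    words = set()
    for headline in headlines:
        words.update(re.findall(r"[a-z0-9']+", headline.lower()))
    return sorted(words)


def correct_query(query, vocabulary, cutoff=0.6):
    """Replace each query term with its closest vocabulary word, if any."""
    corrected = []
    for term in query.lower().split():
        matches = get_close_matches(term, vocabulary, n=1, cutoff=cutoff)
        # Keep the term unchanged when nothing is close enough.
        corrected.append(matches[0] if matches else term)
    return " ".join(corrected)


# Assumption: in a Django app the headlines might come from something like
# Entry.objects.values_list("headline", flat=True), and the vocabulary
# would be cached instead of rebuilt on every request.
headlines = ["Apple harvest begins", "Peach prices fall", "New puppy arrives"]
vocabulary = build_vocabulary(headlines)
print(correct_query("appel harvset", vocabulary))
```

The corrected string can then be passed to the existing `headline__contains` filter. This only helps with misspellings; plurals and homophones would need something else.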

Some database systems support a search method that does what you want, like PostgreSQL's fuzzystrmatch module. If your database has something like that, you could try calling it.
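fuzzystrmatch includes a `levenshtein` function, among others. To show what that kind of matching does, here is a pure-Python edit-distance sketch. It is not the PostgreSQL implementation, and `fuzzy_filter` is a hypothetical helper; in practice you would let the database do this rather than loop over rows in Python:

```python
def levenshtein(a, b):
    """Return the edit distance between strings a and b."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def fuzzy_filter(query, candidates, max_distance=2):
    """Keep the candidates within max_distance edits of the query."""
    query = query.lower()
    return [c for c in candidates
            if levenshtein(query, c.lower()) <= max_distance]


print(fuzzy_filter("appel", ["ape", "apple", "peach", "puppy"]))
```

Note that a swapped pair of letters counts as two edits here, so the threshold has to be at least 2 to catch typos like "appel".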


edit:

For your new "requirement", well, you are out of luck. No, there is nothing like that built into Django's query language.

nosklo
+2  A: 

Django's ORM doesn't have this behavior out of the box, but there are several projects that integrate Django with search services, such as:

I can't speak to how well options #2 and #3 work, but I've used django-sphinx quite a lot and am very happy with the results.

mattdennewitz