Using the distance logic from this SO post, I'm getting back a properly-filtered set of objects with this code:

from django.db import connection, models


class LocationManager(models.Manager):
    def nearby_locations(self, latitude, longitude, radius, max_results=100, use_miles=True):
        # Mean radius of the Earth, in miles or kilometres.
        if use_miles:
            distance_unit = 3959
        else:
            distance_unit = 6371

        cursor = connection.cursor()

        # Great-circle distance via the spherical law of cosines.
        sql = """SELECT id, (%s * acos( cos( radians(%s) ) * cos( radians( latitude ) ) *
        cos( radians( longitude ) - radians(%s) ) + sin( radians(%s) ) * sin( radians( latitude ) ) ) )
        AS distance FROM locations_location HAVING distance < %s
        ORDER BY distance LIMIT 0 , %s;"""
        # Let the database driver bind the values rather than interpolating them into the SQL string.
        cursor.execute(sql, [distance_unit, latitude, longitude, latitude, int(radius), int(max_results)])
        ids = [row[0] for row in cursor.fetchall()]

        # ids is in distance order here, but the queryset below does not keep that order.
        return self.filter(id__in=ids)
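
For reference, the distance expression in that SQL is the spherical law of cosines. Below is a minimal Python sketch of the same formula, handy for spot-checking what the query returns. The function name great_circle_distance and the clamping step are assumptions added for illustration; they are not part of the original code.

```python
import math

EARTH_RADIUS_MILES = 3959
EARTH_RADIUS_KM = 6371


def great_circle_distance(lat1, lon1, lat2, lon2, use_miles=True):
    """Distance between two points given in degrees (spherical law of cosines)."""
    radius = EARTH_RADIUS_MILES if use_miles else EARTH_RADIUS_KM
    lat1_r = math.radians(lat1)
    lat2_r = math.radians(lat2)
    delta_lon = math.radians(lon2) - math.radians(lon1)

    # Same terms as the SQL: cos(lat1) * cos(lat2) * cos(dlon) + sin(lat1) * sin(lat2)
    cos_angle = (math.cos(lat1_r) * math.cos(lat2_r) * math.cos(delta_lon)
                 + math.sin(lat1_r) * math.sin(lat2_r))

    # Assumption (not in the SQL): clamp so floating-point rounding cannot
    # push the value outside acos()'s domain of [-1, 1].
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return radius * math.acos(cos_angle)
```

Calling great_circle_distance(lat, lon, row_lat, row_lon) for a few rows should give roughly the same values as the distance column in the query.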

The problem is I can't figure out how to keep the list/queryset sorted by the distance value. I don't want to do this with an extra() call for performance reasons (one query in total versus one query for each potential location in my database). A couple of questions:

  1. How can I sort my list by distance? Even taking off the native sort I've defined in my model and using "order_by()", it's still sorting by something else (id, I believe).
  2. Am I wrong about the performance thing and Django will optimize the query, so I should use extra() instead?
  3. Is this the totally wrong way to do this and I should use the geo library instead of hand-rolling this like a putz?
+1  A: 

To take your questions in reverse order:

Re 3) Yes, you should definitely take advantage of PostGIS and GeoDjango if you're working with geospatial data. It's just silly not to.

Re 2) I don't think you could quite get Django to do this query for you using .extra() (barring acceptance of this ticket), but it is an excellent candidate for the new .raw() method in Django 1.2 (see below).

Re 1) You are getting a list of ids from your first query, and then using an "in" query to get a QuerySet of the objects corresponding to those ids. Your second query has no access to the calculated distance from the first query; it's just fetching a list of ids (and it doesn't care what order you provide those ids in, either).

Possible solutions (short of ditching all of this and using GeoDjango):

  1. Upgrade to Django 1.2 beta and use the new .raw() method. This lets Django interpret the results of a raw SQL query and turn them into a RawQuerySet of actual model objects (iterable like a QuerySet, though it cannot be filtered further). That would reduce your current two queries to one and preserve the ordering you specify in SQL. This is the best option if you are able to make the upgrade.

  2. Don't bother constructing a Django queryset or Django model objects at all. Just add all the fields you need to the raw SQL SELECT and then use those rows directly from the cursor. This may not be an option if you need model methods etc. later on.

  3. Perform a third step in Python code, where you iterate over the queryset and construct a Python list of model objects in the same order as the ids list you got back from the first query. Return that list instead of a QuerySet. Won't work if you need to do further filtering down the line.
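
Option 3 can be sketched in a few lines of plain Python. The helper name order_by_id_list is hypothetical, and it assumes only that each object has an id attribute, as Django model instances do:

```python
def order_by_id_list(objects, ids):
    """Return the objects as a list, arranged in the same order as ids.

    Ids with no matching object are skipped.
    """
    # Index the objects by id once, so each lookup is O(1).
    by_id = {obj.id: obj for obj in objects}
    return [by_id[i] for i in ids if i in by_id]
```

In the manager from the question, the last line would then become something like return order_by_id_list(self.filter(id__in=ids), ids). The result is a plain list, so no further queryset filtering is possible on it.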

Carl Meyer
Thanks, that's about what I figured. If I can abuse your good nature one more time: I've done #2 before, I just felt dirty about it. Is there any way to just build a queryset object and stick the items into it in the order I want? I'm going to go with either 2 or 3 for now (site's launching tonight/tomorrow), but I'd like to know if that's possible.
Tom
Possible? Certainly. Easy or simple or clean? No. There's no reason to do that. If you need a dirty hack for now, use #2 or #3. Later on the right option is #1 or GeoDjango anyway.
Carl Meyer