I have a QuerySet that I wish to pass to a generic view for pagination:

links = Link.objects.annotate(votes=Count('vote')).order_by('-created')[:300]

This is my "hot" page which lists my 300 latest submissions (10 pages of 30 links each). I want to now sort this QuerySet by an algorithm that HackerNews uses:

(p - 1) / (t + 2)^1.5
p = votes minus submitter's initial vote
t = age of submission in hours
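
The formula above can be sketched as a small Python function. This is only an illustration: the name `hot_score`, the `gravity` parameter, and the `now` argument are made up for the example, and it assumes `votes` is the total vote count (including the submitter's own vote) and that `created` is a naive datetime.

```python
from datetime import datetime, timedelta

def hot_score(votes, created, now=None, gravity=1.5):
    """Score = (votes - 1) / (t + 2) ** gravity, with t = age in hours."""
    now = now or datetime.now()
    delta = now - created
    # timedelta only stores days and seconds, so build the hours by hand
    age_hours = delta.days * 24 + delta.seconds / 3600.0
    # ** is exponentiation in Python; ^ would be bitwise XOR
    return (votes - 1) / (age_hours + 2) ** gravity

# Two links with the same votes: the newer one should score higher.
now = datetime(2010, 5, 1, 12, 0, 0)
fresh = hot_score(5, now - timedelta(hours=2), now=now)   # (5 - 1) / 4 ** 1.5
stale = hot_score(5, now - timedelta(hours=14), now=now)  # (5 - 1) / 16 ** 1.5
```

Passing `now` explicitly keeps every link in one request scored against the same instant.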

Because applying this algorithm over the entire database would be pretty costly, I am content with just the last 300 submissions. My site is unlikely to be the next digg/reddit, so while scalability is a plus, it is not required.

My question is now how do I iterate over my QuerySet and sort it by the above algorithm?

For more information, here are my applicable models:

class Link(models.Model):
    category = models.ForeignKey(Category, blank=False, default=1)
    user = models.ForeignKey(User)
    created = models.DateTimeField(auto_now_add=True)
    modified = models.DateTimeField(auto_now=True)
    url = models.URLField(max_length=1024, unique=True, verify_exists=True)
    name = models.CharField(max_length=512)

    def __unicode__(self):
        return u'%s (%s)' % (self.name, self.url)

class Vote(models.Model):
    link = models.ForeignKey(Link)
    user = models.ForeignKey(User)
    created = models.DateTimeField(auto_now_add=True)

    def __unicode__(self):
        return u'%s vote for %s' % (self.user, self.link)

Notes:

  1. I don't have "downvotes", so the mere presence of a Vote row indicates a vote for a particular link by a particular user.

EDIT

I think I have been overcomplicating things, and I found a nifty little piece of code:

links = Link.objects.annotate(votes=Count('vote')).order_by('-created')[:300]
for link in links:
    link.popularity = ((link.votes - 1) / (2 + 2)**1.5)

But for the life of me I cannot get that to translate to my templates:

{% for link in object_list %}
    Popularity: {{ link.popularity }}
{% endfor %}

Why is it not showing up? I know popularity is working because:

print 'LinkID: %s - Votes: %s - Popularity: %s' % (link.id, link.votes, link.popularity)

returns what I'd expect in console.

+3  A: 

You can make a values dict or values list from your QuerySet if it's possible and apply your sorting algorithm to the dict(list) obtained. See

http://docs.djangoproject.com/en/dev/ref/models/querysets/#values-fields

http://docs.djangoproject.com/en/dev/ref/models/querysets/#values-list-fields

Example

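from datetime import datetime
from django.db.models import Count

# select links
links = Link.objects.annotate(votes=Count('vote')).order_by('-created')[:300]
# make a values list of (id, votes, created) tuples:
links = links.values_list('id', 'votes', 'created')
now = datetime.now()
# age in hours of a created datetime (x[2]); assumes naive datetimes
age = lambda created: (now - created).days * 24 + (now - created).seconds / 3600.0
# now sort, highest score first; ** is exponentiation (^ is bitwise XOR),
# and sorted() returns the new list, whereas list(links).sort() would discard it
links = sorted(links, key=lambda x: (x[1] - 1) / (age(x[2]) + 2) ** 1.5, reverse=True)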
dragoon
Could I bother you to provide an example? Perhaps adding 1 to every `votes` value. I'm having a hard time grasping this concept.
TheLizardKing
Ok, I have updated my answer with the example
dragoon
Thank you for the example! I am running into an issue, though: `'ValuesListQuerySet' object has no attribute 'sort'`. Am I missing an import?
TheLizardKing
Hmm, I suppose sort isn't an available option because it's not quite a list.
TheLizardKing
Yes, I updated my example, just make a list of it
dragoon
Getting closer methinks: `KeyError at /links/1` Hmm
TheLizardKing
But I will say I need to pass my QS to a generic view so I'm not sure turning it into a list is the best option.
TheLizardKing
So maybe you should add some redundancy and store the calculated link rank in an extra column?
dragoon
+1  A: 
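qs = [obj1, obj2, obj3]  # stands in for the queryset
s = []  # will hold (score, obj) pairs
for obj in qs:
    # submission_age is the age of the submission in hours
    s.append(((obj.votes - 1) / pow(obj.submission_age + 2, 1.5), obj))
# sort by score only, highest first, so tied scores never compare the objects
s.sort(key=lambda pair: pair[0], reverse=True)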

Here s should end up sorted from the highest calculated importance to the lowest, looking like:

[(calculated importance, obj), (calculated importance, obj), ...]

rebus
Since I am passing this calculated list to my template through a generic view, I think it needs to stay a QS.
TheLizardKing
Not using a generic view comes to mind as a solution. It's easy to write your own view that returns a sorted list in the Context, and I don't see a need for pagination on 5 items. Maybe you could also do this at the database level with `raw()`, making the database calculate the popularity and sort by it.
rebus
Doing it on a DB level has proven to be.... difficult. http://stackoverflow.com/questions/1965341/implementing-a-popularity-algorithm-in-django While folks have provided answers to that question by using .extra, they don't actually seem to work.
TheLizardKing
raw() works a bit differently from extra(): you write the plain SQL yourself and it returns model instances. And it would be much faster done in SQL, I think. http://docs.djangoproject.com/en/dev/topics/db/sql/#topics-db-sql
rebus
Haha, sigh, I'll have to get the development version I guess.
TheLizardKing
Maybe :) Django 1.2 is in RC1, I think, so it should be out pretty soon.
rebus
A: 

I was unable to calculate over a QuerySet; instead I had to convert it into a sorted list:

from datetime import datetime
from django.db.models import Count

links = Link.objects.select_related().annotate(votes=Count('vote'))
now = datetime.now()
for link in links:
    # timedelta arithmetic avoids strftime("%s"), which is not portable
    delta = now - link.created
    delta_in_hours = delta.days * 24 + delta.seconds / 3600.0
    link.popularity = (link.votes - 1) / (delta_in_hours + 2) ** 1.5

links = sorted(links, key=lambda x: x.popularity, reverse=True)

Not optimal, but it works. I can't use my lovely object_list generic view with its automatic pagination and have to resort to paginating manually, but that's a fair compromise for having a working view...
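
For what it's worth, paginating a plain sorted list by hand is only a few lines. The sketch below is pure Python and the `paginate` helper is a made-up name for the example. Django's own `django.core.paginator.Paginator` should also accept a list rather than a QuerySet, so that may be the simpler route in a real view.

```python
def paginate(items, page_number, per_page=30):
    """Return (page_items, total_pages) for a 1-based page number."""
    # ceiling division; an empty list still counts as one (empty) page
    total_pages = max(1, -(-len(items) // per_page))
    # clamp out-of-range page numbers instead of raising
    page_number = min(max(page_number, 1), total_pages)
    start = (page_number - 1) * per_page
    return items[start:start + per_page], total_pages

ranked = list(range(300))  # stand-in for the sorted list of links
page_items, total_pages = paginate(ranked, 2)
```

The view would then pass `page_items` and `total_pages` to the template in place of `object_list`.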

TheLizardKing