Here's how Paul Graham describes the ranking algorithm for Hacker News:

News.YC's is just

(p - 1) / (t + 2)^1.5

where p = points and t = age in hours
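As a quick sanity check of the formula, here is a small worked example in MySQL. The inputs (10 points at 3 hours old, then the same post at 24 hours old) are made-up values chosen only to show how fast the score decays with age.

```sql
-- Hypothetical inputs: p = 10 points, t = 3 hours.
SELECT (10 - 1) / POW(3 + 2, 1.5) AS score;   -- 9 / 5^1.5, roughly 0.805

-- Same post a day after submission: p = 10, t = 24 hours.
SELECT (10 - 1) / POW(24 + 2, 1.5) AS score;  -- 9 / 26^1.5, roughly 0.068
```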

I'd like to do that in pure MySQL given the following tables:

  • Table Posts with fields postID (index) and postTime (timestamp).
  • Table Votes with fields voteID (index), postID, and vote (integer, 0 or 1).

The idea of the vote field is that votes can be rescinded. For the purposes of the ranking, vote=0 is equivalent to no vote at all. (All votes are upvotes, no such thing as downvotes.)
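Under that scheme, a post's points are just the sum of its vote values, since rescinded votes contribute 0. A minimal sketch, assuming the table and column names listed above:

```sql
-- Points per post: vote is 0 or 1, so SUM counts only active votes.
-- Posts with no rows in Votes do not appear in this result at all.
SELECT postID,
       SUM(vote) AS points
  FROM Votes
 GROUP BY postID;
```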

The question is how to construct a query that returns the top N postIDs, sorted by Paul Graham's formula. There are approximately 100k posts altogether so if you think caching of the scores or anything will be needed, I'd love to hear advice about that.

(Obviously this is not rocket science and I can certainly figure it out but I figured someone who eats SQL for breakfast, lunch, and dinner could just rattle it off. And it seems valuable to have available on StackOverflow.)



+3  A: 

Untested:

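  SELECT x.*
    FROM Posts x
    JOIN (SELECT p.postID,
                 SUM(v.vote) AS points   -- vote is 0 or 1, so this counts active votes
            FROM Posts p
            JOIN Votes v ON v.postID = p.postID
        GROUP BY p.postID) y ON y.postID = x.postID
-- age in hours = seconds since postTime / 3600
ORDER BY (y.points - 1) / POW((UNIX_TIMESTAMP(NOW()) - UNIX_TIMESTAMP(x.postTime)) / 3600 + 2, 1.5) DESC
   LIMIT n   -- replace n with the number of posts to return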

I can't test because VMWare decided to release v7.1.2 :/
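One thing to be aware of: because the query above uses an inner join, posts that have no rows in Votes drop out of the result entirely. If those should still be ranked (with zero points), a variant along these lines might work. This is also an untested sketch; it assumes the table and column names from the question, uses TIMESTAMPDIFF instead of subtracting UNIX_TIMESTAMP values, and the LIMIT of 10 is an arbitrary stand-in for N.

```sql
-- LEFT JOIN keeps posts without any votes; COALESCE turns their NULL sum into 0.
SELECT p.postID,
       (COALESCE(SUM(v.vote), 0) - 1)
         / POW(TIMESTAMPDIFF(SECOND, p.postTime, NOW()) / 3600 + 2, 1.5) AS score
  FROM Posts p
  LEFT JOIN Votes v ON v.postID = p.postID
 GROUP BY p.postID, p.postTime
 ORDER BY score DESC
 LIMIT 10;
```

Posts with zero points get a small negative score under the formula, so they sort below everything that has at least one active vote.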

OMG Ponies