views:

302

answers:

8

I'm trying to come up with a way to determine how "hot" certain threads are in a forum. What criteria would you use and why? How would these come together to give a hotness score?

The criteria I'm thinking of include:

  • how many replies
  • how long since the last reply
  • average time between replies

The problems this algorithm must solve:

  • A thread which has 500 replies is clearly hot, unless the last reply was over a year ago.
  • A thread with 500 replies that was replied to a second ago is clearly hot, unless it's taken 4 years to reach 500 replies.
  • A thread with 15 replies in the last 4 minutes is really hot!

Any ideas, thoughts or complete solutions out there?

+3  A: 

this might be what you are looking for:

http://stackoverflow.com/questions/32397/popularity-algorithm

Kevin
A: 

I was thinking you could probably model it with diminishing waves here, using amplitude (or root mean square) to measure hotness. As time goes, the wave diminishes, and so a late reply will only cause a little stir.

In practice, I think this requires a lot of calculation. You could make good use of caching to speed up the calculation.

Just my two cents.

csl
+1  A: 

Simplest algorithm: If there have been greater than X replies since Y, it is hot.

If you prefer something that scales, just count how many replies since time y. More replies means more hotness.

Brian
A: 

In short I've found logarithmic decay of "hotness" to be the most natural.

Robert Gould
+2  A: 

Jeff Atwood has a nice question about this with a ton of information on other "hot" algorithms. I suggest using one of those and adapting it to your liking.

EndangeredMassa
A: 

Thanks to those who posted the links to the other questions/answers. Unfortunately, those equations take a lot more things into consideration than what's possible with my setup (eg: voting, reputation of author, etc)

After playing around with it, I've come up with this equation which I'll use for the time being:

log10($numOfReplies * 20000 / pow($timeSinceLastPost, 1.3))

It still could use some work. For example, if there's a really really popular but old thread, it'll be low on the hotness, but if one person replies to it that'll put it right back to the top for a few days/weeks.

nickf
A: 

Why not just use a sort of exponential decay model. Hotness of thread = sum( k^(time since posting) ) for all posts. This has the advantage of being really easy to update and calculate. You'd have to play around with k and your unit of time measurement (k should be < 1, but fairly close to it)

Current hotness = hotness at time of last post * k^(time since last post).
Hotness after new post = current hotness + 1

Yuliy
A: 

One thing you should pay some attention to is whether people might want to "game" the algorithm in order to make/keep their threads "hot". Actually, you can pretty much assume that they will.

The minimum you should do to discourage this is to only consider replies from different people.

Michael Borgwardt