How to implement a reputation system like Stack Overflow?

I've been looking into implementing a Stack Overflow-like site (for a completely different area of knowledge) for a little while now, and I have a question about the best way to implement reputation for a system like this.

Of course, that's a broad topic, so here are some specific questions that I have:

Calculation of Reputation

While I do believe that every action the user takes is persisted in some manner (asking/answering a question, voting on a question/answer, being voted on), which would allow reconstruction of a reputation score from scratch, the more I look at the site, the more implausible it seems that this is done every time a reputation score is needed.

To that end, I believe that the user's reputation is only calculated once per interval (every day, two days, week, month, etc.), and that activity past that interval is then added to the pre-calculated score.
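To make the interval idea concrete, here is a minimal Python sketch: a stored snapshot of the score plus the points from events recorded after the snapshot was taken. The point values, the `Event` and `Snapshot` types, and the function names are all assumptions made for illustration; nothing here is known to reflect how Stack Overflow actually does it.

```python
from dataclasses import dataclass
from datetime import datetime

# Hypothetical point values; the real ones are a design decision.
POINTS = {"upvote": 10, "downvote": -2, "accept": 15}


@dataclass
class Event:
    user_id: int      # the user whose reputation is affected
    kind: str         # a key into POINTS
    at: datetime      # when the action happened


@dataclass
class Snapshot:
    score: int
    taken_at: datetime


def reputation_as_of(snapshot, events, user_id, as_of):
    """Snapshot score plus points from events after the snapshot, up to as_of."""
    recent = sum(
        POINTS[e.kind]
        for e in events
        if e.user_id == user_id and snapshot.taken_at < e.at <= as_of
    )
    return snapshot.score + recent


def roll_snapshot(snapshot, events, user_id, now):
    """The periodic job: fold the recent events into a fresh snapshot."""
    return Snapshot(reputation_as_of(snapshot, events, user_id, now), now)
```

The periodic job would call `roll_snapshot` once per interval, while page views call `reputation_as_of` and only ever sum the events since the last snapshot.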

If this is indeed the case, does one think that this would be an automated process that occurs once at a specified interval, or is it something that happens the next time the user tries to perform an action which would affect reputation?

My guess is that because other people can have an impact on your reputation, calculating it when you perform an action that affects reputation is a bad idea, unless that operation is performed every time anyone performs an action that affects your reputation.

Or perhaps I have all of this wrong? Since any one action on the site can only really affect one person's reputation at a time, perhaps the reputation is kept as a running tally and changed every time an action is performed?

After all, the only actions that can really affect a user's reputation are upvoting, downvoting, and answer acceptance, so it wouldn't seem too hard to keep a running total.
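A rough sketch of the running-tally idea, in Python: each action is logged and the affected user's total is adjusted at the same moment, so the score can be read directly yet still be rebuilt from the log. The `ReputationLedger` class and the point values are hypothetical, chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical point values for the three actions named above.
POINTS = {"upvote": 10, "downvote": -2, "accept": 15}


class ReputationLedger:
    def __init__(self):
        self.log = []                     # every action, kept for auditing/rebuilds
        self.totals = defaultdict(int)    # running tally per user

    def record(self, target_user_id, kind):
        """Log the action and adjust the affected user's tally in one step."""
        self.log.append((target_user_id, kind))
        self.totals[target_user_id] += POINTS[kind]

    def rebuild(self, user_id):
        """Recompute a score from scratch; should always match the tally."""
        return sum(POINTS[kind] for uid, kind in self.log if uid == user_id)
```

Reading `ledger.totals[user_id]` is then a single lookup, and `rebuild` serves as a consistency check rather than something that runs on every request.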

Thoughts?

Permissions Based on Reputation

Given that permissions on the site are reputation-based, and it is a fluid system, what happens if a user wobbles back and forth over a permission boundary? Do they gain and then lose the permission?

Also, what are the thoughts on the impact of the above questions in relation to this one?
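One possible answer to the boundary question, sketched in Python: derive permissions from the current score at the moment of each action instead of storing them, so a user who drops below a threshold simply loses the privilege until the score recovers. The threshold values and names are assumptions for illustration only.

```python
# Hypothetical thresholds; a real site would choose its own.
THRESHOLDS = {"upvote": 15, "comment": 50, "downvote": 100}


def can(reputation, permission):
    """Check a single permission against the user's current score."""
    return reputation >= THRESHOLDS[permission]


def permissions_for(reputation):
    """All permissions the current score grants; nothing is stored per user."""
    return {name for name, needed in THRESHOLDS.items() if reputation >= needed}
```

Because nothing is persisted, there is no grant or revoke step to get out of sync; the trade-off is that the check depends on the reputation value being up to date when it is read.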

Solution

Eventually I might go with my/Adam Davis' answer, but I will only use that if scalability is an issue. For now, I believe that updating a running total is the best way. I'll post another question though if I find it is not.
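A minimal sketch of what updating a running total could look like against a relational store, using Python's built-in `sqlite3` as a stand-in for SQL Server. The table layout, column names, and point values are assumptions. The point is that the vote row and the tally update are written in the same transaction, so the two cannot drift apart.

```python
import sqlite3

# Hypothetical point values.
POINTS = {"upvote": 10, "downvote": -2, "accept": 15}


def open_db():
    """Create an in-memory database with an assumed users/votes schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE users (
            id INTEGER PRIMARY KEY,
            reputation INTEGER NOT NULL DEFAULT 0
        );
        CREATE TABLE votes (
            id INTEGER PRIMARY KEY,
            voter_id INTEGER NOT NULL,
            post_owner_id INTEGER NOT NULL REFERENCES users(id),
            kind TEXT NOT NULL
        );
    """)
    return conn


def cast_vote(conn, voter_id, post_owner_id, kind):
    """Record the vote and adjust the post owner's tally atomically."""
    delta = POINTS[kind]
    with conn:  # commits on success, rolls back if either statement fails
        conn.execute(
            "INSERT INTO votes (voter_id, post_owner_id, kind) VALUES (?, ?, ?)",
            (voter_id, post_owner_id, kind),
        )
        conn.execute(
            "UPDATE users SET reputation = reputation + ? WHERE id = ?",
            (delta, post_owner_id),
        )


def reputation(conn, user_id):
    """Read the stored tally; no summing over the votes table is needed."""
    row = conn.execute(
        "SELECT reputation FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    return row[0]
```

Since the `votes` table still holds every action, the tally can be re-derived and corrected later if it is ever suspected to be wrong.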

Implementation Details

The platform I am developing for is ASP.NET. Specifically, these are the technologies/components/services involved:

ASP.NET
ASP.NET MVC
SQL Server 2008
LINQ-to-SQL
reCAPTCHA (human verification)
Akismet (spam detection)
NValidate (might replace with code contracts at some point)
Microsoft Enterprise Library (specifically the Validation Application Block)
wmd (editor)
Markdown.NET (will be heavily modified in the future)
Microsoft Anti-Cross Site Scripting Library
RPX (for OpenID/Facebook/Google login support)

As for whether or not I will make it open source, it's separated pretty well so that it could easily be modified for any subject. There is some work to be done on this end though, as nothing is templatized, and the controllers are still in the website dll, instead of library dlls.

The controllers will definitely be refactored out into a referenced assembly, but I don't really have any plans to templatize the views. For me, it's just too specific to the site in question.

At this point, this is what I have:

Authentication

The login/logout routines are in place, along with automatic profile generation, but no editing on profiles yet.

Reputation

There is no reputation system in place yet.

Authorization

Since there is no reputation system in place, there is no authorization either.

Questions/Answers

The ability to post a new question is in place. This includes spam analysis, with subsequent CAPTCHA validation if the heuristic says that it is spam. This mechanism is generalized, so it can be applied to all input that is going to be displayed on the site (questions, answers, comments).

Answers are going to be worked on next. Since all the input validation/spam detection is already in place, it's just a matter of linking up the input to the models and getting the validation correct.

The model for past revisions of questions and answers is in the system, but there is no interface for it yet.

After answers, I am going to work on upvoting/downvoting questions/answers. I'm going to have to give some people reputation just to give them the ability to upvote. If everyone starts at zero, then no one has the reputation to vote, nor will they gain any.

Then all the other little things will have to be put in of course, but one day at a time. =)

Update 7/28/2010

Just wanted to let everyone know that calculating the reputation as a running total is working just fine, with no scalability issues. Granted, I don't have the throughput that SO does, but it's not non-existent either.

For those that wish to know, the site is based on the video game "Street Fighter 4", and was developed so that I can easily deploy sites just like it for other game topics (including the upcoming Marvel vs. Capcom 3 and ultimately the recently announced Street Fighter X Tekken game).

The site is:

http://sf4answers.com

@Adam Davis: So you are leaning towards the once-per-interval over the running tally method, correct? If so, then the question is, is the recalculation on a fixed interval separate from the user's actions, or tied to another user's next action after a threshold has been breached?

casperOne 2009-04-09 17:38:28

Once per interval, with the 'real time' score being added to that. How and when it's updated, though, depends on the design of the database and code.

Adam Davis 2009-04-09 17:42:31

@Adam Davis: What are the costs that I am missing with using a running-tally approach then? Since every action that could influence reputation requires a DB operation, what's the harm in updating one more value?

casperOne 2009-04-09 18:01:02

You'll have to run some tests yourself. Keep in mind that your "action table" is going to have hundreds of thousands of new actions a day, and running a sum along one affected user ID for the whole table is a lot more expensive than restricting the sum to a time period.

Adam Davis 2009-04-09 18:26:17

But again, it depends on your DB backend. I doubt my mental model matches your database schema, so I can only guess based on what I know. If you are simply updating a single record per user each time the rep changes, then that would certainly be fast enough.

Adam Davis 2009-04-09 18:27:42

@Adam Davis: Well, I'm thinking I would have a table for votes, with links to what is voted on. That's really all that's pertinent here. If I have to write the "vote" action every time, then when performing that action, why not just update the tally of total reputation when that table is hit?

casperOne 2009-04-09 18:38:07

@Dmitri Nesteruk: And where is this ranking applied? Is it applied to the user, or is it applied to order search results, perhaps? I like the idea, but I need more information on where the ranking would be applied.

casperOne 2009-04-09 17:58:37

@Dmitri Nesteruk: If it IS applied to the user, then one has to think about how reputation is calculated across the board, and that might be a task I'm not keen on taking up.

casperOne 2009-04-09 17:59:28

@Itay Moav: Curious, how far along are you in your implementation?

casperOne 2009-04-09 18:16:12

Somewhere in the middle, check for yourself: phpancake.sourceforge.net/demo
Not too much time (two kids and wife to feed :-D )

Itay Moav 2009-04-09 18:23:12

@Itay Moav: You are definitely further along than I am. Do you have spam/captcha installed? Also, it's in PHP? If it was in .NET, I'd suggest combining forces!

casperOne 2009-04-09 18:39:56

For .NET, you might want to check ra-ajax; they have a .NET open-source version of SO. I am just now beginning to think about the least intrusive ways to protect my site (see my latest question; it is general and can help you too).

Itay Moav 2009-04-09 18:44:23

@Itay Moav: I've seen it, but I found it to be too different from the SO model in certain areas and I didn't want to hack apart someone else's code. Additionally, there doesn't seem to be anything regarding spam or human verification in there (I might have missed it) and that's a major concern.

casperOne 2009-04-09 19:05:53

@Zifre: I don't think that's a good idea, because then you are going to overload the system trying to run automated processes on the user's time. It also gives them input into the processing of the system, which isn't really fair. Also, SO has indicated that start-of-day is 00:00 GMT.

casperOne 2009-04-09 18:19:54

It possibly lends itself to gaming problems as well.

Adam Davis 2009-04-09 18:29:21

You could prevent gaming by only allowing users to change the setting for the next day, extending the current day until the next start of their day. And you could have only 4-6 automated processes, which should cover anyone's night.

Zifre 2009-04-09 22:24:20
