I'm implementing a Stack Overflow-like reputation system on my rap lyrics explanation site, Rap Genius:

  • Good explanation: +10
  • Bad explanation: -1
  • Have the most explanations on a song: +30

My question is how to implement this. Specifically, I'm trying to decide whether I should create a table of reputation_events to aid in reputation re-calculation, or whether I should just recalculate from scratch whenever I need to.

The table of reputation_events would have columns for:

  • name (e.g., "good_explanation", "bad_explanation")
  • awarded_to_id
  • awarded_by_id
  • awarded_at

Whenever something happens that affects reputation, I insert a corresponding row into reputation_events. This makes it easy to recalculate reputation and to generate a human-readable sequence of events that generated a given person's reputation.
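For illustration, here is a minimal sketch of that table and the two things it makes easy (recalculation and a readable history), using Python's built-in sqlite3 module. The helper names `record_event`, `reputation`, and `history`, the `id` column, the nullable `awarded_by_id`, and the sample rows are all assumptions made for the example, not part of the actual site.

```python
import sqlite3

# Point values taken from the list at the top of the question.
POINTS = {
    "good_explanation": 10,
    "bad_explanation": -1,
    "has_the_most_explanations": 30,
}

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE reputation_events (
        id            INTEGER PRIMARY KEY,  -- assumption: surrogate key
        name          TEXT NOT NULL,
        awarded_to_id INTEGER NOT NULL,
        awarded_by_id INTEGER,              -- assumption: NULL when the system awards it
        awarded_at    TEXT NOT NULL         -- ISO-8601 timestamp
    )
    """
)

def record_event(name, awarded_to_id, awarded_by_id, awarded_at):
    """Insert one row for something that affected a user's reputation."""
    if name not in POINTS:
        raise ValueError(f"unknown event name: {name}")
    conn.execute(
        "INSERT INTO reputation_events"
        " (name, awarded_to_id, awarded_by_id, awarded_at)"
        " VALUES (?, ?, ?, ?)",
        (name, awarded_to_id, awarded_by_id, awarded_at),
    )

def reputation(user_id):
    """Recalculate a user's reputation by summing the points of their events."""
    rows = conn.execute(
        "SELECT name FROM reputation_events WHERE awarded_to_id = ?",
        (user_id,),
    )
    return sum(POINTS[name] for (name,) in rows)

def history(user_id):
    """Return a human-readable list of the events behind a user's reputation."""
    rows = conn.execute(
        "SELECT awarded_at, name FROM reputation_events"
        " WHERE awarded_to_id = ? ORDER BY awarded_at, id",
        (user_id,),
    )
    return [f"{awarded_at}: {name} ({POINTS[name]:+d})" for awarded_at, name in rows]

# Made-up sample rows for user 1.
record_event("good_explanation", 1, 2, "2010-01-01T10:00:00")
record_event("bad_explanation", 1, 3, "2010-01-02T10:00:00")
record_event("has_the_most_explanations", 1, None, "2010-01-03T10:00:00")

print(reputation(1))  # 10 - 1 + 30 = 39
print("\n".join(history(1)))
```

Keeping the point values in one mapping rather than in each row means a rule change (say, +15 for a good explanation) only needs a recalculation, not a data migration.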

On the other hand, any given action could affect multiple users' reputations. E.g., suppose user A overtakes user B on a given song; under the "Have the most explanations on a song" rule, I would have to remember to delete B's original "has_the_most_explanations" event (or maybe add a new, offsetting event for B?).
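The "add a new event for B" option could look roughly like the sketch below, where the event list is append-only and a lost bonus is recorded as its own negative event. The event name `lost_the_most_explanations`, the in-memory structures, and the tie rule (nobody holds the bonus while two users are tied) are assumptions for the example.

```python
from collections import Counter

POINTS = {
    "good_explanation": 10,
    "bad_explanation": -1,
    "has_the_most_explanations": 30,
    "lost_the_most_explanations": -30,  # hypothetical offsetting event
}

events = []              # append-only: (name, awarded_to_id, song_id)
explanation_counts = {}  # song_id -> Counter mapping user_id -> number of explanations
holders = {}             # song_id -> user_id currently holding the bonus, or None

def sole_leader(counts):
    """Return the single user with the most explanations, or None on a tie."""
    top = max(counts.values())
    leaders = [user for user, n in counts.items() if n == top]
    return leaders[0] if len(leaders) == 1 else None

def add_explanation(song_id, user_id):
    """Count a new explanation and append events if the bonus changes hands."""
    counts = explanation_counts.setdefault(song_id, Counter())
    counts[user_id] += 1
    previous = holders.get(song_id)
    current = sole_leader(counts)
    if current == previous:
        return
    if previous is not None:
        events.append(("lost_the_most_explanations", previous, song_id))
    if current is not None:
        events.append(("has_the_most_explanations", current, song_id))
    holders[song_id] = current

def reputation(user_id):
    """Sum the points of every event awarded to this user."""
    return sum(POINTS[name] for name, to_id, _ in events if to_id == user_id)

add_explanation(7, "B")  # B is the only explainer, so B gets the bonus
add_explanation(7, "A")  # A ties B, so B loses the bonus
add_explanation(7, "A")  # A now leads alone, so A gets the bonus

print(reputation("A"), reputation("B"))  # 30 0
```

Nothing is ever deleted in this variant, so the history still shows that B once held the bonus and when it was lost.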

A: 

I would keep a reputation event list, both for recalculation and for being able to track down why the total rep value is what it is.

But why have a "name" column? Why not just store a value that is either a positive or a negative int?

This table will get huge, so make sure you cache the totals.
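A rough sketch of both suggestions together: each event stores a signed int value, and a running total per user is cached so that reads never scan the full list. The names `award`, `recalculate`, and `cache_is_consistent` are made up for the example, and the in-memory structures stand in for a real table and cache.

```python
from collections import defaultdict

events = []                       # full history: (awarded_to_id, value)
cached_totals = defaultdict(int)  # user_id -> cached reputation total

def award(user_id, value):
    """Append an event with a signed value and update the cached total."""
    events.append((user_id, value))
    cached_totals[user_id] += value

def recalculate(user_id):
    """Recompute a user's total from the event list (slow path)."""
    return sum(value for to_id, value in events if to_id == user_id)

def cache_is_consistent(user_id):
    """Check that the cached total matches a full recalculation."""
    return cached_totals[user_id] == recalculate(user_id)

award("A", 10)  # good explanation
award("A", -1)  # bad explanation
award("A", 30)  # most explanations on a song

print(cached_totals["A"])        # 39
print(cache_is_consistent("A"))  # True
```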

jordanstephens
+2  A: 

In general, I never like data to exist in more than one place. It sounds like your "reputation_events" table would contain data that can be calculated from other data. If so, I'd recalculate from scratch, unless the performance impact becomes a real problem.

When you have calculated data stored, you have the possibility that it may not correspond correctly with the base data -- basically a corrupted state. Why even make it possible if you can avoid it?
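As a sketch of what recalculating from scratch might look like, the function below derives every user's reputation directly from the explanations themselves, with no stored events. The shape of the base data (one `(song_id, author_id, is_good)` tuple per explanation), the function name `reputations`, and the tie rule (no bonus when two users are tied for the most explanations) are assumptions for the example.

```python
from collections import Counter, defaultdict

# Point values from the question.
GOOD, BAD, MOST = 10, -1, 30

def reputations(explanations):
    """Compute all reputations from base data.

    explanations: iterable of (song_id, author_id, is_good) tuples,
    a hypothetical stand-in for the site's real explanation records.
    """
    rep = defaultdict(int)
    per_song = defaultdict(Counter)  # song_id -> author_id -> explanation count
    for song_id, author_id, is_good in explanations:
        rep[author_id] += GOOD if is_good else BAD
        per_song[song_id][author_id] += 1
    for counts in per_song.values():
        top = max(counts.values())
        leaders = [user for user, n in counts.items() if n == top]
        if len(leaders) == 1:  # assumption: a tie awards the bonus to nobody
            rep[leaders[0]] += MOST
    return dict(rep)

sample = [
    (7, "B", True),   # B: one good explanation on song 7
    (7, "A", True),   # A: one good explanation on song 7
    (7, "A", False),  # A: one bad explanation on song 7, so A leads 2 to 1
]
print(reputations(sample))  # {'B': 10, 'A': 39}
```

Because the result depends only on the current explanations, the "A overtakes B" case needs no special handling: B's bonus simply is not there the next time the totals are computed.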

JacobM