views:

148

answers:

4

I am looking to design a database for a website where users will be able to gain points (reputation) for performing certain activities and am struggling with the database design.

I am planning to keep records of the things a user does so they may have 25 points for an item they have submitted, 1 point each for 30 comments they have made and another 10 bonus points for being awesome!

Clearly all the data will be there, but it seems like a lot or querying to get the total score for each user which I would like to display next to their username (in the form of a level). For example, a query to the submitted items table to get the scores for each item from that user, a query to the comments table etc. If all this needs to be done for every user mentioned on a page.... LOTS of queries!

I had considered keeping a score in the user table, which would seem a lot quicker to look up, but I've had it drummed into me that storing data that can be calculated from other data is BAD!

I've seen a lot of sites that do similar things (even stack overflow does similar) so I figure there must be a "best practice" to follow. Can anyone suggest what it may be?

Any suggestions or comments would be great. Thanks!

+1  A: 

For very high read/write ratios, denormalizing is a very valid option. You can use an indexed view and the data will be kept in sync declaratively (so you never have to worry about there being bad score data). The downside is that it IS kept in sync.. so the updates to the store total are a synchronous aspect of committing the score action. This would normally be quite fast, but it is a design decision. If you denormalize yourself, you can choose if you want to have some kind of delayed update system.

Personally I would go with an indexed view for starting, and then later you can replace it fairly seamlessly with a concrete table if your needs dictate.

Sorry, I'm dumb - didn't notice the mysql aspect. I don't think that has indexed views yet. The normal alternative to force it to keep in sync are to setup triggers.
Triggers make baby jesus cry. Well, they make me cry, anyway. Even if it's not indexed, a view is your friend here.
Adrien
A: 

In the past we've always used some sort of nightly or perodic cron job to calculate the current score and save it in the database - sort of like a persistent view of the SUM on the activities table. Like most "best practices" they are simply guidelines and it's often better and more practical to deviate from a specific hard nosed practice on very specific areas.

Plus it's not really all that much of a deviation if you use the cron job as it's better viewed as a cache stored in the database.

Paul Alexander
A: 

If you have a separate scores table, you could update it each time an item is submitted or a comment is posted by a user. You could do this using a trigger or within the sites code.

The user scores would be updated continuously, and could be quickly queried for display.

stukelly
+2  A: 

I think that this is definitely a great question. I've had to build systems that have similar behavior to this--especially when the table with the scores in it is accessed pretty often (like in your scenario). Here's my suggestion to you:

First, create some tables like the following (I'm using SQL Server best practices, but name them however you see fit):

UserAccount          UserAchievement
 -Guid (PK)           -Guid (PK)
 -FirstName           -UserAccountGuid (FK)
 -LastName            -Name
 -EmailAddress        -Score

Once you've done this, go ahead and create a view that looks something like the following (no, I haven't verified this SQL, but it should be a good start):

SELECT [UserAccount].[FirstName]      AS FirstName,
       [UserAccount].[LastName]       AS LastName,
       SUM([UserAchievement].[Score]) AS TotalPoints
FROM [UserAccount]
INNER JOIN [UserAchievement]
     ON [UserAccount].[Guid] = [UserAchievement].[UserAccountGuid]
GROUP BY [UserAccount].[FirstName],
         [UserAccount].[LastName]
ORDER BY [UserAccount].[LastName] ASC

I know you've mentioned some concern about performance and a lot of queries, but if you build out a view like this, you won't ever need more than one. I recommend not making this a materialized view; instead, just index your tables so that the lookups that you need (essentially, UserAccountGuid) will enable fast summation across the table.

I will add one more point--if your UserAccount table gets huge, you may consider a slightly more intelligent query that would incorporate the names of the accounts you need to get roll-ups for. This will make it possible not to return huge data sets to your web site when you're only showing, you know, 3-10 users' information on the page. I'd have to think a bit more about how to do this elegantly, but I'd suggest staying away from "IN" statements since this will invoke a linear search of the table.

Ed Altorfer