I have been browsing this site for the answer, but I'm still a little unsure how to plan a similar system in terms of its database structure and implementation.

In PHP and MySQL it would be clear that some achievements are earned immediately (when a specialized action is taken; in SO's case, filling out all profile fields), although I know SO updates and assigns badges after a certain amount of time. With so many users and badges, wouldn't this create performance problems (in terms of scale: a high number of both users and badges)?

So I assume the database structure would be something as simple as:

Badges     |    Badges_User      |    User
----------------------------------------------
bd_id      |    bd_id            |  user_id
bd_name    |    user_id          |  etc
bd_desc    |    assigned(bool)   |  
           |    assigned_at      |

But as some people have said, it would be better to have an incremental-style approach, so that a user who has 1,000,000 forum posts won't slow any function down.

Would it then be another table for badges that could be incremental or just a 'progress' field in the badges_user table above?

Thanks for reading, and please focus on the scalability of the desired system (like SO: thousands of users and 20 to 40 badges).

EDIT: To iron out some confusion: I had assigned_at as a Date/Time. The criteria for awarding each badge would be best placed inside prepared queries/functions for each badge, wouldn't they? (Better flexibility.)

+3  A: 

regarding the sketch you included: get rid of the boolean column on badges_user. it makes no sense there: that relation is defined in terms of the predicate "user user_id earned the badge bd_id at assigned_at".

as for your overall question: define the schema to be relational without regard for speed first (that'll get you rid of half of potential perf. problems, possibly in exchange for different perf. problems), index it properly (what's proper depends on the query patterns), then if it's slow, derive a (still relational) design from that that's faster. like you may need to have some aggregates precomputed, etc.
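
A minimal sketch of that advice, assuming Python's built-in sqlite3 module as a stand-in for MySQL. The table and column names follow the question's sketch; the index name, the helper functions `award` and `badges_of`, and the sample badge row are hypothetical. The composite primary key makes "a row exists" mean "the badge was earned", so no boolean is needed, and the secondary index covers the "who holds this badge?" query pattern.

```python
import sqlite3

# Hypothetical schema; SQLite is used here only as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users  (user_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE badges (bd_id INTEGER PRIMARY KEY, bd_name TEXT NOT NULL, bd_desc TEXT);
-- A row exists only if the badge was earned, so no boolean column is needed.
CREATE TABLE badges_user (
    user_id     INTEGER NOT NULL REFERENCES users(user_id),
    bd_id       INTEGER NOT NULL REFERENCES badges(bd_id),
    assigned_at TEXT    NOT NULL DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (user_id, bd_id)
);
-- Secondary index for looking up all holders of one badge.
CREATE INDEX idx_badges_user_badge ON badges_user (bd_id);
""")

def award(conn, user_id, bd_id):
    """Record that a user earned a badge; awarding it twice is a no-op."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO badges_user (user_id, bd_id) VALUES (?, ?)",
        (user_id, bd_id),
    )
    return cur.rowcount == 1  # True only when a new row was written

def badges_of(conn, user_id):
    """Return the names of all badges a user holds, sorted by name."""
    rows = conn.execute(
        "SELECT b.bd_name FROM badges_user bu "
        "JOIN badges b ON b.bd_id = bu.bd_id "
        "WHERE bu.user_id = ? ORDER BY b.bd_name",
        (user_id,),
    )
    return [r[0] for r in rows]

# Illustrative sample data only.
conn.execute("INSERT INTO users VALUES (1, 'alice')")
conn.execute(
    "INSERT INTO badges VALUES (1, 'Autobiographer', 'Filled out all profile fields')"
)
```

In MySQL the rough equivalent of `INSERT OR IGNORE` would be `INSERT IGNORE`, and which extra indexes are worth having still depends on the queries the site actually runs.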

just somebody
+2  A: 

I think the structure you've suggested (without the "assigned" field as per the comments) would work, with the addition of another table, say "Submissions_User", containing a reference to user_id & an incrementing field for counting submissions. Then all you'd need is an "event listener" as per this post and methinks you'd be set.

EDIT: For the achievement badges, run the event listener upon each submission (only for the user making the submission of course), and award any relevant badge on the spot. For the time-based badges, I would run a CRON job each night. Loop through the complete user list once and award badges as applicable.
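
A rough sketch of both halves, assuming Python's sqlite3 module in place of PHP/MySQL. The badge ids, the thresholds, the `joined_at` column and the function names `on_submission` and `nightly_job` are all made up for illustration. Note one swap: instead of looping over the user list in application code, the nightly job here issues one set-based statement per time-based badge and lets the database do the looping.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (user_id INTEGER PRIMARY KEY, joined_at TEXT NOT NULL);
CREATE TABLE badges_user (
    user_id INTEGER NOT NULL, bd_id INTEGER NOT NULL,
    assigned_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (user_id, bd_id));
CREATE TABLE submissions_user (
    user_id INTEGER PRIMARY KEY,
    submissions INTEGER NOT NULL DEFAULT 0);
""")

# Hypothetical rules: badge id -> number of submissions needed.
SUBMISSION_BADGES = {10: 1, 11: 100}
# Hypothetical time-based badge: member for at least one year.
ONE_YEAR_BADGE = 20

def on_submission(conn, user_id):
    """Event listener: bump the counter, then check only this user's badges."""
    conn.execute(
        "INSERT OR IGNORE INTO submissions_user (user_id) VALUES (?)", (user_id,))
    conn.execute(
        "UPDATE submissions_user SET submissions = submissions + 1 "
        "WHERE user_id = ?", (user_id,))
    (count,) = conn.execute(
        "SELECT submissions FROM submissions_user WHERE user_id = ?",
        (user_id,)).fetchone()
    for bd_id, needed in SUBMISSION_BADGES.items():
        if count >= needed:
            # The primary key makes a repeated award a no-op.
            conn.execute(
                "INSERT OR IGNORE INTO badges_user (user_id, bd_id) VALUES (?, ?)",
                (user_id, bd_id))

def nightly_job(conn, today):
    """Cron-style pass: one set-based statement per time-based badge."""
    conn.execute(
        "INSERT OR IGNORE INTO badges_user (user_id, bd_id) "
        "SELECT user_id, ? FROM users WHERE joined_at <= date(?, '-1 year')",
        (ONE_YEAR_BADGE, today))
```

Because the listener reads a precomputed counter rather than counting rows, a user with 1,000,000 submissions costs the same to check as a user with one.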

da5id
OK, this is the best method. What about implementation? As in my question, SO awards badges after a set time, so every few hours. Is it then [foreach(badge)] -> do all users, or [foreach(user)] -> do all badges, and does this make a difference? I could then perhaps check badges that are less likely to be awarded at longer intervals. Thoughts? I'll accept this as the answer.
bluedaniel
Incidentally, I too have a site that I've been thinking of implementing "badges" on, so thanks for helping me think it through :)
da5id
I think it's a fantastic idea that really adds a 'wow' factor and pushes your users to interact and engage with your site. There's a reason every single console game now has them.
bluedaniel
Yes I agree. Like you, I have a "community" site that I put together single-handedly. It's something I've always meant to do, but have always had more pressing issues to take care of first.
da5id
+1  A: 

Hi bluedaniel

I think this is one of those cases where your many-to-many table (Badges_User) is appropriate.
But with a small alteration so that unassigned badges aren't stored.

I assume assigned_at is a date and/or time.
Default is that the user does not have the badges.

Badges     |    Badges_User      |  User
----------------------------------------------
bd_id      |    bd_id            |  user_id
bd_name    |    user_id          |  etc
bd_desc    |    assigned_at      |  
           |                     |

This way only badges actually awarded are stored.
A Badges_User row is only created when a user gets a badge.

Regards
    Sigersted

Sigersted
Humm ... I see that it's the same thing da5id is writing about ...
Sigersted
+1  A: 

I would keep a similar type structure to what you have

Badges(badge_id, badge_name, badge_desc)
Users(user_id, etc)
UserBadges(badge_id, user_id, date_awarded)

And then add tracking table(s) depending on what you want to track and at what level of detail... then you can update the table accordingly and set triggers on it to "award" the badges

User_Activity(user_id, posts, upvotes, downvotes, etc...)

You can also track stats from the other direction too and trigger badge awards

Posts(post_id, user_id, upvotes, downvotes, etc...)
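
As an illustration of the trigger idea, here is a small sketch using Python's sqlite3 module. It uses SQLite trigger syntax, which differs from MySQL's, and the rule itself (badge 1 at 3 posts), the trigger name and the helper `record_post` are hypothetical. The tables are cut-down versions of UserBadges and User_Activity above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user_badges (
    badge_id INTEGER NOT NULL, user_id INTEGER NOT NULL,
    date_awarded TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (badge_id, user_id));
CREATE TABLE user_activity (
    user_id INTEGER PRIMARY KEY,
    posts   INTEGER NOT NULL DEFAULT 0,
    upvotes INTEGER NOT NULL DEFAULT 0);

-- Hypothetical rule: badge 1 is awarded when a user reaches 3 posts.
-- The WHEN clause fires the body only on the update that crosses the threshold.
CREATE TRIGGER award_post_badge
AFTER UPDATE OF posts ON user_activity
WHEN NEW.posts >= 3 AND OLD.posts < 3
BEGIN
    INSERT OR IGNORE INTO user_badges (badge_id, user_id)
    VALUES (1, NEW.user_id);
END;
""")

def record_post(conn, user_id):
    """Update the tracking row; the trigger decides whether to award."""
    conn.execute(
        "INSERT OR IGNORE INTO user_activity (user_id) VALUES (?)", (user_id,))
    conn.execute(
        "UPDATE user_activity SET posts = posts + 1 WHERE user_id = ?",
        (user_id,))
```

The application code only maintains the counters; the award logic lives next to the data. The trade-off is that badge rules written as triggers are harder to change and test than rules kept in application code.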


Some other good points are made here

CheeseConQueso