Why do sites like stackoverflow with badges use some type of delayed job to determine when to award a new badge?

views:

173

answers:

+4 Q:


Stack Overflow has a nifty badge system. One thing I noticed is that badges are not awarded immediately; there sometimes seems to be a delay after I meet the criteria. I've noticed this on some other sites that have badges as well.

Presumably this is because they are using a delayed job that scans periodically to see if any new badges need to be awarded. I see this approach also advised here:
http://stackoverflow.com/questions/3162446/how-to-implement-badges

However, I don't really see why this should be necessary. In my implementation I'm leaning toward a simpler system: after a relevant action is performed (for example, a new comment is posted), a checkAwardBadge function is called, which checks whether the user meets the criteria for a new comment badge.

Speedwise, I was thinking that all relevant user stats would simply be stashed in a submodel of User, like UserStats, so that instead of having to count the number of comments each time, it would just be a simple query.

It strikes me that the system I'm favoring should be fast and very simple to understand. Are there downsides I'm missing that make it necessary to complicate things with delayed jobs?

To clarify: I plan to have an abstract class Achievements, with each actual Achievement an implementation of Achievements. Each Achievement will have a checkAwardBadge function, which can be called from the controller, or even a delayed job if I should choose to go that route, or any time really, to check whether a user has earned a certain badge. Thus, achievement code would all be centralized.
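
A minimal Python sketch of the design described above, for illustration only. The `Commentator` badge, its threshold of 10, and the `comment_count` field are assumptions, not taken from any real site; `check_award_badge` is the checkAwardBadge function under Python naming.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field


@dataclass
class UserStats:
    """Hypothetical per-user counters, kept up to date as actions happen."""
    comment_count: int = 0


@dataclass
class User:
    name: str
    stats: UserStats = field(default_factory=UserStats)
    badges: set = field(default_factory=set)


class Achievement(ABC):
    """Base class; each concrete badge implements is_earned()."""
    name: str

    @abstractmethod
    def is_earned(self, user: User) -> bool:
        ...

    def check_award_badge(self, user: User) -> bool:
        """Award the badge if newly earned; return True only when awarded."""
        if self.name in user.badges or not self.is_earned(user):
            return False
        user.badges.add(self.name)
        return True


class Commentator(Achievement):
    name = "Commentator"
    threshold = 10  # assumed value, for illustration only

    def is_earned(self, user: User) -> bool:
        return user.stats.comment_count >= self.threshold


def post_comment(user: User) -> None:
    """Stand-in for the controller action: update stats, then check the badge."""
    user.stats.comment_count += 1
    Commentator().check_award_badge(user)
```

Because `check_award_badge` only reads the stored stats, the same method could be called from a controller or from a delayed job without changes.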

+2 A:

It could be so that if an action is done and immediately undone, it won't result in a badge being awarded.
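
One way this might look, as a rough Python sketch: a periodic pass that counts only actions that are older than a grace period and have not been undone. The `Vote` record, the 300-second grace period, and the threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Vote:
    user: str
    created_at: float  # seconds, on whatever clock the caller uses
    undone: bool = False


def users_due_badge(votes, now, grace_seconds=300.0, threshold=1):
    """Return users whose surviving votes, older than the grace period, reach the threshold."""
    counts = {}
    for v in votes:
        # Skip actions that were undone or are still inside the grace period.
        if v.undone or now - v.created_at < grace_seconds:
            continue
        counts[v.user] = counts.get(v.user, 0) + 1
    return sorted(u for u, n in counts.items() if n >= threshold)
```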

Justin 2010-07-29 19:12:57

+1 A:

I always assumed the delay was because it is faster to serve static content. I think this is common on high-traffic sites: they periodically update static content instead of generating it for each web request.

The periodic job would just generate new static content, and would run very frequently, but less frequently than every single page request.
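
A small Python sketch of the idea, with one assumption worth flagging: instead of a separate scheduled job, this variant rebuilds the content lazily on the first request after the interval has elapsed. The class name and the injectable `clock` parameter are made up for illustration.

```python
import time


class PeriodicSnapshot:
    """Serve a cached rendering, rebuilding it at most once per interval."""

    def __init__(self, render, interval_seconds, clock=time.monotonic):
        self._render = render              # function that produces the content
        self._interval = interval_seconds
        self._clock = clock                # injectable so it can be tested
        self._built_at = None
        self._content = None

    def get(self):
        now = self._clock()
        # Rebuild only if nothing is cached yet or the cached copy is too old.
        if self._built_at is None or now - self._built_at >= self._interval:
            self._content = self._render()
            self._built_at = now
        return self._content
```

Every request between rebuilds gets the cached copy, which is why a newly earned badge could take a while to show up.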

Brandon Horsley 2010-07-29 19:13:12

+7 A:

Your implementation may work in simple scenarios (like the one you are describing), but if things get more complex you have a solution that:

1. Makes unnecessary checks in every action
2. Adds a performance penalty to every action
3. Does not scale
4. Does not have a central place for all the rules

Eduardo Molteni 2010-07-29 19:13:41

(1) and (2): The performance penalty would be very slight here. It is only a simple one-row query to the database, followed by some simple logic like (X > 30). (3) Could you explain why this wouldn't scale? (4) Actually, it would have a central place. I plan to create an Achievements class, with each Achievement a subclass. I would then only need to add a single line to the controller code in most cases to call checkAwardBadge on the relevant badge.

WIlliam Jones 2010-07-29 19:18:37

(3) Just an example: suppose you have an action, like deleting a comment, that does not trigger any badge. If you later need to award a badge when the user deletes a comment, you have to go back, find all the code that deletes a comment, and add the call to checkAwardBadge. Imagine that in more complex scenarios.
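
For contrast, here is a rough Python sketch of the periodic-job alternative, where all rules live in one table. It assumes the per-user stats can be derived from stored data (for example, by counting rows), so no controller has to change when a rule is added. The badge names, stat keys, and thresholds are invented for illustration.

```python
# Each rule maps a badge name to a predicate over a stats dict (all names hypothetical).
BADGE_RULES = {
    "Commentator": lambda s: s.get("comments_posted", 0) >= 10,
    # Added later: only this table changes, not the code that deletes comments.
    "Tidy": lambda s: s.get("comments_deleted", 0) >= 1,
}


def award_badges(stats_by_user, badges_by_user):
    """One pass of the periodic job: return the (user, badge) pairs newly awarded."""
    awarded = []
    for user, stats in stats_by_user.items():
        owned = badges_by_user.setdefault(user, set())
        for badge, rule in BADGE_RULES.items():
            if badge not in owned and rule(stats):
                owned.add(badge)
                awarded.append((user, badge))
    return awarded
```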

Eduardo Molteni 2010-07-29 19:30:40

+2 A:

While this only loosely parallels the scenario you're describing, I feel that discussing what we do at my job might help illuminate part of the reasoning for this approach.

I work for a real-time algorithmic trading company. Part of what our software does is process market data from a vendor.

Now, there's stuff that needs to happen in response to every individual market tick. We run analytics, have safety triggers that take effect in certain cases, etc. But what we avoid doing at all costs is bloating the code that reacts to market events with all of this "secondary" logic.

The reasoning here is that our data comes over the network from a data vendor, and we need this data feed to be flowing freely without any back-up. Our software may handle somewhere around 10,000 market ticks per second. If it takes too long to process those market events, the feed starts to get clogged and our ability to react to the market as rapidly as possible becomes compromised.

The consequence of this is that our code that handles new market events is extremely lean. An event updates a price and that's it. As for all of the other logic that needs to run for each event: that happens on a periodic basis, via a queue of all events that have yet to be examined by this logic.

This allows us to have one thread that is extremely responsive and does not get backed up with data, while another works through the queued events and performs more significant computations with them. Splitting the work into two parts in this way keeps everything running smoothly.

I admit this is only tangentially related to your question, but it seems to me the reasoning for not checking badge-related logic on every user action could very well be the same. You don't want to slow down every operation on the server by executing less-than-critical logic at the precise moment the operation takes place. The general strategy is to keep your fast operations fast (i.e., basically all user actions) and to delegate more time-consuming work to secondary processes that run, maybe often, but not for every such operation.
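
The split described above can be sketched in Python with a queue and a worker thread. This is an illustration, not the trading system's actual code: the names are hypothetical, and the worker here drains the queue continuously rather than on a timer.

```python
import queue
import threading

prices = {}               # latest price per symbol, updated on the hot path
pending = queue.Queue()   # events still waiting for the secondary logic
analytics_log = []        # stand-in for the output of the heavier work


def on_tick(symbol, price):
    """Hot path: record the price, hand the event off, and return immediately."""
    prices[symbol] = price
    pending.put((symbol, price))


def worker():
    """Secondary thread: take queued events and run the slower logic on them."""
    while True:
        event = pending.get()
        if event is None:            # sentinel value: stop the worker
            pending.task_done()
            break
        analytics_log.append(event)  # placeholder for analytics or safety triggers
        pending.task_done()
```

A caller would start `worker` in a `threading.Thread`, feed events through `on_tick`, and put `None` on the queue to shut it down. A badge check would sit where the placeholder is, off the request path.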

Dan Tao 2010-07-29 19:29:57

+1 absolutely agreed, and very good way of explaining the rationale, I think.

Gian 2010-07-29 19:35:56
