Short version:
I have a similar setup to StackOverflow. Users get Achievements. I have many more achievements than SO, lets say on the order of 10k, and each user has in the 100s of achievements. Now, how would you recommend (to recommend) the next achievement for a user to try for?
Long version:
The objects are modeled like this in django (showing only important parts) :
class User(models.Model):
alias = models.ForeignKey(Alias)
class Alias(models.Model):
achievements = models.ManyToManyField('Achievement', through='Achiever')
class Achievement(models.Model):
points = models.IntegerField()
class Achiever(models.Model):
achievement = models.ForeignKey(Achievement)
alias = models.ForeignKey(Alias)
count = models.IntegerField(default=1)
and my algorithm is just to find every other user that has a shared achievement with the logged in user, and then go through all their achievements and sort by number of occurrences :
def recommended(request) :
user = request.user.get_profile()
// The final response
r = {}
// Get all the achievements the user's aliases have received
// in a set so they aren't double counted
achievements = set()
for alias in user.alias_set.select_related('achievements').all() :
achievements.update(alias.achievements.all())
// Find all other aliases that have gotten at least one of the same
// same achievements as the user
otherAliases = set()
for ach in achievements :
otherAliases.update(ach.alias_set.all())
// Find other achievements the other users have gotten in addition to
// the shared ones.
// And count the number of times each achievement appears
for otherAlias in otherAliases :
for otherAch in otherAlias.achievements.all() :
r[otherAch] = r.get(otherAch, 0) + 1
// Remove all the achievements that the user has already gotten
for ach in achievements :
r.pop(ach)
// Sort by number of times the achievements have been received
r = sorted(r.items(), lambda x, y: cmp(x[1], y[1]), reverse=True)
// Put in the template for showing on the screen
template_values = {}
template_values['achievements'] = r
But it takes FOREVER to run, and always returns the whole list, which is unneeded. A user would only need the top few achievements to go after.
So, I'm welcome to recommendations on other algorithms and/or code improvements. I'll give you an achievement in my system for coming up with the recommendation algorithm :)