Hello, I am wondering which algorithm would be a good fit for a tag-driven e-commerce environment:

  • Each item has several tags, e.g.:

    Item name: "Metallica - Black Album CD", Tags: "metallica", "black-album", "rock", "music"

  • Each user has several tags and friends (other users) bound to them, e.g.:

    Username: "testguy", Interests: "python", "rock", "metal", "computer-science" Friends: "testguy2", "testguy3"

I need to generate recommendations for such users from their interest tags, and to do so in a reasonably sophisticated way.

Ideas:

  • A hybrid recommendation algorithm could be used, since each user has friends (a mixture of collaborative and content-based recommendations).
  • Maybe similar users (peers) can be found from the user tags and used to generate recommendations.

  • Maybe user tags can be matched directly against item tags.

Any suggestion is welcome. Any Python-based library is also welcome, as I will be building this experimental engine in Python.

+7  A: 

1) Weight your tags.

Tags fall into several groups of interest:

  • My tags that none of my friends share
  • Tags a number of my friends share, but I don't
  • My tags that are shared by a number of my friends.

(Sometimes you may want to consider friend-of-a-friend tags too, but in my experience the effort hasn't been worth it. YMMV.)

Identify all tags that the person and/or the person's friends list as interests, and attach a weight to each tag for this individual. One simple formula for the tag weight is:

(tag_is_in_my_list) * 2 + (friends_with_tag)/(number_of_friends)

Note the magic number 2, which makes your own opinion worth twice as much as that of all of your friends put together. Feel free to tweak :-)
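As a rough illustration, the formula could be written in Python along these lines. This is only a sketch: the function name `tag_weight`, the `own_weight` parameter and the friends' tags in the example are made up for illustration, and a user with no friends is assumed to get a friend share of 0.

```python
def tag_weight(tag, my_tags, friends_tags, own_weight=2.0):
    """Weight of one tag for one user.

    my_tags      -- set of the user's own interest tags
    friends_tags -- list of tag sets, one per friend
    own_weight   -- the "magic number" (2 in the formula above)
    """
    tag_is_in_my_list = 1 if tag in my_tags else 0
    if friends_tags:
        friends_with_tag = sum(1 for tags in friends_tags if tag in tags)
        friend_share = friends_with_tag / len(friends_tags)
    else:
        # Assumption: no friends means no friend contribution.
        friend_share = 0.0
    return tag_is_in_my_list * own_weight + friend_share


# Example data: the user's tags come from the question, the friends' tags are invented.
my_tags = {"python", "rock", "metal", "computer-science"}
friends_tags = [{"rock", "jazz"}, {"jazz", "music"}]

rock_weight = tag_weight("rock", my_tags, friends_tags)    # own tag, 1 of 2 friends
jazz_weight = tag_weight("jazz", my_tags, friends_tags)    # not own tag, 2 of 2 friends
metal_weight = tag_weight("metal", my_tags, friends_tags)  # own tag, no friends
```

Because the friend share never exceeds 1, a tag that only friends have can never outweigh a tag the user listed personally while `own_weight` stays above 1.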

2) Weight your items

For each item that has any of the tags in your list, just add up all of the weighted values of the tags. A higher value = more interest.
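A sketch of this step in Python, assuming the tag weights have already been computed and stored in a dict (the name `item_score` and the example weights are invented for illustration):

```python
def item_score(item_tags, tag_weights):
    """Sum the weights of an item's tags; tags the user has no weight for count as 0."""
    return sum(tag_weights.get(tag, 0.0) for tag in item_tags)


# Hypothetical per-user tag weights.
tag_weights = {"rock": 2.5, "metal": 2.0, "music": 0.5}

# Item from the question.
black_album = ["metallica", "black-album", "rock", "music"]
black_album_score = item_score(black_album, tag_weights)  # 2.5 + 0.5
```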

3) Apply a threshold.

The simplest way is to show the user the top n results.
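For example, with item scores held in a dict, the top n can be picked with the standard library's `heapq.nlargest`. This is a sketch only: the name `top_n` and the sample scores are made up, and dropping items with a score of 0 is an assumption on my part.

```python
import heapq


def top_n(scores, n=10):
    """Return up to n (item, score) pairs, highest score first."""
    # Assumption: items that matched none of the user's tags are not worth showing.
    candidates = [(item, score) for item, score in scores.items() if score > 0]
    return heapq.nlargest(n, candidates, key=lambda pair: pair[1])


# Hypothetical item scores.
scores = {"Item A": 3.0, "Item B": 0.0, "Item C": 1.5, "Item D": 2.0}
best_two = top_n(scores, n=2)
```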

More sophisticated systems also apply anti-tags (i.e. topics of non-interest) and do many other things, but I have found this simple formula effective and quick.

whybird
Thanks a lot for your kind answer! I am a bit confused about the formula: what exactly does tag_is_in_my_list mean? Is it some kind of binary value, 1 or 0? Cheers
Hellnar
Yes, a binary 1 or 0.
whybird
+2  A: 

If you can, track down a copy of O'Reilly's Programming Collective Intelligence, by Toby Segaran. There's a model solution in it for exactly this problem (with a whole bunch of really, really good other stuff).

regularfry
A: 

Your problem is similar to product recommendation engines, such as Amazon's well-publicized site. These use a learning technique called association rules, which basically estimates the conditional probability of user X buying product Y based on the features Z that the user and the product have in common. Several open source toolkits implement association rules, such as Orange and Weka.
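To make the conditional probability idea concrete, here is a small plain-Python sketch of the confidence of a single rule "antecedent => consequent". It does not use the Orange or Weka APIs, and the function name `rule_confidence` and the sample data are invented for illustration.

```python
def rule_confidence(baskets, antecedent, consequent):
    """Estimate P(consequent | antecedent) from a list of tag/item sets."""
    antecedent = set(antecedent)
    consequent = set(consequent)
    # Baskets that contain everything in the antecedent.
    with_antecedent = [b for b in baskets if antecedent <= set(b)]
    if not with_antecedent:
        return 0.0
    # Of those, how many also contain everything in the consequent.
    with_both = sum(1 for b in with_antecedent if consequent <= set(b))
    return with_both / len(with_antecedent)


# Hypothetical data: each set mixes a user's interests with what they bought.
baskets = [
    {"rock", "metal", "cd"},
    {"rock", "cd"},
    {"rock", "python"},
    {"jazz", "cd"},
]
rock_to_cd = rule_confidence(baskets, {"rock"}, {"cd"})  # 2 of the 3 "rock" baskets
```

A real toolkit would also filter rules by support, so that a rule seen in only a handful of baskets is not trusted too much.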

Chris S
A: 

You can use the Python Semantic module for Drools to specify your rules in Python. Drools is a terrific rules engine that we have used to build several recommendation engines.

CodeToGlory