The book Programming Collective Intelligence presents a technique for computing similar links/users based on the distance between the links/users in a huge metric space (user x bookmarked this link / link x was bookmarked by this user).

What other techniques have been developed for recommendation engines?
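
For reference, the distance-based idea described above can be sketched roughly as follows. This is only an illustration, not the book's code: the `bookmarks` data and the function names are made up, and each user is treated as a 0/1 vector over links, compared by plain Euclidean distance.

```python
from math import sqrt

# Hypothetical data (an assumption for illustration): user -> set of bookmarked links.
bookmarks = {
    "alice": {"a.com", "b.com", "c.com"},
    "bob": {"a.com", "b.com"},
    "carol": {"d.com"},
}

def distance(u, v, data):
    """Euclidean distance between two users' 0/1 bookmark vectors."""
    # A link bookmarked by exactly one of the two users contributes 1 to the
    # squared distance; links shared by both (or by neither) contribute 0.
    return sqrt(len(data[u] ^ data[v]))

def most_similar(user, data, n=2):
    """Return the n users closest to `user`, nearest first."""
    others = [(distance(user, o, data), o) for o in data if o != user]
    return [o for _, o in sorted(others)[:n]]
```

Calling `most_similar("alice", bookmarks, n=1)` picks the user whose bookmark set differs least from Alice's. The same functions work for link-to-link similarity if the dictionary is inverted to map each link to the set of users who bookmarked it.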


Programming Collective Intelligence: Building Smart Web 2.0 Applications by Toby Segaran is an excellent book. The examples are in Python, but they are well described and easy to follow. The first few chapters cover a recommendation engine, and there is a discussion of how sites like Amazon do their recommendations.

Thanks. I already have this book. I am now looking for something more advanced and in depth.
+2  A: 

You may want to research what people have been doing to work towards the Netflix Prize.

+6  A: 

A couple of suggestions:

1) Dig through the source and examples for a couple of recommender systems. You might start with Taste and/or Consensus depending on your language preferences. Try to adapt them to your dataset / domain.

2) Dan Lemire's blog is a great resource

3) The papers that come out of the ACM Recommender System conference may be of interest

4) Perhaps less directly related, you may find the Stats202 course on Google Video of use. It was taught by a guy at Google and mirrored the course he was teaching at Stanford at the time. He dives into related subjects such as co-occurrence. It is heavily biased towards R, but the concepts are broadly applicable.

5) You might also be interested in Stanford's CS 229 Machine Learning course available online in various formats.

Ryan Cox
+5  A: 

Singular value decomposition has been widely applied as well as various flavours of non-negative matrix factorization. You might also look at latent Dirichlet allocation and other similar probabilistic topic models, for trying to infer topical structure across documents as a way of recommending similar stuff. One of the surprisingly strong (to some) contenders in the Netflix Prize competition was an application of restricted Boltzmann machines.
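
To make the factorization idea concrete, here is a minimal sketch. Note that it is not a true SVD: it is the gradient-descent matrix factorization that is often loosely called "SVD" in recommender discussions, fitted only on the observed ratings. The `ratings` data, the function names, and the hyperparameters (`k`, `lr`, `reg`, `epochs`) are all assumptions chosen for illustration.

```python
import random

# Hypothetical observed ratings: (user, item) -> rating. Missing pairs are unknown.
ratings = {
    (0, 0): 5, (0, 1): 4, (0, 3): 1,
    (1, 0): 4, (1, 1): 5, (1, 2): 1,
    (2, 0): 1, (2, 2): 5, (2, 3): 4,
    (3, 1): 1, (3, 2): 4, (3, 3): 5,
}

def predict(p, q, user, item):
    """Dot product of the user's and the item's latent factor vectors."""
    return sum(pu * qi for pu, qi in zip(p[user], q[item]))

def rmse(p, q, data):
    """Root mean squared error over the observed ratings only."""
    total = sum((r - predict(p, q, u, i)) ** 2 for (u, i), r in data.items())
    return (total / len(data)) ** 0.5

def factorize(data, n_users, n_items, k=2, lr=0.01, reg=0.02, epochs=2000, seed=0):
    """Fit user factors p and item factors q by stochastic gradient descent."""
    rng = random.Random(seed)
    # Small random starting values; the seed makes runs repeatable.
    p = [[rng.uniform(0.0, 0.1) for _ in range(k)] for _ in range(n_users)]
    q = [[rng.uniform(0.0, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for (u, i), r in data.items():
            err = r - predict(p, q, u, i)
            for f in range(k):
                pu, qi = p[u][f], q[i][f]
                # Gradient step with L2 regularization on both factors.
                p[u][f] += lr * (err * qi - reg * pu)
                q[i][f] += lr * (err * pu - reg * qi)
    return p, q
```

After `p, q = factorize(ratings, 4, 4)`, a call such as `predict(p, q, 0, 2)` estimates a rating the user never gave. Real systems typically add per-user and per-item bias terms and tune the hyperparameters on held-out data.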

The team that won the progress prize for Netflix (which was a competition to "build a better movie recommendation system", essentially) used all of these methods, carefully weighted together. They published a document detailing their solution and all the constituent methods, for your reading pleasure.

+1  A: 

A good place to look is the Netflix Prize. The leaderboard gives links to the sites of the current leaders, many of whom provide very detailed descriptions of their methods. The forum is also a fantastic resource.

+1  A: 

In case anyone is interested, I ported Taste to C# about a year ago (version 1.6). Most of the ported unit tests also run.


Have a look at Kemeny rankings and if you like that approach, ask me in April for a copy of my thesis.

+1  A: 

(I am the author of Taste, now the Apache Mahout collaborative filtering engine.) Most of what you will read about recommender systems focuses on the 'canonical' algorithms: user-based and item-based recommenders. This is fine, but I wanted to point out to anyone who is interested that they should also look at "slope one" recommenders. I think it should be considered among the first, most established techniques to investigate. It has some better performance properties.
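
As a rough illustration of the slope one idea, here is a minimal sketch of the weighted variant. It is not Taste or Mahout code: the `ratings` data and the function names are hypothetical. The scheme precomputes the average rating difference between each pair of items, then predicts a missing rating by shifting the user's known ratings by those differences, weighted by how many users rated both items.

```python
from collections import defaultdict

# Hypothetical ratings (an assumption for illustration): user -> {item: rating}.
ratings = {
    "john": {"a": 5, "b": 3, "c": 2},
    "mark": {"a": 3, "b": 4},
    "lucy": {"b": 2, "c": 5},
}

def build_deviations(data):
    """Return (devs, counts): devs[i][j] is the mean of (r_i - r_j) over the
    users who rated both items, and counts[i][j] is the number of such users."""
    devs = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(lambda: defaultdict(int))
    for user_ratings in data.values():
        for i, ri in user_ratings.items():
            for j, rj in user_ratings.items():
                if i != j:
                    devs[i][j] += ri - rj
                    counts[i][j] += 1
    # Turn the accumulated sums into averages.
    for i in devs:
        for j in devs[i]:
            devs[i][j] /= counts[i][j]
    return devs, counts

def predict(user_ratings, item, devs, counts):
    """Weighted slope one estimate of `item`, or None if nothing overlaps."""
    num = den = 0.0
    for j, rj in user_ratings.items():
        c = counts.get(item, {}).get(j, 0)
        if j != item and c > 0:
            num += (devs[item][j] + rj) * c
            den += c
    return num / den if den else None
```

With `devs, counts = build_deviations(ratings)`, calling `predict(ratings["lucy"], "a", devs, counts)` estimates Lucy's rating for item "a" from her ratings of "b" and "c". The deviation table can be built once and updated incrementally as new ratings arrive, which is part of the method's appeal.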

Sean Owen

Collective Intelligence was a mainstream personalization book. I'd suggest looking into Weka (Google it :-)); it has a bunch of algorithms you can use to find similarity between different data sets.

I'd get onto Google Scholar and check out personalization and recsys.