ansaurus

Question

Collaborative Filtering: Non-Personalized item-to-item similarity

Answer 1

+1 A:

There's a good O'Reilly book on this topic. While the whitepaper might lay the logic out in pseudo-code like that, I don't think that approach would scale very well. The calculations are all probability calculations, so things like Bayes' Theorem get used to answer, "Given that Person A purchased X, what's the likelihood they also purchased Z?" Looping straightforwardly over the data works too hard: you have to go through all of it for each person.

Tom 2010-03-05 22:32:00
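
For reference, the Bayes' Theorem relationship mentioned in the answer can be written out for two items X and Z. The count notation N(·) and the total number of purchasers N are assumptions introduced here for illustration; they do not come from the whitepaper or the book.

```latex
P(Z \mid X) = \frac{P(X \mid Z)\, P(Z)}{P(X)}
            = \frac{P(X \cap Z)}{P(X)}
            \approx \frac{N(X, Z) / N}{N(X) / N}
            = \frac{N(X, Z)}{N(X)}
```

Here N(X, Z) is the number of purchasers who bought both items and N(X) is the number who bought X, so the estimate needs only co-purchase counts rather than a separate pass over the data for each person.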

I have the book, but its examples all consider something that's rated: in the book's case, movies and critics (at least in the chapter on similarity). For instance, "Given my ratings, show me other [movies|critics] that I would like." My data is just purchases; I can derive not-purchased if I have to. I don't mind using Bayes, but I'm not looking for the likelihood that user A will purchase X. I'm more interested in showing that purchasers of A also bought Z. Disclaimer: I might not understand this as well as I think I do with respect to wanting item-item vs. user-item.

Neil Kodner 2010-03-05 22:43:06

@Neil, use no-purchase/purchase as ratings of 0 and 1. That isn't maximally efficient computationally, but conceptually it shows that if you know how to deal with ratings, then **of course** you know how to deal with just purchases as well! And it _does_ have to use Bayes (or an approximation thereof) to make any sense; otherwise you can't cut off the number of items to show on the "also purchased" list. With a vast catalog and many users you would end up with millions of items there (i.e., totally useless), and the list would be too long even for a very modest e-commerce site.

Alex Martelli 2010-03-06 03:14:17
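
A minimal sketch of the idea in the comment above, assuming purchases arrive as a mapping from user to the set of items bought (ratings of 1, with everything else implicitly 0). The function name `also_bought`, the `top_n` and `min_support` parameters, and the sample data are all hypothetical. The score is the plain conditional-probability estimate "share of buyers of A who also bought Z", used here as a simple stand-in for a fuller Bayesian treatment, and the cutoff is what keeps the "also purchased" list short.

```python
from collections import Counter, defaultdict
from itertools import combinations


def also_bought(purchases, top_n=3, min_support=2):
    """Rank co-purchased items for every item.

    purchases: dict mapping user id -> set of purchased item ids
               (hypothetical input shape, not from the original thread).
    Returns a dict mapping item -> list of (other_item, score), where
    score estimates P(other_item bought | item bought).
    """
    item_counts = Counter()  # how many users bought each item
    pair_counts = Counter()  # how many users bought both items of a pair

    # One pass over the users: count items and unordered item pairs.
    for items in purchases.values():
        unique = sorted(set(items))
        item_counts.update(unique)
        pair_counts.update(combinations(unique, 2))

    scores = defaultdict(list)
    for (a, b), n_ab in pair_counts.items():
        if n_ab < min_support:
            continue  # too few co-purchases to trust the estimate
        scores[a].append((b, n_ab / item_counts[a]))
        scores[b].append((a, n_ab / item_counts[b]))

    # Keep only the top_n candidates per item (ties broken by item id).
    return {
        item: sorted(cands, key=lambda t: (-t[1], t[0]))[:top_n]
        for item, cands in scores.items()
    }


# Made-up sample data for illustration only.
purchases = {
    "u1": {"A", "Z", "X"},
    "u2": {"A", "Z"},
    "u3": {"A", "X"},
    "u4": {"Z", "Y"},
}
recs = also_bought(purchases)
print(recs["A"])
```

Counting pairs once per basket avoids rescanning the whole data set for each person. Note that a raw conditional estimate like this tends to favor globally popular items, so a real system would probably normalize or smooth the score.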
