I am currently working with User objects, each of which has many Goal objects. The Goal objects are not User-specific; that is, Users can share the same Goal. I am trying to find a way to calculate a "similarity percentage" between two Users (i.e., one that takes into account how many Goals they share as well as how many they do not share). Does anyone have experience with this type of situation? I am using Grails with MySQL, if that is helpful.

Thanks

+6  A: 

The standard way to do this is the Jaccard similarity. If A is the set of goals of the first user and B is the set of goals of the second user, the Jaccard similarity is:

#(A intersect B)/#(A union B)

This is the number of goals they share divided by the total number of distinct goals the two have together (goals they share are counted only once). So if the first user has goals A={1,2,3} and the second user has goals B={2,4}, the calculation is:

A intersect B = {2}
A union B = {1,2,3,4}

#(A intersect B)/#(A union B) = 1/4

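The Jaccard similarity is always between 0 (they share no goals) and 1 (they have exactly the same goals), so you can get a percentage by multiplying it by 100. If neither user has any goals, the union is empty and the ratio is undefined, so you need to pick a convention for that case.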
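
Since Grails runs on the JVM, here is a minimal sketch of the calculation in plain Java, which you should also be able to call from Groovy. The class name `JaccardSimilarity` and the method `similarity` are made up for illustration, and it assumes you have already loaded each user's goals (or goal ids) into a `Set`. Returning 0 when both sets are empty is my own assumption; choose whatever convention suits your application.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not part of Grails or any library.
final class JaccardSimilarity {

    // Returns #(A intersect B) / #(A union B), a value between 0 and 1.
    static <T> double similarity(Set<T> a, Set<T> b) {
        if (a.isEmpty() && b.isEmpty()) {
            // Assumption: two users without any goals count as 0% similar.
            return 0.0;
        }
        // Copy A so the caller's set is not modified, then keep only shared goals.
        Set<T> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        // #(A union B) = #A + #B - #(A intersect B)
        int unionSize = a.size() + b.size() - intersection.size();
        return (double) intersection.size() / unionSize;
    }
}
```

With the example above, calling `JaccardSimilarity.similarity` on sets holding {1,2,3} and {2,4} gives 0.25, i.e. 25% after multiplying by 100. The elements need sensible `equals`/`hashCode` implementations, so comparing goal ids is probably simpler than comparing domain objects.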

http://en.wikipedia.org/wiki/Jaccard_index

Jules
Worked perfectly, thank you very much
UltraVi01