views:

40

answers:

2

I have 2 tables, 'interests' and 'users_interests'.

'users_interests' just has userid and interestid fields. 'interests just has an id and a name.

I simply need to find userid's who have more than 3 interest ID's in common. I have been told that a Self Join is involved, but I cannot seem to get this to work.

Someone said something like this could work:

SELECT 
      others.userid 
  FROM interests AS user 
  JOIN interests AS others 
      USING(interestid) 
  WHERE user.userid = 2 
  GROUP BY 
      others.userid 
  ORDER BY COUNT(*) DESC

But I'm having no luck with it.

+2  A: 

You said more than 3 interest IDs in common, so you mean “at least 4”, right?

SELECT first1.userid, second1.userid
FROM users_interests first1, users_interests second1,
     users_interests first2, users_interests second2,
     users_interests first3, users_interests second3,
     users_interests first4, users_interests second4
WHERE
    first2.userid=first1.userid AND first3.userid=first1.userid AND first4.userid=first1.userid AND
    second2.userid=second1.userid AND second3.userid=second1.userid AND second4.userid=second1.userid AND
    first1.userid<>second1.userid AND
    first1.interestid=second1.interestid AND
    first2.interestid=second2.interestid AND first2.interestid<>first1.interestid AND
    first3.interestid=second3.interestid AND first3.interestid<>first2.interestid AND first3.interestid<>first1.interestid AND
    first4.interestid=second4.interestid AND first4.interestid<>first3.interestid AND first4.interestid<>first2.interestid AND first4.interestid<>first1.interestid

Since I haven’t tested this, please remember that there may be errors in it, so only use it if you understand it.

If you need the same for other numbers of interests in common, I’m sure you can write code to dynamically generate this query for any number. Also, if you need the interest names, I’m sure you’ll be able to add the necessary four joins to the interests table and add the relevant columns to the SELECT clause.

Timwi
+3  A: 
SELECT ui.userid, COUNT(*) AS common_interests
FROM users_interests ui
WHERE ui.interestid IN (
    SELECT ui2.interestid FROM users_interests ui2 WHERE ui2.userid = 2
) 
AND ui.userid <> 2
GROUP BY ui.userid
HAVING common_interests > 3;

Note the occurence of the userid we're basing our search on (2) on two places in the code

Simen Echholt
Amazing! Thank you for this, this works beautifully!
Ryan