views:

266

answers:

5

Hello,

this is my sql problem - there are 3 tables:

Names         Lists                ListHasNames
Id Name       Id Desc              ListsId  NamesId
=--------     ------------         ----------------
1  Paul       1  Football          1        1
2  Joe        2  Basketball        1        2
3  Jenny      3  Ping Pong         2        1
4  Tina       4  Breakfast Club    2        3
              5  Midnight Club     3        2
                                   3        3
                                   4        1
                                   4        2
                                   4        3
                                   5        1
                                   5        2
                                   5        3
                                   5        4

Which means that Paul (Id=1) and Joe (Id=2) are in the Football team (Lists.Id=1), Paul and Jenny in the Basketball team, etc...

Now I need a SQL statement which returns the Lists.Id of a specific Name combination: In which lists are Paul, Joe and Jenny the only members of that list ? Answer only Lists.Id=4 (Breakfast Club) - but not 5 (Midnight Club) because Tina is in that list, too.

I've tried it with INNER JOINS and SUB QUERIES:

SELECT Q1.Lists_id FROM

(
SELECT Lists_Id FROM
  names as T1,
  listhasnames as T2
WHERE
  (T1.Name='Paul') and
  (T1.Id=T2.Names_ID) and
   ( (
     SELECT count(*) FROM
      listhasnames as Z1
     where (Z1.lists_id = T2.lists_Id)
    ) = 3)

) AS Q1

INNER JOIN (


SELECT Lists_Id FROM
  names as T1,
  listhasnames as T2
WHERE
  (T1.Name='Joe') and
  (T1.Id=T2.Names_ID) and
  (
    (SELECT count(*) FROM
      listhasnames as Z1
     WHERE (Z1.Lists_id = T2.lists_id)
    ) = 3)

) AS Q2

ON (Q1.Lists_id=Q2.Lists_id)



INNER JOIN (


SELECT Lists_Id FROM
  names as T1,
  listhasnames as T2
WHERE
  (T1.Name='Jenny') and
  (T1.Id=T2.Names_ID) and
  (
    (SELECT count(*) FROM
      listhasnames as Z1
     WHERE (Z1.Lists_id = T2.lists_id)
    ) = 3)

) AS Q3

ON (Q1.Lists_id=Q3.Lists_id)

Looks a little bit complicated, uh? How to optimize that? I need only that Lists.Id in which specific names are in (and only these names and nobody else). Maybe with SELECT IN?

Regards, Dennis

+1  A: 

Updated:

select a.ListsId from
(
    --lists with three names only
    select lhn.ListsId, count(*) as count
    from ListHasNames  lhn
    inner join Names n on lhn.NamesId = n.Id 
    group by lhn.ListsId
    having count(*) = 3
) a
where a.ListsId in (select ListsId from ListHasNames lhn where NamesId = (select NamesId from names where Name = 'Paul'))
and a.ListsId in (select ListsId from ListHasNames lhn where NamesId = (select NamesId from names where Name = 'Joe'))
and a.ListsId in (select ListsId from ListHasNames lhn where NamesId = (select NamesId from names where Name = 'Jenny'))
RedFilter
Maybe you should tell him that Count = 3 is because of his request with 3 names
Scoregraphic
Since the list of names is fixed and "Count" will always be 3, I would remove the last two columns from the select list and do "HAVING COUNT(*) = 3" instead.
Tomalak
+1  A: 
SELECT ListsId
FROM ListHasNames a
WHERE NamesId in (1, 2, 3)
AND NOT EXISTS
(SELECT * from ListHasNames b 
WHERE b.ListsId = a.ListsId 
AND b.NamesId not in (1, 2, 3))
GROUP BY ListsId
HAVING COUNT(*) = 3;

Edit: Corrected thanks to Chris Gow's comment; the subselect is necessary to exclude lists that have other people on them. Edit 2 Corrected the table name thanks to Dennis' comment

Carl Manaster
This doesn't seem to work in postgreSQL. I tried it and I get both 4 and 5 back. The OP only wants list ids returned that contain only those three people and no one else
Chris Gow
You're right - it wouldn't work in any DBMS; I've added a subselect to correct the problem. Thanks.
Carl Manaster
(SELECT * from ListsId bshould read (SELECT * from ListHasNames bbut i got it anyway! ;-)
Thanks for the correction!
Carl Manaster
A: 

I was just solving a problem recently that may work well for your case as well. It may be overkill.

I took the approach of creating a list of candidate associations that may be the correct solution, and then using a cursor or queue table to go through the likely correct solutions to do full validation.

In my case this was implemented by doing like

select
ParentId
count(*) as ChildCount
checksum_agg(checksum(child.*) as ChildAggCrc
from parent join child on parent.parentId = child.parentId

Then you can compare the count and aggregate checksum against your lookup data (i.e. your 3 names to check for). If no rows match, you are guaranteed to have no matches. If any row matches you can then go through and do a join of that specific ParentId to validate if there are any discrepancies between the row sets.

Clear as mud? :)

+1  A: 

Using Carl Manaster's solution as a starting point I came up with:

SELECT listsid 
FROM listhasnames 
GROUP BY listsid HAVING COUNT(*) = 3
INTERSECT
SELECT x.listsid 
FROM listhasnames x, names n 
WHERE n.name IN('Paul', 'Joe', 'Jenny') 
AND n.id = x.namesid
Chris Gow
That's another nice way to exclude the lists with extra individuals.
Carl Manaster
Never heard of INTERSECT, looks like its new to MSSQL2005. However, doesn't this query actually return incorrect results? The first query returns all lists having 3 names and the second query returns all lists that contain either Paul, Joe, or Jenny. If there was a list that contained a total of 3 members but only 1 or 2 of those members is Paul, Joe, or Jenny, it would be returned by this query which is incorrect according to the original question.
David Archer
A: 

you are all great! thanks for your effort! chris, that's very nice, too! unfortunately my dbms (mysql) doesn't support intersect. Regards, Dennis