tags:

views:

89

answers:

6

Consider the following simplified example:

CREATE TABLE groups ( gid INTEGER PRIMARY KEY, name VARCHAR(100) );

CREATE TABLE people ( pid INTEGER PRIMARY KEY );

CREATE TABLE people_groups (
    gid INTEGER NOT NULL
        CONSTRAINT fk_people_groups_group
        REFERENCES groups(gid),
    pid INTEGER NOT NULL
        CONSTRAINT fk_people_groups_person
        REFERENCES people(pid),
    CONSTRAINT pk_people_groups PRIMARY KEY (gid, pid)
);

INSERT INTO people (pid) VALUES (1);
INSERT INTO groups (gid, name) VALUES (1, 'One');
INSERT INTO groups (gid, name) VALUES (2, 'Two');
INSERT INTO people_groups (gid, pid) VALUES (1,1);
INSERT INTO people_groups (gid, pid) VALUES (2,1);

SELECT gid, name FROM groups WHERE gid IN (
    SELECT gid FROM people_groups WHERE pid = 1
);

This outputs:

1|One
2|Two

What is the correct JOIN for that last SELECT?

+3  A: 
SELECT g.gid, g.name 
  FROM groups g INNER JOIN people_groups pg
       ON g.gid = pg.gid
 WHERE pg.pid = 1

Note: This is equivalent to your IN statement, because you specifically filter for only one person. If you filter for multiple persons, things are different. For example, assume that some person with pid=2 is also in group 1 and you do the following SELECT:

SELECT g.gid, g.name 
  FROM groups g INNER JOIN people_groups pg
       ON g.gid = pg.gid
 WHERE pg.pid IN (1, 2)

This would return group 1 twice (in contrast to your IN solution, which would return each group only once). To solve this, you need to add the DISTINCT keyword after the SELECT or add GROUP BY g.gid, g.name at the very end of the SQL. You should keep that in mind if you use this answer as a general rule to convert IN to JOIN.

Heinzi
Two people can't come up with such similar answers and both be wrong. :D
Jonathan Leffler
Thank you for the followup explanation.
Sinan Ünür
+1  A: 

How about:

SELECT g.gid, g.name
  FROM groups AS g JOIN people_groups AS p ON g.gid = p.gid
 WHERE p.pid = 1;
Jonathan Leffler
This makes intuitive sense to me but I am not very sure of my footing when it comes to SQL. Could you briefly explain why this is better than @Heinzi and @Oded's answers?
Sinan Ünür
@Sinan: It's equivalent. In SQL Server, the `AS` keyword in front of table aliases is optional, as well as the `INNER` keyword in front of the `JOIN`.
Heinzi
@Heinzi I see. Thank you very much.
Sinan Ünür
This answer is no better or worse than the others. All of the answers provided so far will work.
Cynthia
+3  A: 

This is the equivalent:

SELECT g.gid, g.name 
FROM groups g 
  INNER JOIN people_groups pg
  ON g.gid = pg.gid
WHERE pg.pid = 1
Oded
Or, even, three people can't come up with such similar answers and all be wrong. (See also comment to Heinzi's answer.)
Jonathan Leffler
A: 
SELECT g.gid, g.name FROM groups g, people_groups pg 
WHERE g.gid = pg.gid and pg.pid = 1
NimChimpsky
+2  A: 
SELECT gid, name FROM groups 
NATURAL JOIN people_groups 
WHERE pid = 1;
lollinus
+1  A: 

The answers given before are correct, but I'd like to note that you query will benefit from making the PRIMARY KEY on (pid, gid) (in this order).

This query:

SELECT  g.gid
FROM    people_groups pg
JOIN    groups g
ON      g.gid = pg.gid
WHERE   pd.pid = 1

will then be able to make people_groups be leading in the JOIN which most probably will make it much faster, since pid seems to be very selective in the link table.

Alternatively, you can create a secondary index:

CREATE UNIQUE INDEX ux_peoplegroup_p_g ON people_group (pid, gid)

, or just

CREATE INDEX ix_peoplegroup_p ON people_group (pid)

if people_groups is InnoDB.

Quassnoi
+1 Thank you for pointing out that the order of the columns in a composite primary key can matter. I did not know that at all.
Sinan Ünür
@Sinan: Actually, the `FOREIGN KEY` constraint (which implies `InnoDB`) will create the second index automatically. However, this second index will have `3` columns in the key part, not `2` as in the `PRIMARY KEY`, and hence will be a trifle less efficient *for this query*. If you were to query for `gid = 1` and join with people, this layout would be ideal.
Quassnoi
I am actually prototyping on SQLite right now. I don't know how it specifically behaves in this instance but it is good to know that the order can have an effect. Thank you.
Sinan Ünür