I have the following database tables with information about people, diseases, and drugs:

PERSON_T              DISEASE_T               DRUG_T
=========             ==========              ========
PERSON_ID             DISEASE_ID              DRUG_ID
GENDER                PERSON_ID               PERSON_ID
NAME                  DISEASE_START_DATE      DRUG_START_DATE
                      DISEASE_END_DATE        DRUG_END_DATE

I want to write a query which finds all people who had disease_id 52 but did not take drug_id 34. How do I do that? I tried the following in MySQL:

SELECT p.person_id, p.gender, p.name, disease_id, drug_id 
   FROM person_t as p 
   INNER JOIN disease_t on disease_t.person_id = p.person_id 
   RIGHT OUTER JOIN drug_t on drug_t.person_id = p.person_id 
   WHERE disease_id= 52 AND drug_id != 34;

This returns every record whose drug_id is not 34, rather than the people who never took drug_id 34. How would I go about writing the query I want?
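
To make the problem concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL. The sample rows and names (Ann, Bob) are invented for illustration, the date columns are omitted, and the RIGHT OUTER JOIN is written as a plain JOIN because the WHERE clause on disease_id discards the extra outer rows anyway. Ann took drugs 34 and 35, so she should be excluded, yet she still shows up through her drug 35 row:

```python
import sqlite3

# In-memory database with simplified versions of the three tables
# (assumption: date columns left out, sample data is hypothetical).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person_t  (person_id INTEGER PRIMARY KEY, gender TEXT, name TEXT);
    CREATE TABLE disease_t (disease_id INTEGER, person_id INTEGER);
    CREATE TABLE drug_t    (drug_id INTEGER, person_id INTEGER);
    INSERT INTO person_t  VALUES (1, 'F', 'Ann'), (2, 'M', 'Bob');
    INSERT INTO disease_t VALUES (52, 1), (52, 2);
    -- Ann took drugs 34 and 35; Bob took only drug 35.
    INSERT INTO drug_t    VALUES (34, 1), (35, 1), (35, 2);
""")

# The filter drug_id != 34 is applied per joined row, not per person.
rows = conn.execute("""
    SELECT p.name, dr.drug_id
      FROM person_t AS p
      JOIN disease_t AS d  ON d.person_id  = p.person_id
      JOIN drug_t    AS dr ON dr.person_id = p.person_id
     WHERE d.disease_id = 52 AND dr.drug_id != 34
     ORDER BY p.name
""").fetchall()

print(rows)  # [('Ann', 35), ('Bob', 35)]
```

Only Ann's drug 34 row is filtered out; her drug 35 row survives, so she remains in the result.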

+10  A: 

You can use NOT IN:

SELECT p.person_id, p.gender, p.name, d.disease_id
FROM person_t AS p
INNER JOIN disease_t AS d ON d.person_id = p.person_id
WHERE d.disease_id = 52
AND p.person_id NOT IN (SELECT person_id FROM drug_t WHERE drug_id = 34)
Mark Byers
+1, exactly what I was typing up!
KM
+1 Thank you. That works.
Jay Askren
+6  A: 

Depending on the optimiser NOT EXISTS may be more efficient than NOT IN. Try them both to see which one works best.

SELECT p.person_id, p.gender, p.name, d.disease_id
   FROM person_t AS p
   INNER JOIN disease_t AS d ON d.person_id = p.person_id
   WHERE d.disease_id = 52 AND NOT EXISTS (
       SELECT * FROM drug_t AS dr WHERE dr.person_id = p.person_id AND dr.drug_id = 34)
Adam Ruth
+1 for NOT EXISTS - This will almost always yield significantly better performance.
Tom
It would be OK to use NOT EXISTS in SQL Server but he appears to be using MySQL. According to this article: http://explainextended.com/2009/09/18/not-in-vs-not-exists-vs-left-join-is-null-mysql/ "That’s why the best way to search for missing values in MySQL is using a LEFT JOIN / IS NULL or NOT IN rather than NOT EXISTS." No harm in trying it anyway I guess... but probably it will be slower.
Mark Byers
Interesting, so the MySQL optimiser works the other way. MySQL is a strange animal sometimes. Thanks for the link.
Adam Ruth
@Adam Ruth: No, MySQL is just a strange bird.. an albatross, even...
OMG Ponies
I generally prefer to use "not exists" for future maintainability purposes: It's fairly obvious what you're up to. But when I've compared performance, I generally find that "left join / is null" is somewhat faster.
Jay
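
Performance aside, NOT IN and NOT EXISTS can also return different rows when the subquery column is nullable: if the NOT IN subquery yields a NULL, the comparison is never true for non-matching rows and they drop out. The sketch below illustrates this with Python's sqlite3 module standing in for MySQL (the NULL handling shown is standard SQL behaviour); the tables are trimmed down and the sample data is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person_t (person_id INTEGER PRIMARY KEY, name TEXT);
    -- Assumption for this demo: drug_t.person_id is nullable.
    CREATE TABLE drug_t   (drug_id INTEGER, person_id INTEGER);
    INSERT INTO person_t VALUES (1, 'Ann'), (2, 'Bob');
    -- One drug 34 row has no person attached.
    INSERT INTO drug_t   VALUES (34, 1), (34, NULL);
""")

# NOT IN: for Bob, "2 NOT IN (1, NULL)" evaluates to NULL, so he is dropped.
not_in = conn.execute("""
    SELECT name FROM person_t
     WHERE person_id NOT IN (SELECT person_id FROM drug_t WHERE drug_id = 34)
""").fetchall()

# NOT EXISTS: the NULL row simply never matches, so Bob is kept.
not_exists = conn.execute("""
    SELECT name FROM person_t AS p
     WHERE NOT EXISTS (SELECT 1 FROM drug_t AS dr
                        WHERE dr.person_id = p.person_id AND dr.drug_id = 34)
""").fetchall()

print(not_in)      # []
print(not_exists)  # [('Bob',)]
```

If person_id in drug_t is declared NOT NULL, the two forms return the same rows and the choice comes down to performance and readability.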
+7  A: 

For those who don't want to use a subquery:

   SELECT p.person_id, p.gender, p.name, disease_id
     FROM PERSON_T p 
     JOIN DISEASE_T d ON d.person_id = p.person_id 
LEFT JOIN DRUG_T dt ON dt.person_id = p.person_id
                   AND dt.drug_id = 34
    WHERE disease_id = 52
      AND dt.person_id IS NULL
OMG Ponies
+1 That would work on older pre-subquery MySQL versions.
Adam Ruth
@Adam Ruth: In MySQL, [LEFT JOIN/IS NULL is the most efficient/performant against columns that aren't nullable](http://explainextended.com/2009/09/18/not-in-vs-not-exists-vs-left-join-is-null-mysql/), but it's the [opposite for nullable columns](http://explainextended.com/2010/05/27/left-join-is-null-vs-not-in-vs-not-exists-nullable-columns/)
OMG Ponies
+1 Thank you, I thought it could be done with an outer join.
Jay Askren
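
As a quick sanity check, the three approaches from the answers above (NOT IN, NOT EXISTS, LEFT JOIN / IS NULL) can be run side by side on a small data set. This is only a sketch: it uses Python's sqlite3 module rather than MySQL, leaves out the date columns, and the people (Ann, Bob, Cara, Dan) are made up. It says nothing about relative speed, only that the results agree when person_id is never NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person_t  (person_id INTEGER PRIMARY KEY, gender TEXT, name TEXT);
    CREATE TABLE disease_t (disease_id INTEGER, person_id INTEGER NOT NULL);
    CREATE TABLE drug_t    (drug_id INTEGER, person_id INTEGER NOT NULL);
    INSERT INTO person_t  VALUES (1, 'F', 'Ann'), (2, 'M', 'Bob'),
                                 (3, 'F', 'Cara'), (4, 'M', 'Dan');
    -- Ann, Bob and Cara had disease 52; Dan had a different disease.
    INSERT INTO disease_t VALUES (52, 1), (52, 2), (52, 3), (7, 4);
    -- Ann took drugs 34 and 35, Bob only 35, Cara and Dan took nothing.
    INSERT INTO drug_t    VALUES (34, 1), (35, 1), (35, 2);
""")

queries = {
    "NOT IN": """
        SELECT p.person_id, p.name
          FROM person_t AS p
          JOIN disease_t AS d ON d.person_id = p.person_id
         WHERE d.disease_id = 52
           AND p.person_id NOT IN (SELECT person_id FROM drug_t WHERE drug_id = 34)
         ORDER BY p.person_id""",
    "NOT EXISTS": """
        SELECT p.person_id, p.name
          FROM person_t AS p
          JOIN disease_t AS d ON d.person_id = p.person_id
         WHERE d.disease_id = 52
           AND NOT EXISTS (SELECT 1 FROM drug_t AS dr
                            WHERE dr.person_id = p.person_id AND dr.drug_id = 34)
         ORDER BY p.person_id""",
    "LEFT JOIN / IS NULL": """
        SELECT p.person_id, p.name
          FROM person_t AS p
          JOIN disease_t AS d ON d.person_id = p.person_id
          LEFT JOIN drug_t AS dt ON dt.person_id = p.person_id
                                AND dt.drug_id = 34
         WHERE d.disease_id = 52
           AND dt.person_id IS NULL
         ORDER BY p.person_id""",
}

# Run each variant and collect its rows under its label.
results = {label: conn.execute(sql).fetchall() for label, sql in queries.items()}
for label, rows in results.items():
    print(label, rows)  # each prints [(2, 'Bob'), (3, 'Cara')]
```

Ann is excluded because she took drug 34, Dan because he never had disease 52, and Cara is kept even though she has no rows in drug_t at all.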