views:

392

answers:

1

Hi,

I have a query that I know can be done using a subselect, but due to large table sizes (100k+ rows per table) I would like to find an alternative using a join. This is not a homework question, but it's easier to share an example in such terms.

Suppose there are two tables:
Students
:id :name
1   Tom
2   Sally
3   Ben

Books
:id  :student_id  :book
1    1                  Math 101
2    1                  History
3    2                  NULL
4    3                  Math 101

I want to find all students who don't have a history book. Working subselect is:
select name from students where id not in (select student_id from books where book = 'History');

This returns Sally and Ben.

Thanks for your replies!

A: 

Is performance the problem? Or is this just some theoretical (homework?) question to avoid a subquery? If it's performance then this:

SELECT *
FROM studnets s
WHERE NOT EXISTS
(SELECT id FROM books WHERE student_id = s.id AND book = 'History')

will perform a lot better than the IN you're doing on MySQL (on some other databases, they will perform equivalently). This can also be rephrased as a join:

SELECT s.*
FROM studnets s
LEFT JOIN books b ON s.id = b.student_id AND b.book = 'History'
WHERE b.id IS NULL
cletus
Except that he's trying to get away from doing a subquery.
Michael Todd
Thanks for the reply cletus, it's working for me. The only modification I had to make from your suggested join was to change "JOIN" to "LEFT JOIN".
Sorry yes you're quite right. I'll fix it.
cletus