views:

304

answers:

4

Note: These are not homework problems. I am studying dbms on my own, hence these homework-like questions.

Two tables :

Teachers (teacher_id, teacher_name)
Courses (teacher_id,course_id, course_name)

In order to select teacher names who are not teaching any courses, there are two queries I can think of :

mysql> explain select teacher_name from teachers left join courses on (teachers.
teacher_id = courses.teacher_id) where course_id is null;

mysql> explain select teacher_name from teachers where teacher_id not in (select
 teacher_id from courses);

Which one will be more efficient? Why?

A: 

I would recommend a 3rd option - and that is to add a distinct clause to your 2nd option and give that a go.

I would vote for the 2nd option as being more efficient (especially with the distinct clause added). You have hopefully added an index on teacher_id in both tables.

Edit:
I just noticed it's a trick question - there is no teacher_id field in Courses.

If more than one teacher can teach a course - add a table Teacher_Course (I always avoid plurals in object names - just a practice). In this new table store the TeacherCourseId (system generated id), course_id, and the teacher_id, and you can allow more than one teacher per class.

Jeff Olson
+2  A: 

A join will generally be faster than a subselect.

http://www.codersrevolution.com/index.cfm/2008/7/31/MySQL-performance-INNER-JOIN-vs-subselect

ceejayoz
A: 

I think that subqueries (second option in your case) could run faster because they limit the amount of rows returned and also the amount of columns returned. The first option of left outer join can be slower and more memory intensive. But again, it can depend on several other factors like the number of rows returned, indexes on columns etc.

CodeToGlory
A: 

Which one will be more efficient?

As always: it depends on what's in the tables, and what's being indexed.

Usually and by default, the join is preferable to the subquery. It looks at each teacher, then goes directly to the courses table to find a match for the teacher_id. Hopefully you will have created an index on ‘Courses.teacher_id’ and this will be a trivially simple lookup.

But if you don't, the subquery might be quicker. If you have a lot of courses taught by relatively few teachers, this would do the table scan once to generate a small temporary table that could then be faster to check against each row from the teachers table.

But in that case it'd probably be best to just add the index instead.

bobince