tags:

views:

148

answers:

7

Let's assume I have 3 tables in a MySQL table with the following fields.

product

product_id
product_name

category

category_id
category_name

product_in_category

product_in_category_id
product_id
category_id

What query would you recommend to get a list of all products that are not currently assigned to a category in the "product_in_category" table?

Performance is very important, these tables are huge.

+3  A: 

I would use left join.

If you look around you can find answers that include not in and not exists.

SELECT
  p.product_id, 
  product_name
FROM
   product p
   left join product_in_category pc
   on p.product_id = pc.product_id
WHERE
 pc.product_id is null
Conrad Frix
Your where clause is wrong.
Oded
+1 the left join / is null works great for this sort of thing for our DB, and the tables are millions of rows. However, you want IS NULL not IS NOT NULL
Nikki9696
I think you want IS NULL instead of IS NOT NULL.
Larry Lustig
whoops fixed it
Conrad Frix
+2  A: 
SELECT product_id
from products p
 where not exists (select 1
                   from product_in_category pc
                   where pc.product_id = p.product_id)
Michael Pakhantsov
I dunno about MySql, but in SqlServer this is the best option, since use Anti-semi-join
Fujiy
+1 for NOT EXISTS, which can perform better. See [here](http://sqlinthewild.co.za/index.php/2010/03/23/left-outer-join-vs-not-exists/).
Joe Stefanelli
+1  A: 
SELECT Product_name from Product
WHERE ProductID not in (SELECT product_id FROM product_in_category)
egrunin
in Oracle I would use MINUS. this one would be my second pick (+1)
Randy
+6  A: 
select a.* from product a
left join product_in_category b on a.product_id = b.product_id
where b.product_id is null
Fosco
+5  A: 

Using LEFT JOIN/IS NULL:

   SELECT p.*
     FROM PRODUCT p
LEFT JOIN PRODUCT_IN_CATEGORY pic ON pic.product_id = p.product_id
    WHERE pic.product_in_category_id IS NULL

Using NOT IN

SELECT p.*
  FROM PRODUCT p
 WHERE p.product_id NOT IN (SELECT pic.product_id
                              FROM PRODUCT_IN_CATEGORY pic)

Using NOT EXISTS

SELECT p.*
  FROM PRODUCT p
 WHERE NOT EXISTS (SELECT NULL
                     FROM PRODUCT_IN_CATEGORY pic
                    WHERE pic.product_id = p.product_id)

Which is best?

It depends on if the columns being compared are nullable (values can be NULL) or not. If they are nullable, then NOT IN/NOT EXISTS are more efficient. If the columns are not nullable, then LEFT JOIN/IS NULL is more efficient (MySQL only).

OMG Ponies
I already started to worry!
Quassnoi
+1 for thoroughness.
Fosco
A: 

Exclusive joins are always performance hogs. Here's a link that may help with mysql exclusive joins:

http://www.xaprb.com/blog/2006/05/26/how-to-write-full-outer-join-in-mysql/

HTH

code4life
+3  A: 

There are basically three methods to do that:

SELECT  p.*
FROM    product p
LEFT JOIN
        product_in_category pc
ON      pc.product_id = pc.product_id
WHERE   pc.product_id IS NULL

or

SELECT  p.*
FROM    product p
WHERE   product_id NOT IN
        (
        SELECT  product_id
        FROM    product_in_category
        )

or

SELECT  p.*
FROM    product p
WHERE   NOT EXISTS
        (
        SELECT  NULL
        FROM    product_in_category pc
        WHERE   pc.product_id = p.product_id
        )

As long as product_in_category.product_id is indexed, they all are nice, but LEFT JOIN and NOT IN are a little bit more efficient.

See this article for performance comparison:

Quassnoi
+1: Beat you by ~2 minutes! :p
OMG Ponies