I have two database tables, Categories and SuperCategories, for an inventory control system I'm working on:

Categories: ID_Category, CategoryName

SuperCategories: ID_SuperCategory, CategoryID, SuperCategoryID

I'm putting category-subcategory relationships into the SuperCategories table. I'm putting all categories into the Categories table.

Here is an example:

    Categories:
    ID_Category CategoryName
    1           Box
    2           Red Box
    3           Blue Box
    4           Blue Plastic Box
    5           Can
    6           Tin Can  
    
    SuperCategories:
    ID_Super CategoryID SuperCategoryID
    1        2          1
    2        3          1
    3        4          3
    4        6          5

CategoryID and SuperCategoryID relate back to the primary key ID_Category in the Categories table.

What I would like is a query that returns all of the category names that are not parents of any other categories:

Red Box
Blue Plastic Box
Tin Can

This amounts to finding all values of ID_Category that do not show up in the SuperCategoryID column (2, 4, and 6), but I'm having trouble writing the SQL.

I'm using VB6 to query an Access 2000 database.

Any help is appreciated. Thanks!

EDIT: I voted up everyone's answer that gave me something that worked. I accepted the answer that I felt was the most instructive. Thanks again for your help!

+2  A: 

Not sure if the syntax will work for Access, but something like this would work:

select CategoryName from Categories
where ID_Category not in (
    select SuperCategoryID 
    from SuperCategories 
)
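
For anyone who wants to try this outside Access, here is a minimal sketch that runs the same query against the sample data from the question. It assumes SQLite (through Python's `sqlite3` module) as a stand-in for Access, so dialect details may differ. The extra row with a NULL `SuperCategoryID` is hypothetical and is not in the original data; it is there only to show that, at least in SQLite, `NOT IN` matches nothing once the subquery returns a NULL.

```python
import sqlite3

# Assumption: SQLite stands in for Access; table and column names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Categories (ID_Category INTEGER PRIMARY KEY, CategoryName TEXT);
    CREATE TABLE SuperCategories (ID_Super INTEGER PRIMARY KEY,
                                  CategoryID INTEGER, SuperCategoryID INTEGER);
    INSERT INTO Categories VALUES
        (1, 'Box'), (2, 'Red Box'), (3, 'Blue Box'),
        (4, 'Blue Plastic Box'), (5, 'Can'), (6, 'Tin Can');
    INSERT INTO SuperCategories VALUES (1, 2, 1), (2, 3, 1), (3, 4, 3), (4, 6, 5);
""")

# The query from this answer, with an ORDER BY added so the output order is stable.
NOT_IN_QUERY = """
    SELECT CategoryName FROM Categories
    WHERE ID_Category NOT IN (SELECT SuperCategoryID FROM SuperCategories)
    ORDER BY ID_Category
"""

leaf_names = [row[0] for row in conn.execute(NOT_IN_QUERY)]
print(leaf_names)  # ['Red Box', 'Blue Plastic Box', 'Tin Can']

# Hypothetical row with a NULL parent: every NOT IN comparison now yields
# NULL or false, so SQLite returns no rows at all.
conn.execute("INSERT INTO SuperCategories VALUES (5, 1, NULL)")
names_with_null = [row[0] for row in conn.execute(NOT_IN_QUERY)]
print(names_with_null)  # []
```

If `SuperCategoryID` can never be NULL in your table, the second half does not apply.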
Blorgbeard
This would work in Access AFAIK.
Cyril Gupta
Too inefficient to use a subquery. Go directly with an outer join for more efficiency.
Mike Pone
Thank you. This works!
John at CashCommons
There is no reason to apply "distinct" inside the subquery.
Bill Karwin
For something like a list of categories, with around 100 entries max (I assume), the two table scans required will take 99.99% of the cost of execution, and this is easier to read, IMO. Bill, I agree about distinct, though. Will remove.
Blorgbeard
+3  A: 

Include only those categories that are not supercategories. A simple outer join does it:

select CategoryName from Categories LEFT OUTER JOIN
SuperCategories ON Categories.ID_Category = SuperCategories.SuperCategoryID
WHERE SuperCategories.SuperCategoryID is null
Mike Pone
Thanks! This works. Now I just have to figure out why!
John at CashCommons
A left outer join takes every row from the left side of the join (that is, everything from Categories) and only those rows from the SuperCategories table that match. Where there is no match, the SuperCategories columns come back as NULL.
Mike Pone
+6  A: 
SELECT
     CAT.ID_Category,
     CAT.CategoryName
FROM
     Categories CAT
WHERE
     NOT EXISTS
     (
          SELECT
               *
          FROM
               SuperCategories SC
          WHERE
               SC.SuperCategoryID = CAT.ID_Category
     )

Or

SELECT
     CAT.ID_Category,
     CAT.CategoryName
FROM
     Categories CAT
LEFT OUTER JOIN SuperCategories SC ON
     SC.SuperCategoryID = CAT.ID_Category
WHERE
     SC.ID_Super IS NULL
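
As a rough check that the two forms agree, the sketch below runs both against the sample data from the question. It assumes SQLite (via Python's `sqlite3`) in place of Access, and the `ORDER BY` clauses are added only to make the comparison deterministic. The row with a NULL `SuperCategoryID` is hypothetical; it is included to show that, in SQLite at least, neither query is affected by it.

```python
import sqlite3

# Assumption: SQLite stands in for Access; schema and rows follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Categories (ID_Category INTEGER PRIMARY KEY, CategoryName TEXT);
    CREATE TABLE SuperCategories (ID_Super INTEGER PRIMARY KEY,
                                  CategoryID INTEGER, SuperCategoryID INTEGER);
    INSERT INTO Categories VALUES
        (1, 'Box'), (2, 'Red Box'), (3, 'Blue Box'),
        (4, 'Blue Plastic Box'), (5, 'Can'), (6, 'Tin Can');
    INSERT INTO SuperCategories VALUES (1, 2, 1), (2, 3, 1), (3, 4, 3), (4, 6, 5);
""")

NOT_EXISTS_QUERY = """
    SELECT CAT.ID_Category, CAT.CategoryName
    FROM Categories CAT
    WHERE NOT EXISTS (
        SELECT * FROM SuperCategories SC
        WHERE SC.SuperCategoryID = CAT.ID_Category
    )
    ORDER BY CAT.ID_Category
"""

LEFT_JOIN_QUERY = """
    SELECT CAT.ID_Category, CAT.CategoryName
    FROM Categories CAT
    LEFT OUTER JOIN SuperCategories SC
        ON SC.SuperCategoryID = CAT.ID_Category
    WHERE SC.ID_Super IS NULL
    ORDER BY CAT.ID_Category
"""

not_exists_rows = conn.execute(NOT_EXISTS_QUERY).fetchall()
left_join_rows = conn.execute(LEFT_JOIN_QUERY).fetchall()
print(not_exists_rows)  # [(2, 'Red Box'), (4, 'Blue Plastic Box'), (6, 'Tin Can')]
print(not_exists_rows == left_join_rows)  # True

# Hypothetical row with a NULL parent: it never satisfies the equality test,
# so both queries still return the same three categories.
conn.execute("INSERT INTO SuperCategories VALUES (5, 1, NULL)")
rows_after_null = (conn.execute(NOT_EXISTS_QUERY).fetchall(),
                   conn.execute(LEFT_JOIN_QUERY).fetchall())
print(rows_after_null[0] == rows_after_null[1] == not_exists_rows)  # True
```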

I'll also make the suggestion that your naming standards could probably use some work. They seem all over the place and difficult to work with.

Tom H.
Thanks! This is my first venture into SQL so I have a lot to learn. I meant ID_ for primary key and BlahBlahID to indicate that it was a key in another table. Always open to suggestions...
John at CashCommons
Thank you. This works as well!
John at CashCommons
+3  A: 

Hi John W,

Mike Pone's answer works because he joins the "Categories" table with the "SuperCategories" table using a "LEFT OUTER JOIN". This takes all entries from "Categories" and adds the columns from "SuperCategories" to those rows where the link exists. Where it does not exist (i.e. where no row in "SuperCategories" has that category as its SuperCategoryID), you'll get NULLs for the SuperCategories columns, and that's exactly what Mike's query then checks for.

If you wrote the query like so:

SELECT c.CategoryName, s.ID_Super 
FROM Categories c 
LEFT OUTER JOIN SuperCategories s ON c.ID_Category = s.SuperCategoryID

you would get something like this:

CategoryName    ID_Super
Box               1
Box               2
Red Box           NULL
Blue Box          3
Blue Plastic Box  NULL
Can               4
Tin Can           NULL

So this basically gives you your answer: all the rows where ID_Super from the LEFT OUTER JOIN is NULL belong to categories that never appear as a SuperCategoryID in the SuperCategories table, which means they are not parents of anything. All clear? :-)
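
If you want to reproduce that intermediate result yourself, here is a small sketch. It assumes SQLite (via Python's `sqlite3`) rather than Access, loads the sample data from the question, and adds an `ORDER BY` so the rows come back in the order shown above. NULL shows up as `None` on the Python side.

```python
import sqlite3

# Assumption: SQLite stands in for Access; schema and rows follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Categories (ID_Category INTEGER PRIMARY KEY, CategoryName TEXT);
    CREATE TABLE SuperCategories (ID_Super INTEGER PRIMARY KEY,
                                  CategoryID INTEGER, SuperCategoryID INTEGER);
    INSERT INTO Categories VALUES
        (1, 'Box'), (2, 'Red Box'), (3, 'Blue Box'),
        (4, 'Blue Plastic Box'), (5, 'Can'), (6, 'Tin Can');
    INSERT INTO SuperCategories VALUES (1, 2, 1), (2, 3, 1), (3, 4, 3), (4, 6, 5);
""")

# The unfiltered join: one row per (category, child link), or a single row
# with a NULL ID_Super when the category has no children.
joined_rows = conn.execute("""
    SELECT c.CategoryName, s.ID_Super
    FROM Categories c
    LEFT OUTER JOIN SuperCategories s ON c.ID_Category = s.SuperCategoryID
    ORDER BY c.ID_Category, s.ID_Super
""").fetchall()

for name, id_super in joined_rows:
    print(f"{name:<18}{id_super}")

# Keeping only the NULL rows is what the WHERE ... IS NULL clause does in SQL.
leaf_names = [name for name, id_super in joined_rows if id_super is None]
print(leaf_names)  # ['Red Box', 'Blue Plastic Box', 'Tin Can']
```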

Marc

marc_s
Yes, that clears it up. Thank you!
John at CashCommons
A: 

I always take the outer join approach as marc_s suggests. There is a lot of power when using OUTER JOINS. Often times I'll have to do a FULL OUTER JOIN to check data on both sides of the query.

You should also look at the ISNULL function (the two-argument form shown here is SQL Server syntax). If you are doing a query where data can be in either table A or table B, I use ISNULL to return a value from whichever column has one.

Here's an example:

SELECT
     isnull(a.[date_time], b.[date_time]) as [Time Stamp]
    ,isnull(a.[ip], b.[ip]) as [Device Address]
    ,isnull(a.[total_messages], 0) as [Local Messages]
    ,isnull(b.[total_messages], 0) as [Remote Messages]
FROM [Local_FW_Logs] a
FULL OUTER JOIN [Remote_FW_Logs] b
    ON b.ip = a.ip
OhioDude