tags:

views:

258

answers:

4

How do I do an inverse join with more than one key column?

In this baby-toy SqlServer example, I have the following

CREATE TABLE [dbo].[CarList](
 [myID] [int] IDENTITY(1,1) NOT NULL,
 [CarColour] [varchar](32) NOT NULL,
 [CarName] [varchar](128) NOT NULL,
 [CarCompany] [varchar](32) NOT NULL,

 CONSTRAINT [PK_CarList] PRIMARY KEY CLUSTERED(
  [myID] ASC,
  [CarColour] ASC,
  [CarName] ASC,
  [CarCompany] ASC
 )
)

GO

INSERT INTO CarList (CarColour, CarName, CarCompany)
VALUES('blue', 'Abe', 'Ford')

Elsewhere in the DB I have a table like

CREATE TABLE [dbo].[NewCars](
 [CarColour] [varchar](32) NOT NULL,
 [CarName] [varchar](128) NOT NULL,
 [CarCompany] [varchar](32) NOT NULL,
)

GO

INSERT INTO NewCars (CarColour, CarName, CarCompany)
SELECT 'blue', 'Abe', 'Ford'
 UNION ALL
SELECT 'blue', 'Abe', 'GM'
 UNION ALL
SELECT 'blue', 'Betty', 'Ford'
 UNION ALL
SELECT 'green', 'Abe', 'Honda'

Now I want to insert cars I don't already have in the CarList table

Something like...

INSERT INTO CarList ( CarColour, CarName, CarCompany)
 SELECT DISTINCT new.CarColour, new.CarName, new.CarCompany
 FROM    NewCars new, CarList old
 WHERE   new.CarColour  <> old.CarColour
     AND new.CarName    <> old.CarName
     AND new.CarCompany <> old.CarCompany

Which doesn't work because the "blue', 'Betty', 'Ford' row will get filtered out...

If this were just a single ID of some kind it would be really easy

INSERT INTO myTable (myID, param1, param2, etc)
SELECT param1, param2, etc
FROM someOtherTable new
WHERE new.myID NOT IN (SELECT myID FROM myTable)

For reasons I don't really want to get into, I cannot remove rows from NewCars that match CarList. I also need to do this in one pass if possible.

[edit] Thanks guys!

+5  A: 

thanks for DDL and DML

Here is one way

INSERT INTO CarList ( CarColour, CarName, CarCompany)
 SELECT DISTINCT *
 FROM    NewCars n
where not exists (select 1 from CarList c where c.CarColour =n.CarColour
and c.CarName = n.CarName
and c.CarCompany = n.CarCompany)

There are at least 4 different way to do this

  • NOT IN (will not work for more than 1 column like you have)
  • NOT EXISTS
  • LEFT and RIGHT JOIN
  • OUTER APPLY (2005+)
  • EXCEPT (2005+)

Read Select all rows from one table that don't exist in another table

SQLMenace
NOT IN will not work on multiple columns in SQL Server, will it?
Quassnoi
+1: very complete answer. Nice.
RBarryYoung
you would need a bunch of ANDs but it would be inefficient..
SQLMenace
@SQLMenace: how do you do it with AND's? NOT IN (SELECT col1 FROM table) AND NOT IN (SELECT col2 FROM table) will produce not the same result
Quassnoi
yep you are right, will change 5 to 4 :-)
SQLMenace
+5  A: 
INSERT
INTO    CarList ( CarColour, CarName, CarCompany)
SELECT  CarColour, CarName, CarCompany
FROM    NewCars nc
WHERE   NOT EXISTS
        (
        SELECT  1
        FROM    CarList cl
        WHERE   cl.CarColor = nc.CarColor
                AND cl.CarName  = nc.CarName
                AND cl.CarCompany = nc.CarCompany
        )
Quassnoi
hehe we think alike :-)
SQLMenace
It was a yesterday's post in my blog :)
Quassnoi
Let's upvote each other, it's a free +10 after all! :)
Quassnoi
In my answer I have a link that has 5 ways to do this which I blogged a while back
SQLMenace
And THERE's one of the flaws with the point system! Anyway, +1 for the really nice formatting.
RBarryYoung
@RBarryYoung: thanks for a +1 :) It was just a joke, I upvoted @SQLMenace even before, not to get an upvote back but because his answer is nice.
Quassnoi
+1 for pointing out the flaw hehe
SQLMenace
I can't tick you both, so I upvoted you instead. Thanks
Sukotto
+1  A: 
INSERT INTO CarList ( CarColour, CarName, CarCompany) 
SELECT DISTINCT new.CarColor, new.CarName, new.CarCompany 
FROM    NewCar new
where not exists (select 0 
      from  CarList old 
      WHERE   new.CarColour  = old.CarColour     
      AND new.CarName    = old.CarName     
      AND new.CarCompany = old.CarCompany)
--This statement matches all that does exists in carlist
--and insert everything that does not exists in Carlist
JuniorFlip
Man you guys are fast
JuniorFlip
Can only tick one answer so I upvoted yours instead. Thanks
Sukotto
+1  A: 

I would probably use:

INSERT INTO CarList(CarColour, CarName, CarCompany)
SELECT
     NC.CarColour,
     NC.CarName,
     NC.CarCompany
FROM
     NewCars NC
LEFT OUTER JOIN CarList CL ON
     CL.CarColour = NC.CarColour AND
     CL.CarName = NC.CarName AND
     CL.CarCompany = NC.CarCompany
WHERE
     CL.MyID IS NULL
Tom H.
I prefer this implementation. It's my (unproven) suspicion that older sql features are generally better optimized than newer sql features, and joins are about as old as you get. It's worth comparing the execution plans of a outer join implementation vs a not exists implementation, though, since you seem to have both available now.
quillbreaker