tags:

views:

80

answers:

5

Hi Experts,

My situation is this. I have a table of products with a pk "Parent" which has "Components" The data looks something like this

Parent(PK)    Component
Car1          Wheel
Car1          Tyre
Car1          Roof
Car2          Alloy
Car2          Tyre 
Car2          Roof
Car3          Alloy
Car3          Tyre
Car3          Roof 
Car3          Leather Seats

Now what I want to do is some query that I can feed two codes in and see the differences... IE If I feed in "Car1", "Car2" it would return something like;

Parent       Component
Car1         Wheel
Car2         Alloy

As this is the difference between the two. If I said "Car1", "Car3" I would expect;

Parent       Component
Car1         Wheel
Car3         Alloy
Car3         Leather Seats

Your help with this matter would be greatly appreciated.

+2  A: 

I tried this and it returns the expected results:

Select Min(parent) As parent, component
From
(
    Select parent, component
    From products
    Where parent In ( 'Car1', 'Car3' )
)
Group By component
Having Count(*) = 1

The sub-query gets all the components of Car1 and Car2, and with the Group By and the Having-clause we remove those components that both cars have.

Peter Lang
I kind of dislike having to do a count ... it feels a little wrong
Sam Saffron
@Sam: Thought about that too, but in this case I find it easier to read and I would expect it to perform quite good (having an index on `parent`, or even better a compound index on `parent, component`). Would have to be profiled though.
Peter Lang
<me> temps you to profile both on SEDE </me> http://cloudexchange.cloudapp.net/stackoverflow/q/1978/text
Sam Saffron
+3  A: 
DECLARE @ThisCar .., @ThatCar;

SELECT @ThisCar = '...', @ThatCar = '...';

SELECT
    Parent, Component
FROM
    MyTable M1
WHERE
    M1.Parent = @ThisCar
    AND
    NOT EXISTS (SELECT *
        FROM
            MyTable M2
        WHERE
            M2.Parent = @ThatCar AND M1.Component = M2.Component)
UNION
SELECT
    Parent, Component
FROM
    MyTable M2
WHERE
    M2.Parent = @ThatCar 
    AND
    NOT EXISTS (SELECT *
        FROM
            MyTable M1
        WHERE
            M1.Parent = @ThisCar AND M1.Component = M2.Component):
gbn
Thank you. Some good answers here.
Pace
@Sam Saffron: rubbish: usually most efficient :-)
gbn
I declare double rubbish .... See my answer, its faster, want proof, profile it :)
Sam Saffron
@Sam Saffron: take out the union and add some proper indexing... who knows?
gbn
I tested with indexing, I get away with less index seeks ... still I hand it to you, your solution is fast.
Sam Saffron
You can have a look at the execution plans, which one looks simplest ? (see my amended answer)
Sam Saffron
+4  A: 

Without GROUP BY or UNION:

create table Products (
    Parent varchar(20) not null,
    Component varchar(20) not null
)
insert into Products (Parent,Component)
select 'Car1','Wheel' union all
select 'Car1','Tyre' union all
select 'Car1','Roof' union all
select 'Car2','Alloy' union all
select 'Car2','Tyre' union all
select 'Car2','Roof' union all
select 'Car3','Alloy' union all
select 'Car3','Tyre' union all
select 'Car3','Roof' union all
select 'Car3','Leather Seats'
go
select
    ISNULL (a.Parent,b.Parent) as Parent,
    ISNULL (a.Component,b.Component) as Component
from
    Products a
        full outer join
    Products b
        on
            a.Component = b.Component and
            a.Parent = 'Car1' and
            b.Parent = 'Car3'
where
    (a.Parent = 'Car1' and b.Parent is null) or
    (b.Parent = 'Car3' and a.Parent is null)
Damien_The_Unbeliever
Runnable version here: http://cloudexchange.cloudapp.net/stackoverflow/q/2082/text
Sam Saffron
note if you try this on a table with 10k rows, the query chokes and takes a real long time cause of the full join, see my answer for an efficient way
Sam Saffron
+1  A: 

An alternative

DECLARE @thisCar varchar(20) 
DECLARE @thatCar varchar(20) 

SET @thisCar = 'Car1'
SET @thatCar = 'Car2'


SELECT * FROM
(
    SELECT @thisCar AS Parent, Component FROM products WHERE parent = @thisCar
    EXCEPT
    SELECT @thisCar AS Parent, Component FROM products WHERE parent = @thatCar
) c1
UNION ALL
SELECT * FROM
(
    SELECT @thatCar AS Parent, Component FROM products WHERE parent = @thatCar
    EXCEPT
    SELECT @thatCar AS Parent, Component FROM products WHERE parent = @thisCar
    ) c2
Martin Smith
Perfect. Thanks :) Accepting this as there are less hits.
Pace
This is simple, however it is less efficient then both gbn and my answer
Sam Saffron
A: 

I profiled and by far the fastest solution to this is:

select
    ISNULL (a.Parent,b.Parent) as Parent,
    ISNULL (a.Component,b.Component) as Component
from
    (select * from Products where Parent = 'Car1') as a
        full outer join
    (select * from Products where Parent = 'Car2') as b
        on
            a.Component = b.Component 
where
    (a.Parent = 'Car1' and b.Parent is null) or
    (b.Parent = 'Car2' and a.Parent is null)

It runs about 30% faster than @gbns solution and many many many times faster than @Damiens solution

For the brave, here are some execution plans:

 
select
    ISNULL (a.Parent,b.Parent) as Parent,
    ISNULL (a.Component,b.Component) as Component
from
    (select * from Products where Parent = 'Car1') as a
        full outer join
    (select * from Products where Parent = 'Car2') as b


StmtText
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  |--Compute Scalar(DEFINE:([Expr1012]=isnull([Expr1004],[Expr1010]), [Expr1013]=isnull([Expr1005],[Expr1011])))
       |--Filter(WHERE:([Expr1004]='Car1' AND [Expr1010] IS NULL OR [Expr1010]='Car2' AND [Expr1004] IS NULL))
            |--Hash Match(Full Outer Join, HASH:([CloudDb].[dbo].[Products].[Component])=([CloudDb].[dbo].[Products].[Component]), RESIDUAL:([CloudDb].[dbo].[Products].[Component]=[CloudDb].[dbo].[Products].[Component]))
                 |--Compute Scalar(DEFINE:([Expr1004]=[CloudDb].[dbo].[Products].[Parent], [Expr1005]=[CloudDb].[dbo].[Products].[Component]))
                 |    |--Nested Loops(Inner Join, OUTER REFERENCES:([Bmk1000]))
                 |         |--Index Seek(OBJECT:([CloudDb].[dbo].[Products].[idxProducts]), SEEK:([CloudDb].[dbo].[Products].[Parent]='Car1') ORDERED FORWARD)
                 |         |--RID Lookup(OBJECT:([CloudDb].[dbo].[Products]), SEEK:([Bmk1000]=[Bmk1000]) LOOKUP ORDERED FORWARD)
                 |--Compute Scalar(DEFINE:([Expr1010]=[CloudDb].[dbo].[Products].[Parent], [Expr1011]=[CloudDb].[dbo].[Products].[Component]))
                      |--Nested Loops(Inner Join, OUTER REFERENCES:([Bmk1006]))
                           |--Index Seek(OBJECT:([CloudDb].[dbo].[Products].[idxProducts]), SEEK:([CloudDb].[dbo].[Products].[Parent]='Car2') ORDERED FORWARD)
                           |--RID Lookup(OBJECT:([CloudDb].[dbo].[Products]), SEEK:([Bmk1006]=[Bmk1006]) LOOKUP ORDERED FORWARD)

StmtText
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

SELECT
    Parent, Component
FROM
    Products M1
WHERE
    M1.Parent = 'Car1'
    AND
    NOT EXISTS (SELECT *
        FROM
            Products M2
        WHERE
            M2.Parent = 'Car2' AND M1.Component = M2.Component)
UNION ALL
SELEC

StmtText
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  |--Concatenation
       |--Nested Loops(Left Anti Semi Join, OUTER REFERENCES:([M1].[Component]))
       |    |--Nested Loops(Inner Join, OUTER REFERENCES:([Bmk1000]))
       |    |    |--Index Seek(OBJECT:([CloudDb].[dbo].[Products].[idxProducts] AS [M1]), SEEK:([M1].[Parent]='Car1') ORDERED FORWARD)
       |    |    |--RID Lookup(OBJECT:([CloudDb].[dbo].[Products] AS [M1]), SEEK:([Bmk1000]=[Bmk1000]) LOOKUP ORDERED FORWARD)
       |    |--Top(TOP EXPRESSION:((1)))
       |         |--Nested Loops(Inner Join, OUTER REFERENCES:([Bmk1003]))
       |              |--Index Seek(OBJECT:([CloudDb].[dbo].[Products].[idxProducts] AS [M2]), SEEK:([M2].[Parent]='Car2') ORDERED FORWARD)
       |              |--RID Lookup(OBJECT:([CloudDb].[dbo].[Products] AS [M2]), SEEK:([Bmk1003]=[Bmk1003]),  WHERE:([CloudDb].[dbo].[Products].[Component] as [M1].[Component]=[CloudDb].[dbo].[Products].[Component] as [M2].[Component]) LOOKUP ORDERED FORWA
       |--Nested Loops(Left Anti Semi Join, OUTER REFERENCES:([M2].[Component]))
            |--Nested Loops(Inner Join, OUTER REFERENCES:([Bmk1007]))
            |    |--Index Seek(OBJECT:([CloudDb].[dbo].[Products].[idxProducts] AS [M2]), SEEK:([M2].[Parent]='Car2') ORDERED FORWARD)
            |    |--RID Lookup(OBJECT:([CloudDb].[dbo].[Products] AS [M2]), SEEK:([Bmk1007]=[Bmk1007]) LOOKUP ORDERED FORWARD)
            |--Top(TOP EXPRESSION:((1)))
                 |--Nested Loops(Inner Join, OUTER REFERENCES:([Bmk1010]))
                      |--Index Seek(OBJECT:([CloudDb].[dbo].[Products].[idxProducts] AS [M1]), SEEK:([M1].[Parent]='Car1') ORDERED FORWARD)
                      |--RID Lookup(OBJECT:([CloudDb].[dbo].[Products] AS [M1]), SEEK:([Bmk1010]=[Bmk1010]),  WHERE:([CloudDb].[dbo].[Products].[Component] as [M1].[Component]=[CloudDb].[dbo].[Products].[Component] as [M2].[Component]) LOOKUP ORDERED FORWA

StmtText
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

SELECT * FROM
(
    SELECT 'Car1' AS Parent, Component FROM products WHERE parent = 'Car1'
    EXCEPT
    SELECT 'Car1'AS Parent, Component FROM products WHERE parent = 'Car2'
) c1
UNION ALL
SELECT * FROM
(
    SELECT 'Car2' AS Parent, Component 

StmtText
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  |--Concatenation
       |--Nested Loops(Left Anti Semi Join, OUTER REFERENCES:([CloudDb].[dbo].[Products].[Component]))
       |    |--Sort(DISTINCT ORDER BY:([CloudDb].[dbo].[Products].[Component] ASC))
       |    |    |--Compute Scalar(DEFINE:([Expr1004]='Car1'))
       |    |         |--Nested Loops(Inner Join, OUTER REFERENCES:([Bmk1000]))
       |    |              |--Index Seek(OBJECT:([CloudDb].[dbo].[Products].[idxProducts]), SEEK:([CloudDb].[dbo].[Products].[Parent]='Car1') ORDERED FORWARD)
       |    |              |--RID Lookup(OBJECT:([CloudDb].[dbo].[Products]), SEEK:([Bmk1000]=[Bmk1000]) LOOKUP ORDERED FORWARD)
       |    |--Top(TOP EXPRESSION:((1)))
       |         |--Nested Loops(Inner Join, OUTER REFERENCES:([Bmk1005]))
       |              |--Index Seek(OBJECT:([CloudDb].[dbo].[Products].[idxProducts]), SEEK:([CloudDb].[dbo].[Products].[Parent]='Car2') ORDERED FORWARD)
       |              |--RID Lookup(OBJECT:([CloudDb].[dbo].[Products]), SEEK:([Bmk1005]=[Bmk1005]),  WHERE:([CloudDb].[dbo].[Products].[Component]=[CloudDb].[dbo].[Products].[Component]) LOOKUP ORDERED FORWARD)
       |--Nested Loops(Left Anti Semi Join, OUTER REFERENCES:([CloudDb].[dbo].[Products].[Component]))
            |--Sort(DISTINCT ORDER BY:([CloudDb].[dbo].[Products].[Component] ASC))
            |    |--Compute Scalar(DEFINE:([Expr1014]='Car2'))
            |         |--Nested Loops(Inner Join, OUTER REFERENCES:([Bmk1010]))
            |              |--Index Seek(OBJECT:([CloudDb].[dbo].[Products].[idxProducts]), SEEK:([CloudDb].[dbo].[Products].[Parent]='Car1') ORDERED FORWARD)
            |              |--RID Lookup(OBJECT:([CloudDb].[dbo].[Products]), SEEK:([Bmk1010]=[Bmk1010]) LOOKUP ORDERED FORWARD)
            |--Top(TOP EXPRESSION:((1)))
                 |--Nested Loops(Inner Join, OUTER REFERENCES:([Bmk1015]))
                      |--Index Seek(OBJECT:([CloudDb].[dbo].[Products].[idxProducts]), SEEK:([CloudDb].[dbo].[Products].[Parent]='Car2') ORDERED FORWARD)
                      |--RID Lookup(OBJECT:([CloudDb].[dbo].[Products]), SEEK:([Bmk1015]=[Bmk1015]),  WHERE:([CloudDb].[dbo].[Products].[Component]=[CloudDb].[dbo].[Products].[Component]) LOOKUP ORDERED FORWARD)

A slightly more verbose query can, sometimes, result in a simpler execution plan.

Sam Saffron
note this would still need to be tested for very large sets components for 1 product, its possible that the hash match would struggle, i dunno havent tested
Sam Saffron