In SQL Server 2005, I have an order details table with an order id and a product id. I want to write a SQL statement that finds all orders containing every item in a particular order. So, if order 5 has items 1, 2, and 3, I would want all other orders that also have 1, 2, and 3. Likewise, if order 5 had item 2 twice and item 3 once, I'd want all other orders with two 2s and one 3.

My preference is that it return orders that match exactly, but orders that are a superset are acceptable if that's much easier / performs much better.
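
To pin down the two acceptable behaviours (exact match versus superset), here is a minimal Python sketch that treats each order as a multiset of product ids. The sample rows and the function names are hypothetical, and it assumes one row per unit ordered, as the question describes.

```python
from collections import Counter

# Hypothetical sample data: (order_id, product_id), one row per unit ordered.
ORDER_DETAILS = [
    (5, 2), (5, 2), (5, 3),          # order 5: two 2s and a 3
    (6, 2), (6, 2), (6, 3),          # same items, same counts
    (7, 2), (7, 3),                  # only one 2
    (8, 1), (8, 2), (8, 2), (8, 3),  # everything in order 5, plus a 1
]


def items_by_order(rows):
    """Group rows into a multiset (Counter) of product ids per order."""
    orders = {}
    for order_id, product_id in rows:
        orders.setdefault(order_id, Counter())[product_id] += 1
    return orders


def matching_orders(rows, target_id, exact=True):
    """Return the ids of the other orders that match the target order."""
    orders = items_by_order(rows)
    target = orders[target_id]
    result = []
    for order_id, items in sorted(orders.items()):
        if order_id == target_id:
            continue
        if exact:
            # Exact: same products with the same counts, nothing extra.
            ok = items == target
        else:
            # Superset: at least as many of every product in the target.
            ok = all(items[p] >= n for p, n in target.items())
        if ok:
            result.append(order_id)
    return result
```

With this data, `matching_orders(ORDER_DETAILS, 5)` gives `[6]`, and `matching_orders(ORDER_DETAILS, 5, exact=False)` gives `[6, 8]`.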

I tried a self-join like the following, but that found orders with any of the items rather than all of the items.

SELECT * FROM [Order] O1
JOIN [Order] O2 ON (O1.ProductId = O2.ProductId)
WHERE O2.OrderId = 5

This also gave me duplicates if order 5 contained the same item twice.

+1  A: 

This sort of thing is very difficult to do in SQL, because SQL builds its result set, at the most basic level, by comparing column values on a single row to other values. What you're trying to do is compare a column value (or set of column values) spread across multiple rows to the values in another set of rows.

In order to do this, you'll have to create some kind of order signature. Strictly speaking, this isn't possible to do using query syntax alone; you'll have to use some T-SQL.

declare @Orders table 
(
    idx int identity(1, 1), 
    OrderID int, 
    Signature varchar(MAX)
)
declare @Items table 
(
    idx int identity(1, 1), 
    ItemID int, 
    Quantity int
)

insert into @Orders (OrderID) select OrderID from [Order]

declare @OrderID int
set @OrderID = 5 -- the order to match against (order 5 in the question)

declare @i int
declare @cnt int

declare @j int
declare @cnt2 int

select @i = 0, @cnt = max(idx) from @Orders

while @i < @cnt
begin
    select @i = @i + 1

    declare @temp varchar(MAX)

    delete @Items

    insert into @Items (ItemID, Quantity)
    select 
        ItemID, 
        Count(ItemID) 

    from OrderItem oi    

    join @Orders o on o.idx = @i and o.OrderID = oi.OrderID

    group by oi.ItemID

    order by oi.ItemID

    select @j = min(idx) - 1, @cnt2 = max(idx) from @Items

    while @j < @cnt2
    begin
        select @j = @j + 1

        select @temp = isnull(@temp + ', ','') + 
            '(' + 
            convert(varchar,i.ItemID) + 
            ',' + 
            convert(varchar, i.Quantity) + 
            ')'
        from @Items i where idx = @j
    end

    update @Orders set Signature = @temp where idx = @i

    select @temp = null
end

select 
    o_other.OrderID 

from @Orders o

join @Orders o_other on 
        o_other.Signature = o.Signature
    and o_other.OrderID <> o.OrderID

where o.OrderID = @OrderID

This assumes (based on the wording of your question) that ordering several of the same item in one order produces multiple rows, rather than a single row with a Quantity column. If you do have a Quantity column, just remove the group by from the @Items population query and replace Count(ItemID) with Quantity.
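
For readers who want to see the signature idea outside T-SQL, here is a rough Python sketch of the same steps: count each item per order, sort by item id, and join the pairs into a string in the same `(item,quantity)` layout the loop above builds. The function names and sample rows are hypothetical, not part of the original code.

```python
from collections import Counter, defaultdict


def build_signatures(rows):
    """Map each order id to a string such as '(1,1), (2,2), (3,1)'."""
    counts = defaultdict(Counter)
    for order_id, item_id in rows:
        counts[order_id][item_id] += 1
    return {
        order_id: ", ".join(
            f"({item},{qty})" for item, qty in sorted(items.items())
        )
        for order_id, items in counts.items()
    }


def orders_with_same_signature(rows, target_id):
    """Return the other orders whose signature equals the target's."""
    signatures = build_signatures(rows)
    target = signatures[target_id]
    return sorted(
        order_id
        for order_id, signature in signatures.items()
        if signature == target and order_id != target_id
    )


# Hypothetical rows: (order_id, item_id), one row per unit ordered.
SAMPLE_ROWS = [
    (5, 1), (5, 2), (5, 2), (5, 3),
    (6, 3), (6, 2), (6, 1), (6, 2),  # same items as 5, different row order
    (7, 1), (7, 2), (7, 3),          # only one 2
]
```

Because the items are sorted before they are joined, row order does not affect the signature, so `orders_with_same_signature(SAMPLE_ROWS, 5)` returns `[6]`.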

Adam Robinson
You made me feel better by leading off with "this sort of thing is very difficult to do in SQL." :)
Eddie Deyo
@Eddie: Happy to help ;) If this answer solved your problem, please be sure to upvote it and mark it as accepted. I noticed that your accept rate on questions is rather low (70%); you're much more likely to get helpful answers with a high accept rate, so I would suggest accepting an answer to your older questions as well.
Adam Robinson
My accept rate is low because I've asked several questions that don't seem to have an acceptable solution. But I ALWAYS upvote good answers.
Eddie Deyo
@Eddie: Just remember, "it's not possible" always has to be an acceptable answer ;)
Adam Robinson
@Adam: LOL. Fair enough. :)
Eddie Deyo
+2  A: 

If the OrderDetails table contains a unique constraint on OrderId and ProductId, then you can do something like this:

Select ...
From Orders As O
Where Exists    (
                Select 1
                From OrderDetails As OD1
                Where OD1.ProductId In(1,2,3)
                    And OD1.OrderId = O.Id
                Group By OD1.OrderId
                Having Count(*) = 3
                )

If the same ProductId can appear on the same order more than once, change the Having clause to Count(Distinct ProductId) = 3, so that repeated rows for one product are not counted as separate products.
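
To see why the choice of count matters, here is a small sketch that runs a simplified form of the query through Python's `sqlite3` module. It is an illustration under assumptions: SQLite stands in for SQL Server, the query groups `OrderDetails` directly instead of using `Exists` against `Orders`, and the sample rows and function names are made up.

```python
import sqlite3


def make_demo_db():
    """Build an in-memory OrderDetails table with hypothetical rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE OrderDetails (OrderId INTEGER, ProductId INTEGER)")
    conn.executemany(
        "INSERT INTO OrderDetails VALUES (?, ?)",
        [
            (5, 1), (5, 2), (5, 3),
            (6, 1), (6, 2), (6, 2), (6, 3),  # product 2 appears twice
            (7, 1), (7, 1), (7, 2),          # three rows, but no product 3
            (8, 1), (8, 2), (8, 3), (8, 4),  # extra product 4
        ],
    )
    return conn


def orders_containing_all(conn, product_ids, distinct=True):
    """Return orders that contain every product in product_ids."""
    wanted = sorted(set(product_ids))
    placeholders = ",".join("?" * len(wanted))
    count_expr = "COUNT(DISTINCT ProductId)" if distinct else "COUNT(*)"
    sql = (
        "SELECT OrderId FROM OrderDetails "
        f"WHERE ProductId IN ({placeholders}) "
        "GROUP BY OrderId "
        f"HAVING {count_expr} = ? "
        "ORDER BY OrderId"
    )
    # The last parameter is the number of distinct products requested.
    return [row[0] for row in conn.execute(sql, wanted + [len(wanted)])]
```

With duplicates in the table, `orders_containing_all(conn, [1, 2, 3])` returns `[5, 6, 8]`, while `distinct=False` returns `[5, 7, 8]`: the plain row count wrongly accepts order 7 and rejects order 6.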

Now, given the above, if you want orders that have the same signature, including duplicate product entries, that is trickier. To do that, you need to build a signature for each order from its products and then query for the signature of the order in question:

With OrderSignatures As
    (
    Select O1.Id
        ,   (
            Select '|' + Cast(OD1.ProductId As varchar(10))
            From OrderDetails As OD1
            Where OD1.OrderId = O1.Id
            Order By OD1.ProductId
            For Xml Path('')
            ) As Signature
    From Orders As O1
    )
Select ...
From OrderSignatures As O
    Join OrderSignatures As O2
        On O2.Signature = O.Signature
            And O2.Id <> O.Id
Where O.Id = 5
Thomas
I think he's trying to match it based upon the items contained in another order ID, not based upon a static list of items (that would be easy enough to do with a single join per item).
Adam Robinson
@Adam Robinson - I see that. I've expanded my answer to include that type of request.
Thomas
+1. I hadn't thought of using a CTE and XML to create the signature. Much more elegant, though perhaps a little harder to understand.
Adam Robinson
Very, very cool.
Eddie Deyo
+1  A: 

I think this should work. I'm using 108 as an example OrderID, so you'll have to replace that twice below or use a variable.

WITH TempProducts(ProductID) AS
(
   SELECT DISTINCT ProductID FROM CompMarket
   WHERE OrderID = 108
)
SELECT OrderID  FROM CompMarket 
WHERE ProductID IN (SELECT ProductID FROM TempProducts) 
AND OrderID != 108
GROUP BY OrderID
HAVING COUNT(DISTINCT ProductID) >= (SELECT COUNT(ProductID) FROM TempProducts)

This uses a CTE to get the distinct list of the order's products, then selects the other orders that contain products from that list. To make sure each returned order has all of the products, the HAVING clause compares the number of distinct matching products in that order to the number of products in the CTE.
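
As a quick check of what this query returns, here is a sketch that runs it through Python's `sqlite3` module with the order id passed as a named parameter. SQLite stands in for SQL Server here, and the sample rows and function names are hypothetical.

```python
import sqlite3

QUERY = """
WITH TempProducts(ProductID) AS
(
   SELECT DISTINCT ProductID FROM CompMarket
   WHERE OrderID = :target
)
SELECT OrderID FROM CompMarket
WHERE ProductID IN (SELECT ProductID FROM TempProducts)
AND OrderID != :target
GROUP BY OrderID
HAVING COUNT(DISTINCT ProductID) >= (SELECT COUNT(ProductID) FROM TempProducts)
ORDER BY OrderID
"""


def make_demo_db():
    """Build an in-memory CompMarket table with hypothetical rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE CompMarket (OrderID INTEGER, ProductID INTEGER)")
    conn.executemany(
        "INSERT INTO CompMarket VALUES (?, ?)",
        [
            (108, 10), (108, 10), (108, 20),  # two 10s and one 20
            (109, 10), (109, 20), (109, 20),  # one 10 and two 20s
            (110, 10),                        # only one of the products
            (111, 10), (111, 20), (111, 30),  # superset of the products
            (112, 30), (112, 40),             # nothing in common
        ],
    )
    return conn


def orders_covering(conn, target_id):
    """Return the other orders that contain every distinct product of target_id."""
    return [row[0] for row in conn.execute(QUERY, {"target": target_id})]
```

For this data, `orders_covering(conn, 108)` returns `[109, 111]`. Order 110 is excluded because it has only one of the two products, order 111 is returned as a superset, and order 109 is returned even though its quantities differ, since the comparison looks only at distinct products.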

Nick
Note that this won't work for orders with multiples of the same item. For instance, if order `1` has two of item `10` and one of item `20`, while order `2` has one of item `10` and two of item `20`, this code will result in a match.
Adam Robinson
Wouldn't that find any order that has only some of the same items as long as the counts matched? So, if order 108 had 3 items, this would find any 3 (or more) item order with at least one item from 108?
Eddie Deyo
@Adam I added DISTINCT to COUNT(DISTINCT...), which will solve that problem. @Eddie, no: it is only comparing the COUNT for the items which are IN, not all of that Order's Products.
Nick