views:

82

answers:

3

I have four tables: Customer, CustomerCategory, Limit, and LimitCategory. A customer can be in multiple categories and a limit can also have multiple categories. I need to write a query that will return the customer name and limit amount where ALL the customers categories match ALL the limit categories.

I'm guessing it would be similar to the answer here, but I can't seem to get it right. Thanks!

Edit - Here's what the tables look like:

tblCustomer
  customerId
  name

tblCustomerCategory
  customerId
  categoryId

tblLimit
  limitId
  limit

tblLimitCategory
  limitId
  categoryId
A: 

I don't know if this will work or not, just a thought i had and i can't test it, I'm sures theres a nicer way! don't be too harsh :)

  SELECT 
   c.customerId
 , l.limitId
FROM 
 tblCustomer c
CROSS JOIN 
 tblLimit l
WHERE NOT EXISTS
(
 SELECT 
  lc.limitId 
 FROM 
  tblLimitCategory lc 
 WHERE 
  lc.limitId = l.id
 EXCEPT
 SELECT
  cc.categoryId 
 FROM 
  tblCustomerCategory cc 
 WHERE 
  cc.customerId = l.id
)
Paul Creasey
thanks, but I don't know what would go in the "ON" clause of the inner join. There is no link between customers and limits other than the categories. I updated the original question to include schema info.
adam0101
Edited it to a cross join, as i said, i can't test it, but it seems like it should work to me! Hope it doesn't need to be efficient though!
Paul Creasey
Why would customerId have to be greater than limitId? Can you explain this part? "c.customerId > l.limitId"
adam0101
ah sorry, that is an error, that is only needed when doing a self cross join to remove duplicates, edited!
Paul Creasey
It looks like it would work, but it does need to be efficient. The customer table is HUGE. Would it still work and be more efficient if I changed NOT EXISTS to EXISTS and EXCEPT to INTERSECT?
adam0101
This won't work either way. It returns too many records. I believe it cross joins each matching customer to each matching limit.
adam0101
Just play around with the where clause untill it works, i'm convinced it will!
Paul Creasey
No, there's another reason this won't work. If a limit has, for example, two of the three categories a customer has you'd get a false positive.
adam0101
query1 EXCEPT query2, will only NOT EXIST if query and query 2 return the exact same set of results, which will only happen if all categories match.
Paul Creasey
No, if query2 has 3 records and query1 matches 2 of those 3 records it will NOT EXIST because you are saying give me the 2 records from query1 EXCEPT where they exist in query2 which would produce zero records. The only way this would work is if you created an additional NOT EXISTS statement and switched the order of the queries. Even so, I don't think this would produce what you'd want because it would still cross join all the customers to all the limits. Each customer should only have one limit.
adam0101
A: 

I THINK you're looking for:

SELECT * 
FROM CustomerCategory 
LEFT OUTER JOIN Customer
    ON CustomerCategory.CustomerId = Customer.Id
INNER JOIN LimitCategory
    ON CustomerCategory.CategoryId = LimitCategory.CategoryId
LEFT OUTER JOIN Limit
    ON Limit.Id = LimitCategory.LimitId
md5sum
I believe this will return records where ANY customer categories match limit categories. I need ALL categories to match. Thanks anyway.
adam0101
Edited to select ALL CustomerCategories and leave nulls where customers or limits don't exist... is this what you want?
md5sum
No, I believe this will still return records if, for example, a limit has two of the three categories a customer has. I need the customer's group of categories to match the limit's group of categories exactly.
adam0101
A: 

This is what I ended up using. By matching on each individual category AND the counts of categories I am guaranteed that all categories match.

TESTED:

select distinct c.name, l.limit
from tblCustomer c
inner join tblCustomerCategory cc on c.customerId = cc.customerId
inner join (select customerId, count(distinct categoryId) as categories
            from tblCustomerCategory
            group by customerId) as customerCount
on c.customerId = customerCount.customerId

inner join tblLimit l
    inner join tblLimitCategory lc on l.limitId = lc.limitId
    inner join (select limitId, count(distinct categoryId) as categories
            from tblLimitCategory
            group by limitId) as limitCount
    on l.limitId = limitCount.limitId
on cc.categoryId = lc.categoryId
and customerCount.categories = limitCount.categories

I'm also testing the following solution which makes use of the new Over() feature to see if it's faster.

UNTESTED:

select distinct c.name, l.limit
from
    tblCustomer c,
    tblLimit l
where exists(
    select cc.categoryId, count(cc.categoryId) over(partition by cc.customerId)
    from tblCustomerCategory cc
    where cc.customerId = c.customerId

    intersect

    select l.categoryId, count(l.categoryId) over(partition by l.limitId)
    from tblLimitCategory lc
    where lc.limitId = l.limitId
)
adam0101