views:

44

answers:

1

I asked this question earlier, but I changed the table structure based on recommendations. So, here's the question with the new tables and some additional requirements. I have the following tables:

Products
ID BIGINT
Name VARCHAR(64)
Category1_ID INT
Category2_ID INT
Category3_ID INT

Categories
ID BIGINT Name VARCHAR(64)

Product_Keywords
PRODUCT_ID BIGINT (FK to Products)
Keyword VARCHAR(64)

User_Interests
ID BIGINT
USER_ID BIGINT (FK TO Users)
Category_ID INT

Accepted_Keywords
USER_INTEREST_ID BIGINT (FK to User_Interests)
Keyword VARCHAR(64)

Blocked_Keywords
USER_INTEREST_ID BIGINT (FK to User_Interests)
Keyword VARCHAR(64)

A product (Products table) can belong to 1 to 3 categories (at least one) - so Category1 is always specified, but Category2 and Category3 may or may not be null. A product can have 0 or more keywords (Product_Keywords).

A user can specify one or more categories of interest (User_Interests). Each category of interest can have zero or more accepted keywords (Accepted_Keywords). It can also have zero or more blocked keywords (Blocked_Keywords).

I would like to know the products of interest to a given user, based on categories matched in addition to the following rules:
1. If there are NO Accept or Block keywords, then category match is sufficient
2. If Accept has keywords, then there should be at least 1 keyword match between Accept & Product_Keywords
3. If Block has keywords, then there should be no matches between Block & Product_Keywords
4. Combination of 2 & 3

Ideally I'd like to do this in a SQL query. I'm open to stored procedures either Managed or T-SQL in that order. I'm using SQL Server 2008, C# & .NET 4.0.

Thanks!

+1  A: 
CREATE PROCEDURE GetProductsOfInterest
    @USER_ID BIGINT NOT NULL
AS
SET NOCOUNT ON

SELECT p.ID, p.Name
FROM   Products p
INNER JOIN User_Interests ui
ON     ui.USER_ID = @USER_ID
AND    (ui.Category_ID = p.Category1_ID OR
        ui.Category_ID = p.Category2_ID OR
        ui.Category_ID = p.Category3_ID)
WHERE  (NOT EXISTS (SELECT 1 FROM Product_Keywords pk1
            WHERE  pk1.PRODUCT_ID = p.ID
            AND    EXISTS (SELECT 1 FROM Blocked_Keywords bk1
                   WHERE  bk1.USER_INTEREST_ID = ui.ID
                   AND    bk1.Keyword = pk1.Keyword)) AND
    (NOT EXISTS (SELECT 1 FROM Accepted_Keywords ak1
             WHERE  ak1.USER_INTEREST_ID = ui.ID) OR
     EXISTS (SELECT 1 FROM Product_Keywords pk2
         WHERE  pk2.PRODUCT_ID = p.ID
         AND    EXISTS (SELECT 1 FROM Accepted_Keywords ak2
                    WHERE  ak2.USER_INTEREST_ID = ui.ID
                AND    ak2.Keyword = pk2.Keyword)))
GO

I can't promise this will work completely, given no good way to debug at the moment. This code assumes that if a keyword appears in both the Accepted and Blocked lists for a given interest, the Blocked list prevails. Also, the Accepted and Blocked lists apply only to one specific interest category for a customer. Finally, if an item is in more than one category of interest, it could show up once per category of interest, subject to the accepted/blocked rules for each category of interest.

Typically, I would look at an execution plan to see where improvements might be warranted as well.

Andrew
Nice one. This is the query I would have written, had not @Andrew beaten me to it.
Aaron D