views:

56

answers:

4

I am using .NET 4.0 & C# and SQL Server 2008. I have a table Products with a column called Keywords. I have another table Interests with 2 columns Accept and Reject. All 3 columns contains keywords separated by commas. I need to match:

  1. If Accept & Reject are empty, then there is a match
  2. If Accept has keywords, then there should be at least 1 keyword match between Accept & Keywords
  3. If Reject has keywords, then there should be no matches between Reject & Keywords
  4. Combination of 2 & 3

Ideally I'd like to do this in a SQL query. I'm open to stored procedures either Managed or T-SQL in that order.


OK, based on the comments, I've added these tables:

Keywords_Products [id (FK to Products), Keyword]
Keywords_Accepted [id (FK to User), Keyword]
Keywords_Rejected [id (FK to User), Keyword]

So, I need a set of products for a given userId, based on the 4 rules above.

+1  A: 

The really problem is the fact that the lists of keywords are in one field. Breaking up a comma-separated list, is not really something SQL was designed to do.

These actually should be in 1-to-many tables (ProductKeywords, AcceptInterest, RejectInterests).

James Curran
+1  A: 

I would suggest you restructure your DB as follows from the limited information I have:

Product
---------
ProductId - int
ProductName - varchar
Accepted - bit

ProductKeywords
---------------
KeywordId - int
ProductId - int
Keyword - varchar

I am a bit confused about the structure of your data though. Would this model work for you? If so it will prevent a lot of maintainability and performance headaches in the future.

Abe Miessler
A: 

If you are stuck with the table structure as-is (and I suspect that you are), then you're probably best off breaking this up into multiple pieces. Your first piece is to create a function that can split strings based on a delimiter. How to do so has been answered well in another question.

It'd also help if you create a refinement on the split function by creating one specific to the Keywords field on the Products table. Something where you can pass in the ProductId and have it return the keywords for that ProductId.

From there, you can select interests matching a given product with something like

SELECT * FROM Interests i
WHERE 
    (i.Accept = '' AND i.Reject = '')
    OR
    (
        EXISTS (SELECT * FROM Split(',', i.Accept) ia WHERE ia.[Value] IN (ProductKeywords(@ProductId)))
        AND
        NOT EXISTS (SELECT * FROM Split(',', i.Reject) ir WHERE ir.[Value] IN (ProductKeywords(@ProductId)))
    )
Jacob Proffitt
A refinement I just thought of is if this is in a proc you can assign the result of ProductKeywords(@ProductId) to a table variable so the function is only called once during the select. The execution plan *may* optimize that for you, but it doesn't hurt to anticipate if you can manage it.
Jacob Proffitt
A: 

One Option for comma-delimited strings:

  • Table-value functions (a function that takes your comma-delimited string as input, and returns a table)

Join the results of your "Accept" table-value function to the results of your "Keywords" table-value function, and if rows are returned, you know you have a match.

A second option would be to create a function that pulls Word n out of one field, and then does a CHARINDEX(), etc., against the second field. Put this in a WHILE loop, and as soon as you get a CHARINDEX() return of greater than zero, exit the loop.

dave
BTW -- the more normalized table approach others have suggested is more-or-less a good ideas as well. Just wanted to add to it.
dave