views:

99

answers:

4

I have a two column view

Product Id   Tag
----------------------
1            Leather
1            Watch
2            Red
2            Necklace
2            Pearl

I'm trying to get all possible combinations of tags for a product as such:

1          Leather
1          Leather,Watch
2          Pearl
2          Pearl,Necklace
2          Pearl Necklace,Red
2          Necklace
2          Necklace, Red
2          Red

I've found and stolen some SQL that give me the complete list for all but not the small versions, its below.

Any ideas, it's started to make my head hurt. A virtual pint for the best answer.

SELECT ProductId, 
       (SELECT CAST(Tag + ', ' AS VARCHAR(MAX)) 
          FROM ProductByTagView 
         WHERE Product.ProductId = ProductByTagView.ProductId
      order by tag
       FOR XML PATH ('')) AS Tags
FROM Product
A: 

Something like this?:

select a.ProductID, a.tag+','+b.tag from aView a 
cross join aView b 
where a.tag != b.tag
union 
select ProductID, tag from aView
SqlACID
this does not return the correct results
adrift
+3  A: 

Here's one way.

In theory it can cope with up to 20 tags per Product (limited by the size of the numbers table) I didn't bother attempting that though. On my desktop it took about 30 seconds to churn out the 65,535 results for a single product with 16 tags. Hopefully your actual number of tags per product will be a lot less than that!

IF OBJECT_ID('tempdb..#Nums') IS NULL
BEGIN
CREATE TABLE #Nums
(
i int primary key
)

;WITH 
L0 AS (SELECT 1 AS c UNION ALL SELECT 1),       
L1 AS (SELECT 1 AS c FROM L0 A CROSS JOIN L0 B), 
L2 AS (SELECT 1 AS c FROM L1 A CROSS JOIN L1 B), 
L3 AS (SELECT 1 AS c FROM L2 A CROSS JOIN L2 B),
L4 AS (SELECT 1 AS c FROM L3 A CROSS JOIN L3 B), 
L5 AS (SELECT 1 AS c FROM L4 A CROSS JOIN L4 B), 
Nums AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 0)) AS i FROM L5)

INSERT INTO #Nums
SELECT TOP 1048576 i FROM Nums;
END


;with ProductTags As
(
SELECT 1 ProductId,'Leather' AS Tag UNION ALL
SELECT 1, 'Watch' UNION ALL
SELECT 2, 'Red' UNION ALL
SELECT 2, 'Necklace' UNION ALL
SELECT 2, 'Pearl'
), NumberedTags AS
(
SELECT 
      ProductId,Tag,
       ROW_NUMBER() OVER (PARTITION BY ProductId ORDER BY Tag) rn,
       COUNT(*) OVER (PARTITION BY ProductId) cn
FROM ProductTags
),
GroupedTags As
(
SELECT ProductId,Tag,i
FROM NumberedTags
JOIN #Nums  on 
               #Nums.i <  POWER ( 2 ,cn)  
           and #Nums.i & POWER ( 2 ,rn-1) > 0
)
SELECT ProductId, 
       STUFF((SELECT CAST(', ' + Tag AS VARCHAR(MAX)) 
          FROM GroupedTags g2
         WHERE g1.ProductId = g2.ProductId and g1.i = g2.i
      ORDER BY Tag
       FOR XML PATH ('')),1,1,'') AS Tags
FROM GroupedTags g1
GROUP BY ProductId, i
ORDER BY ProductId, i
Martin Smith
Martin,Thanks for this but on my production DB (40,000 ish products up to 15 tags out of 50 per product) this took an hour to get through 415 products.
Alec Dobbie
As @Andomar commented above I think a mixture of C# (my chosen weapon) and SQL might well be the answer. Thanks a lot for this though, love this community.
Alec Dobbie
A: 

My canned answer:

Suppose you have a auxiliary Numbers table with integer numbers.

DECLARE @s VARCHAR(5);
SET @s = 'ABCDE';

WITH Subsets AS (
SELECT CAST(SUBSTRING(@s, Number, 1) AS VARCHAR(5)) AS Token,
CAST('.'+CAST(Number AS CHAR(1))+'.' AS VARCHAR(11)) AS Permutation,
CAST(1 AS INT) AS Iteration
FROM dbo.Numbers WHERE Number BETWEEN 1 AND 5
UNION ALL
SELECT CAST(Token+SUBSTRING(@s, Number, 1) AS VARCHAR(5)) AS Token,
CAST(Permutation+CAST(Number AS CHAR(1))+'.' AS VARCHAR(11)) AS
Permutation,
s.Iteration + 1 AS Iteration
FROM Subsets s JOIN dbo.Numbers n ON s.Permutation NOT LIKE
'%.'+CAST(Number AS CHAR(1))+'.%' AND s.Iteration < 5 AND Number
BETWEEN 1 AND 5
--AND s.Iteration = (SELECT MAX(Iteration) FROM Subsets)
)
SELECT * FROM Subsets
WHERE Iteration = 5
ORDER BY Permutation

Token Permutation Iteration
----- ----------- -----------
ABCDE .1.2.3.4.5. 5
ABCED .1.2.3.5.4. 5
ABDCE .1.2.4.3.5. 5
(snip)
EDBCA .5.4.2.3.1. 5
EDCAB .5.4.3.1.2. 5
EDCBA .5.4.3.2.1. 5
(120 row(s) affected)
AlexKuznetsov
A: 

As @Andomar suggested this turned out to be a problem best solved in a client language, I tried the various solutions (thanks guys) and although there seemed to be something there, especially in Martin's answer the problem was the time taken to run the query.

Thanks again guys.

Alec

Alec Dobbie