Maybe I demand too much from SQL but I feel like this should be possible. I start with a list of key-value pairs, like this:

'0:First, 1:Second, 2:Third, 3:Fourth'

etc. I can split this up pretty easily with a two-step parse that gets me a table like:

EntryNumber  PairNumber  Item
0            0           0
1            0           First
2            1           1
3            1           Second

etc.

Now, in the simple case of splitting the pairs into a pair of columns, it's fairly easy. I'm interested in the more advanced case where I might have multiple values per entry, like:

'0:First:Fishing, 1:Second:Camping, 2:Third:Hiking' 

and such.

In that generic case, I'd like to find a way to take my 3-column result table and somehow pivot it to have one row per entry and one column per value-part.

So I want to turn this:

EntryNumber  PairNumber  Item
0            0           0
1            0           First
2            0           Fishing
3            1           1
4            1           Second
5            1           Camping

Into this:

Entry   [1]   [2]      [3]
0       0     First    Fishing
1       1     Second   Camping

Is that just too much for SQL to handle, or is there a way? Pivots (even tricky dynamic pivots) seem like an answer, but I can't figure out how to get that to work.

A: 

Use:

  SELECT x.pairnumber,
         MAX(CASE WHEN x.rank = 1 THEN x.item ELSE NULL END) AS "[1]",
         MAX(CASE WHEN x.rank = 2 THEN x.item ELSE NULL END) AS "[2]",
         MAX(CASE WHEN x.rank = 3 THEN x.item ELSE NULL END) AS "[3]"
    FROM (SELECT t.pairnumber,
                 t.item,
                 ROW_NUMBER() OVER (PARTITION BY t.pairnumber ORDER BY t.entrynumber) AS rank
            FROM your_table t) x -- your_table: the 3-column parse result from the question
GROUP BY x.pairnumber

The problem is your entrynumber appears to be an identity field, complicating the pivot.

OMG Ponies
A: 

No, in SQL you can't infer columns dynamically based on the data found during the same query.

Even using the PIVOT feature in Microsoft SQL Server, you must know the columns when you write the query, and you have to hard-code them.
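
For instance, a minimal hard-coded sketch against the three-column table from the question (the ParsedPairs table name and the aliases here are made up for illustration):

    SELECT PairNumber, [1], [2], [3]                  -- column list is fixed when the query is written
    FROM
    (
        SELECT PairNumber, Item,
               ROW_NUMBER() OVER (PARTITION BY PairNumber ORDER BY EntryNumber) AS ValuePos
        FROM ParsedPairs                              -- hypothetical: the parsed 3-column table
    ) AS src
    PIVOT
    (
        MAX(Item) FOR ValuePos IN ([1], [2], [3])     -- adding a fourth value means editing the query
    ) AS pvt;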

You have to do a lot of work to avoid storing the data in a relational normal form.

Bill Karwin
A: 

Alright, I found a way to accomplish what I was after. Strap in, this is going to get bumpy.

So the basic problem is to take a string with two kinds of delimiters: entries and values. Each entry represents a set of values, and I wanted to turn the string into a table with one column for each value per entry. I tried to make this a UDF, but the necessity for a temporary table and dynamic SQL meant it had to be a stored procedure.

CREATE PROCEDURE [dbo].[ParseValueList] 
(   
    @parseString varchar(8000),
    @itemDelimiter CHAR(1),
    @valueDelimiter CHAR(1)
)
AS
BEGIN

SET NOCOUNT ON;

    IF object_id('tempdb..#ParsedValues') IS NOT NULL
    BEGIN
       DROP TABLE #ParsedValues
    END
    CREATE TABLE #ParsedValues
    (
        EntryID int,
        [Rank] int,
        [Value] varchar(200)    --Must match the column name referenced by the pivot below
    )

So that's just basic set up, establishing the temp table to hold my intermediate results.

;WITH
    E1(N) AS (SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
              SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
              SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1), --Brute forces 10 rows
    E2(N) AS (SELECT 1 FROM E1 a, E1 b),   --Uses a cross join to generate 100 rows (10 * 10)
    E4(N) AS (SELECT 1 FROM E2 a, E2 b),   --Uses a cross join to generate 10,000 rows (100 * 100)
    cteTally(N) AS (SELECT ROW_NUMBER() OVER (ORDER BY N) FROM E4)

That beautiful piece of SQL comes from SQL Server Central's forums and is credited to "a guru." It's a great little 10,000-row tally table, perfect for string splitting.

INSERT INTO #ParsedValues
    SELECT ItemNumber AS EntryID, ROW_NUMBER() OVER (PARTITION BY ItemNumber ORDER BY T1.N) AS [Rank], --Rank by position within the item so values keep their order
        SUBSTRING(Items.Item, T1.N, CHARINDEX(@valueDelimiter, Items.Item + @valueDelimiter, T1.N) - T1.N) AS [Value]
    FROM(
        SELECT ROW_NUMBER() OVER (ORDER BY T2.N) AS ItemNumber,
            SUBSTRING(@parseString, T2.N, CHARINDEX(@itemDelimiter, @parseString + @itemDelimiter, T2.N) - T2.N) AS Item
        FROM cteTally T2
        WHERE T2.N < LEN(@parseString) + 2 --Ensures we cut out once the entire string is done
            AND SUBSTRING(@itemDelimiter + @parseString, T2.N, 1) = @itemDelimiter
        ) AS Items, cteTally T1
    WHERE T1.N < LEN(@parseString) + 2 --Ensures we cut out once the entire string is done
        AND SUBSTRING(@valueDelimiter + Items.Item, T1.N, 1) = @valueDelimiter

OK, this is the first really dense, meaty part. The inner select breaks up my string along the item delimiter (the comma), using the guru's string-splitting method. That table is then passed up to the outer select, which does the same thing to each row, this time using the value delimiter (the colon). The inner RowNumber (EntryID) and the outer RowNumber over Partition (Rank) are key to the pivot. EntryID shows which Item the values belong to, and Rank gives the ordinal position of each value.

    DECLARE @columns varchar(200)
    DECLARE @columnNames varchar(2000)
    DECLARE @query varchar(8000)

    SELECT @columns = COALESCE(@columns + ',[' + CAST([Rank] AS varchar) + ']', '[' + CAST([Rank] AS varchar)+ ']'),
    @columnNames = COALESCE(@columnNames + ',[' + CAST([Rank] AS varchar) + '] AS Value' + CAST([Rank] AS varchar)
                            , '[' + CAST([Rank] AS varchar)+ '] AS Value' + CAST([Rank] AS varchar))
    FROM (SELECT DISTINCT [Rank] FROM #ParsedValues) AS Ranks

    SET @query = '
    SELECT '+ @columnNames +'
    FROM #ParsedValues
    PIVOT 
    (
        MAX([Value]) FOR [Rank]
        IN (' + @columns + ')
    ) AS pvt'

    EXECUTE(@query)

    DROP TABLE #ParsedValues

END

And at last, the dynamic SQL that makes it possible. By getting a list of distinct Ranks, we set up our column list. That is then written into the dynamic pivot, which tilts the values over and slots each one into the proper column, each with a generic "Value#" heading.
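
For the sample three-value string, the generated @query comes out to roughly this:

    SELECT [1] AS Value1,[2] AS Value2,[3] AS Value3
    FROM #ParsedValues
    PIVOT
    (
        MAX([Value]) FOR [Rank]
        IN ([1],[2],[3])
    ) AS pvt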

Thus, by calling EXEC ParseValueList with a properly formatted string of values, we can break it up into a table to use however we need. It works (but is probably overkill) for simple key:value pairs, and it scales up to a fair number of columns (about 50 at most, I think, but that would be really silly).
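
A call for the question's sample string would look something like this (note that the space after each comma is kept, so it ends up on the first value of the following entry unless you trim it first):

    EXEC dbo.ParseValueList
        @parseString    = '0:First:Fishing, 1:Second:Camping, 2:Third:Hiking',
        @itemDelimiter  = ',',
        @valueDelimiter = ':'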

Anyway, hope that helps anyone having a similar issue.

(Yeah, it probably could have been done in something like SQLCLR as well, but I find a great joy in solving problems with pure SQL.)

CodexArcanum
A: 

Though probably not optimal, here's a more condensed solution.

DECLARE @DATA varchar(max);
SET @DATA = '0:First:Fishing, 1:Second:Camping, 2:Third:Hiking';

SELECT
        DENSE_RANK() OVER (ORDER BY [Data].[row].value('(./B/text())[1]', 'int')) AS [Entry] -- xml values can't be sorted directly, so rank on the first (key) value
      , [Data].[row].value('(./B/text())[1]', 'int') AS "[1]"
      , [Data].[row].value('(./B/text())[2]', 'varchar(64)') AS "[2]"
      , [Data].[row].value('(./B/text())[3]', 'varchar(64)') AS "[3]"
FROM
    (
        SELECT
            CONVERT(XML, '<A><B>' + REPLACE(REPLACE(@DATA, ',', '</B></A><A><B>'), ':', '</B><B>') + '</B></A>').query('.')
    ) AS [T]([c])
CROSS APPLY [T].[c].nodes('/A') AS [Data]([row]);
etliens
I'd be remiss if I didn't mention that I borrowed concepts from Brad Schulz and Adam Machanic for this solution.
etliens
That's a pretty nifty concept, using XML conversion to handle the translation. I had considered something like that, and I may have to go over your suggestion more carefully to work out all that's going on. It doesn't quite do what I was hoping for, though, which is to account for any number of values. It does present an easily extensible query for a known number of values.
CodexArcanum
A: 

Hope it's not too late.

You can use the RANK function to get the position of each Item per PairNumber, and then use PIVOT:

SELECT PairNumber, [1], [2], [3]
FROM
(
    SELECT PairNumber, Item,
           RANK() OVER (PARTITION BY PairNumber ORDER BY EntryNumber) AS RANKing
    FROM tabla
) T
PIVOT
(
    MAX(Item)
    FOR RANKing IN ([1], [2], [3])
) AS PVT
Claudia