SQL Server 2008 - I want to concatenate four columns into delimited values, but I want them to be ordered alphabetically. Is this possible?

*UPDATE:* More info... This will run on approximately 700k-1M rows per day in an ETL job via SSIS. If there is an easier way to do it within SSIS (script task, etc.), please let me know. It could also be done within a stored procedure.

Also keep in mind that these columns can be NULL, which is causing problems with some of these solutions.

+1  A: 

This requirement might indicate a problem with your design. If the values in the four columns are semantically equivalent, you will likely find that putting the table into first normal form and refactoring the repeating columns out into a new table makes this sort of problem easier.

A monster CASE statement will probably be much more efficient, but here's one way without it:

WITH t AS
(
SELECT 1 AS rowid, 'cat' AS C1, 'apple' AS C2,
       'bear' AS C3, 'fox' AS C4 UNION ALL
SELECT 2 AS rowid, 'B' AS C1, 'D' AS C2, 'E' AS C3, 'G' AS C4
)

-- For each row, stack the four columns, sort them, and join them with commas
SELECT rowid, STUFF((SELECT ',' + C FROM
(
SELECT C1 AS C FROM t t2 WHERE t.rowid = t2.rowid
UNION ALL
SELECT C2 AS C FROM t t2 WHERE t.rowid = t2.rowid
UNION ALL
SELECT C3 AS C FROM t t2 WHERE t.rowid = t2.rowid
UNION ALL
SELECT C4 AS C FROM t t2 WHERE t.rowid = t2.rowid
) D
ORDER BY C
        FOR XML PATH('')),1,1,'') X   -- STUFF strips the leading comma
FROM t

Gives

rowid       X
1           apple,bear,cat,fox
2           B,D,E,G
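
One caveat with the FOR XML PATH('') trick: characters such as & and < come back XML-escaped (for example &amp;). A minimal sketch of a common workaround, assuming SQL Server 2005 or later, adds the TYPE directive and reads the text back with .value(). The sample row is hypothetical, chosen only because it contains such characters; the rest mirrors the query above.

```sql
-- Hypothetical sample row containing characters that XML serialization escapes.
WITH t AS
(
    SELECT 1 AS rowid, 'R&D' AS C1, 'apple' AS C2, 'a<b' AS C3, 'fox' AS C4
)
SELECT t.rowid,
       STUFF((SELECT ',' + D.C
              FROM (SELECT C1 AS C FROM t t2 WHERE t2.rowid = t.rowid
                    UNION ALL
                    SELECT C2 AS C FROM t t2 WHERE t2.rowid = t.rowid
                    UNION ALL
                    SELECT C3 AS C FROM t t2 WHERE t2.rowid = t.rowid
                    UNION ALL
                    SELECT C4 AS C FROM t t2 WHERE t2.rowid = t.rowid
                   ) AS D
              ORDER BY D.C
              -- TYPE returns an xml value; .value() converts it back to plain text
              FOR XML PATH(''), TYPE).value('.', 'varchar(max)'), 1, 1, '') AS X
FROM t;
```

Without TYPE and .value(), the same row would contain R&amp;D and a&lt;b in the result.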
Martin Smith
This works, I had just found a solution similar to this on here. I am looking for the best performer so I'll have to try the other solutions and see. However this does answer the question (I will mark all correct items once I've tested some).
elgabito
+1  A: 

Complex, but this will work:

   Select Case
         -- a sorts first; then order b, c and d (any NULL input yields NULL)
         When a<=b And a<=c And a<=d Then a + ',' +
             Case When b<=c And b<=d Then b + ',' +
                      Case When c<=d Then c + ',' + d Else d + ',' + c End
                  When c<=b And c<=d Then c + ',' +
                      Case When b<=d Then b + ',' + d Else d + ',' + b End
                  When d<=b And d<=c Then d + ',' +
                      Case When b<=c Then b + ',' + c Else c + ',' + b End End
         -- b sorts first; then order a, c and d
         When b<=a And b<=c And b<=d Then b + ',' +
             Case When a<=c And a<=d Then a + ',' +
                      Case When c<=d Then c + ',' + d Else d + ',' + c End
                  When c<=a And c<=d Then c + ',' +
                      Case When a<=d Then a + ',' + d Else d + ',' + a End
                  When d<=a And d<=c Then d + ',' +
                      Case When a<=c Then a + ',' + c Else c + ',' + a End End
          etc... 
    End

but I'd do this in code not in database...

EDIT: (as a computed column):

  Alter Table MyTable Add SortedABCD As
        Case 
             -- a sorts first; then order b, c and d
             When a<=b And a<=c And a<=d Then a + ',' +
                 Case When b<=c And b<=d Then b + ',' +
                          Case When c<=d Then c + ',' + d Else d + ',' + c End
                      When c<=b And c<=d Then c + ',' +
                          Case When b<=d Then b + ',' + d Else d + ',' + b End
                      When d<=b And d<=c Then d + ',' +
                          Case When b<=c Then b + ',' + c Else c + ',' + b End End
             -- b sorts first; then order a, c and d
             When b<=a And b<=c And b<=d Then b + ',' +
                 Case When a<=c And a<=d Then a + ',' +
                          Case When c<=d Then c + ',' + d Else d + ',' + c End
                      When c<=a And c<=d Then c + ',' +
                          Case When a<=d Then a + ',' + d Else d + ',' + a End
                      When d<=a And d<=c Then d + ',' +
                          Case When a<=c Then a + ',' + c Else c + ',' + a End End
              etc... 
        End
Charles Bretana
Although the hit might not be unreasonable in T-SQL if he used a persisted computed column?
Cruachan
Yes, that's correct; then the only hit would be when inserting or updating the row, and I don't think that would be significant. It's just a maintenance issue because it's so long and cumbersome. And what if you have to add a fifth column, or a sixth? With 4 columns you need 4 factorial = 24 orderings to cover. A fifth column would require 120, and a sixth (720) would not be practical.
Charles Bretana
@Cruachan - I am not familiar with computed columns so I looked them up. They look perfect, but they do not seem to work here. It might work with the CASE statement, but that is so huge it's giving me a headache, so I may come back to it Monday. With my current solution I got this error for the computed column in the CREATE TABLE statement: "Subqueries are not allowed in this context. Only scalar expressions are allowed."
elgabito
@Charles Bretana - It is 4 and only 4 due to standard industry regulations. Could it change? Yes. However very unlikely.
elgabito
@elgabito, sounds like you still have the keyword `Select` in your computed column definition... Remove that... it should work...
Charles Bretana
Good call......
elgabito
Couldn't verify that this works - I was getting incorrect results.
elgabito
+3  A: 

Unpivot the columns into rows, then order the rows. Use whatever row string concatenation technique you favor, like the FOR XML trick:

with cte as (
select *
from (values ('A' ,'C', 'B' ,'D')) as T (c1, c2, c3, c4))
select Value + ',' as [*]
from cte
unpivot (Value for c in (c1, c2, c3, c4)) as u
order by Value
for xml path('')
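
To apply the same unpivot-then-sort idea to every row of a table rather than to a single literal row, one option is a correlated subquery per row. The sketch below swaps UNPIVOT for a VALUES row constructor (available from SQL Server 2008) and filters NULL columns out explicitly. The table variable @MyTable and its column names are assumptions for illustration, not the asker's actual schema.

```sql
-- Hypothetical sample data; @MyTable and its columns are illustrative only.
DECLARE @MyTable TABLE
(
    rowid int PRIMARY KEY,
    c1 varchar(50) NULL,
    c2 varchar(50) NULL,
    c3 varchar(50) NULL,
    c4 varchar(50) NULL
);

INSERT INTO @MyTable (rowid, c1, c2, c3, c4)
VALUES (1, 'cat', 'apple', 'bear', 'fox'),
       (2, 'B', NULL, 'E', 'A');

SELECT t.rowid,
       -- Turn the four columns of the current row into rows, sort them,
       -- concatenate with commas, then strip the leading comma with STUFF.
       STUFF((SELECT ',' + v.Value
              FROM (VALUES (t.c1), (t.c2), (t.c3), (t.c4)) AS v (Value)
              WHERE v.Value IS NOT NULL      -- skip NULL columns
              ORDER BY v.Value
              FOR XML PATH('')), 1, 1, '') AS Sorted
FROM @MyTable AS t
ORDER BY t.rowid;
```

For the two sample rows this should return apple,bear,cat,fox and A,B,E. Because it relies on a subquery, this form fits a view, a stored procedure, or the ETL query itself, not a computed column.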
Remus Rusanu
@Remus: wow, that uses like 3 features I've never even heard of. nice!
Scott Stafford
+1: Nicely done!
OMG Ponies
Wasn't sure how to arrange this for my data structure - may work perfectly but I couldn't verify.
elgabito