ansaurus

Question

T-SQL Grouping rows from the MAX length columns in different rows (?)

Answer 1

+2 A:

SELECT A.akey, 
 (
  SELECT TOP 1 T1.text1
  FROM test1 T1
  WHERE T1.akey=A.akey AND LEN(T1.TEXT1) = MAX(LEN(A.text1))
 ) AS TEXT1,
 (
  SELECT TOP 1 T2.text2
  FROM test1 T2
  WHERE T2.akey=A.akey AND LEN(T2.TEXT2) = MAX(LEN(A.text2))
 ) AS TEXT2,
 (
  SELECT TOP 1 T3.text3
  FROM test1 T3
  WHERE T3.akey=A.akey AND LEN(T3.TEXT3) = MAX(LEN(A.text3))
 ) AS TEXT3
FROM TEST1 AS A
GROUP BY A.akey

I just realized you said you have 32 columns. I don't see a good way to do that, unless UNPIVOT would allow you to create separate rows (akey, textn) for each text* column.

Edit: I may not have a chance to finish this today, but UNPIVOT looks useful:

;
WITH COLUMNS AS
(
 SELECT akey, [Column], ColumnValue
 FROM
  (
   SELECT X.Akey, X.Text1, X.Text2, X.Text3
   FROM test1 X
  ) AS p
 UNPIVOT (ColumnValue FOR [Column] IN (Text1, Text2, Text3))
 AS UNPVT
)
SELECT *
FROM COLUMNS
ORDER BY akey,[Column], LEN(ColumnValue)

John Saunders 2009-06-26 10:42:26

that unpivot does look prommising, i havent used pivot/unpivot much. i'm assuming we then need to PIVOT the data back once we've selected the longest strings?

Ben O 2009-06-26 12:19:46

Yes, I think you'd need PIVOT.

John Saunders 2009-06-26 12:58:30

Answer 2

+1 A:

This seems really ugly, but at least works (on SQL2K) and doesn't need subqueries:

select test1.akey, A.text1, B.text2, C.text3
from test1
inner join test1 A on A.akey = test1.akey 
inner join test1 B on B.akey = test1.akey 
inner join test1 C on C.akey = test1.akey 
group by test1.akey, A.text1, B.text2, C.text3
having len(a.text1) = max(len(test1.text1))
   and len(B.text2) = max(len(test1.text2))
   and len(C.text3) = max(len(test1.text3))
order by test1.akey

I must admit that it needs an inner join for each column and I wonder how this could impact on the 32 columns x 13millions record table... I try both this approach and the one based one subqueries and looked at executions plans: I'ld actually be curious to know

Turro 2009-06-26 10:43:00

thanks, this works great for the example table but i'm reluctant to do ~32 table scans on 13m rows.

Ben O 2009-06-26 10:57:19

.. i'll try both the methods to see which runs better, for both our curiosity.

Ben O 2009-06-26 11:16:50

results are the subquery method by john is masively more efficient according to the execution plans.

Ben O 2009-06-26 12:12:43

Well, I'm not astonished, but I don't have such a massive data to test, so I was curios to know: thanks for taking the time to compare them and share your result (an upvote for your patience)

Turro 2009-06-26 12:30:03

ansaurus

tags:

views:

answers:

T-SQL Grouping rows from the MAX length columns in different rows (?)

related questions