views:

423

answers:

1

I am looking for the most performant way to turn rows into columns. I have a requirement to output the contents of the db (not actual schema below, but concept is similar) in both fixed width and delimited formats. The below FOR XML PATH query gives me the result I want, but when dealing with anything other than small amounts of data, can take awhile.

 select orderid
   ,REPLACE((  SELECT '  ' + CAST(ProductId as varchar)
       FROM _details d
       WHERE d.OrderId = o.OrderId
       ORDER BY d.OrderId,d.DetailId
       FOR XML PATH('')
   ),' ','') as Products
 from _orders o

I've looked at pivot but most of the examples I have found are aggregating information. I just want to combine the child rows and tack them onto the parent.

I should also point out I don't need to deal with the column names either since the output of the child rows will either be a fixed width string or a delimited string.

For example, given the following tables:

OrderId     CustomerId
----------- -----------
1           1
2           2
3           3

DetailId    OrderId     ProductId
----------- ----------- -----------
1           1           100
2           1           158
3           1           234
4           2           125
5           3           101
6           3           105
7           3           212
8           3           250

for an order I need to output:

orderid     Products
----------- -----------------------
1             100  158  234
2             125
3             101  105  212  250

or

orderid     Products
----------- -----------------------
1           100|158|234
2           125
3           101|105|212|250

Thoughts or suggestions? I am using SQL Server 2k5.

Example Setup:

   create table _orders (
  OrderId int identity(1,1) primary key nonclustered
  ,CustomerId int
 )

 create table _details (
  DetailId int identity(1,1) primary key nonclustered
  ,OrderId int 
  ,ProductId int
 )

 insert into _orders (CustomerId)
 select 1
 union select 2
 union select 3

 insert into _details (OrderId,ProductId)
 select 1,100
 union select 1,158
 union select 1,234
 union select 2,125
 union select 3,105
 union select 3,101
 union select 3,212
 union select 3,250

 CREATE CLUSTERED INDEX IX_CL__orders on _orders(OrderId)
 CREATE NONCLUSTERED INDEX IX_NCL__orders on _orders(OrderId)
 INCLUDE (CustomerId)

 CREATE CLUSTERED INDEX IX_CL_details on _details(OrderId)
 CREATE NONCLUSTERED INDEX IX_NCL_details on _details(OrderId)
 INCLUDE (DetailId,ProductId)

using FOR XML PATH:

 select orderid
   ,REPLACE((  SELECT '  ' + CAST(ProductId as varchar)
       FROM _details d
       WHERE d.OrderId = o.OrderId
       ORDER BY d.OrderId,d.DetailId
       FOR XML PATH('')
   ),' ','') as Products
 from _orders o

which outputs what I want, however is very slow for large amounts of data. One of the child tables is over 2 million rows, pushing the processing time out to ~ 4 hours.

orderid     Products
----------- -----------------------
1             100  158  234
2             125
3             101  105  212  250
A: 

By definition a PIVOT is going to HAVE to aggregate in some way, because you can have multiple rows with the same pivot key columns. If you don't have multiple rows, that's fine - but you still need to choose an aggregate operator (MIN, MAX, SUM).

But the FOR XML PATH construct is better for the multiple-row-values to single-string-column "pivot" operation.

I'm not sure why yours is not performing well. What indexes do you have on the tables? What does your execution plan look like?

Cade Roux
There is a clustered index on the primary key column, and a nonclustered index on the primary key column, with includes for every other column in the table
ajberry
@ajberry - on the details table do you have a key on OrderId, DetailId which includes ProductId?
Cade Roux
@cade - I added the indexes that would normally be on the table under the Example setup above. I need to recheck the indexes on the actual tables, there may some tuning there.
ajberry
@ajberry - what does the execution plan look like and you might want to move the DetailId into that non-clustered index instead of being included (since you have that order by). Also, your clustered index is not unique on details (I would either cluster on detailId or orderid, detailid), so it will add a uniquifying data to the clustered index.
Cade Roux
@cade - the sp that was running this also ran a few indexing routines, which turned out to be the ones causing the slow down. I changed the indexes on the table as suggested above and did see an improvement. thanks!
ajberry