views:

611

answers:

3

DB: SQL Server 2005

We have a table that has data in this manner:

Project              Year        Jan                   Feb                   Mar                   Apr                   May                   Jun                   Jul                   Aug                   Sep                   Oct                   Nov                   Dec
-------------------- ----------- --------------------- --------------------- --------------------- --------------------- --------------------- --------------------- --------------------- --------------------- --------------------- --------------------- --------------------- ---------------------
11-11079             2008        0.00                  0.00                  0.00                  0.00                  0.00                  0.00                  0.00                  0.00                  0.00                  0.00                  0.00                  75244.90
11-11079             2009        466.00                0.00                  0.00                  0.00                  0.00                  0.00                  0.00                  0.00                  0.00                  0.00                  0.00                  0.00
11-11079             2010        855.00                0.00                  0.00                  0.00                  0.00                  0.00                  0.00                  0.00                  0.00                  0.00                  0.00                  0.00  
01-11052             2009        56131.00              0.00                  36962.00              -61596.00             2428.00               84.00                 0.00                  0.00                  0.00                  0.00                  0.00                  0.00

Someone would like the data to be displayed as one row for the entire project. The columns would be dynamic dependant on how many years it goes into the future. An example would be:

Project        Jan-2009     Feb-2009     Mar-2009     Apr-2009... Dec-2009     Jan-2010
-------------- ------------ ------------ ------------ ----------- ------------ ---------
11-11079       466.00       0.00         0.00         0.00        0.00         855.00    
01-11052       56131.00     0.00         36962.00     -61596.00   2428.00      0.00

I read of many examples where the date is populated in one column for each entry but I haven't found any cases where the months are the column name and the year is in the row.

Dynamic SQL with a pivot table?
Or some pretty wide scale manipulation using SQL, temp tables, joins and unions?
Any thoughts on using the SSIS pivot table feature?

A: 

I think you could do it with a nested while loop and some dynamic SQL. This would be a slow solution if you cannot save the final table or if you have to regenerate all the columns every month. However, if it is just additive then it might not be bad. Anyways, this is how I would do it:

  1. Out loop chooses the oldest year.
  2. Inner loop chooses the first month.
  3. Inside Inner loop - Add a column with the name of - to your table.
  4. Inside Inner loop - Update table with all of the information for the new - column with dynamic SQL
  5. Iterate through the inner loop for each month
  6. Iterate through the outer loop for each year.
RandomBen
-1. I'm sorry, RandomBen, but a loop just isn't supportable at all, when clear and easy set-based solutions are available for unpivoting and then pivoting again. Performance with loops will be absolutely terrible.
Emtucifor
A: 

Your data is already pivoted, but needs to be pivoted at a different level. I think the best way to handle this is to unpivot it first, then handle the correct pivot level second.

Step 1: Unpivot

You can use the SQL 2005 UNPIVOT command, or use a CROSS JOIN technique (I find the CROSS JOIN syntax easier to remember). Here are examples of both. Note I left out months in the middle to keep things simple. Just add them in.

-- CROSS JOIN method (also works in SQL 2000)
SELECT
   P.Project,
   Mo =
      DateAdd(mm,
         X.MonthNum,
         DateAdd(yy, P.[Year] - 1900, '19000101')
      ),
   Amount = 
      CASE X.MonthNum
         WHEN 0 THEN Jan
         WHEN 1 THEN Feb
         WHEN 11 THEN Dec
      END
FROM
   ProjectData P
   CROSS JOIN (
      SELECT 0 UNION ALL SELECT 1 UNION ALL SELECT 11
   ) X (MonthNum)

Each row is repeated 12 times, then a CASE statement pulls out only one month for each row, leaving the data nicely unpivoted.

-- UNPIVOT method
SELECT
    P.Project,
    Mo =
       DateAdd(mm,
          Convert(int, P.MonthNum),
          DateAdd(yy, P.[Year] - 1900, '19000101')
       ),
    P.Amount
FROM
   (
      SELECT Project, [Year], [0] = Jan, [1] = Feb, [11] = Dec
      FROM ProjectData
   ) X UNPIVOT (Amount FOR MonthNum IN ([0], [1], [11])) P

DROP TABLE ProjectData

Neither method is a clear performance winner all the time. Sometimes one works better than the other (depending on the data being pivoted). The UNPIVOT method uses a Filter in the execution plan that the CROSS JOIN does not.

Step 2: Pivot Again

Now, how to use the unpivoted data. You didn't say how your someone will be consuming this, but since you'll need to put the data in an output file of some kind, I propose using SSRS (Sql Server Reporting Services), which comes with SQL Server 2005 for no extra charge.

Just use the Matrix report object to pivot one of the queries above. This object happily determines the data values to make into column labels at report run-time and sounds like exactly what you need. If you add a column that formats the date exactly how you like, then you can order by the Mo column, but use the new expression as the column label.

SSRS also has a wide variety of formats and scheduling options available. For example, you can have it email an Excel file or save a web page to a file share.

Please let me know if I've left anything out.

For anyone who would like to see the code above in action, here's some creation script for you:

USE tempdb

CREATE TABLE ProjectData (
    Project varchar(10),
    [Year] int,
    Jan decimal(15, 2),
    Feb decimal(15, 2),
    Dec decimal(15, 2)
)

SET NOCOUNT ON

INSERT ProjectData VALUES ('11-11079', 2008, 0.0, 0.0, 75244.90)
INSERT ProjectData VALUES ('11-11079', 2009, 466.0, 0.0, 0.0)
INSERT ProjectData VALUES ('11-11079', 2010, 855.0, 0.0, 0.0)
INSERT ProjectData VALUES ('01-11052', 2009, 56131.0, 0.0, 0.0)
Emtucifor
Could I get a comment on what in my post deserved a downvote, please, and how my post is not useful?
Emtucifor
after posting my original question, i spent an afternoon staring at the data and it jumped out at me that the data is already pivoted. The moment I ran a quick unpivot on it, the light went on and I was able to build it the way I wanted. So your technique is the way I ended up going!
Reggie Dunlop
A: 

I wrote a stored proc named pivot_query that can help out with this, source is here, examples with raw data here.

With your data:

create table ProjectData
   (
   Project                      varchar(20),
   [Year]                       Integer,
   Jan                          decimal(12,2),
   Feb                          decimal(12,2),
   Mar                          decimal(12,2),
   Apr                          decimal(12,2),
   May                          decimal(12,2),
   Jun                          decimal(12,2),
   Jul                          decimal(12,2),
   Aug                          decimal(12,2),
   Sep                          decimal(12,2),
   Oct                          decimal(12,2),
   Nov                          decimal(12,2),
   Dec                          decimal(12,2)
   );

insert into ProjectData values ('11-11079',2008, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 75244.90);
insert into ProjectData values ('11-11079',2009, 466.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00);
insert into ProjectData values ('11-11079',2010, 855.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00) ;
insert into ProjectData values ('01-11052',2009, 56131.00, 0.00, 36962.00, -61596.00, 2428.00, 84.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00);

declare @mySQL varchar(MAX)

set @mySQL = 'select * from ProjectData'

exec pivot_query @mySQL, 'Project', 'Year', 'max(Jan) Jan,max(Feb) Feb,max(Mar) Mar,max(Apr) Apr,max(Jun) Jun,max(Jul) Jul,max(Aug) Aug,max(Sep) Sep,max(Oct) Oct,max(Nov) Nov,max(Dec) Dec'

Results:
Project              2008_Jan     2008_Feb     2008_Mar     2008_Apr     2008_Jun     2008_Jul     2008_Aug     2008_Sep     2008_Oct     2008_Nov     2008_Dec     2009_Jan     2009_Feb     2009_Mar     2009_Apr     2009_Jun     2009_Jul     2009_Aug     2009_Sep     2009_Oct     2009_Nov     2009_Dec     2010_Jan     2010_Feb     2010_Mar     2010_Apr     2010_Jun     2010_Jul     2010_Aug     2010_Sep     2010_Oct     2010_Nov     2010_Dec
-------------------- ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------ ------------
01-11052             NULL         NULL         NULL         NULL         NULL         NULL         NULL         NULL         NULL         NULL         NULL         56131.00     .00          36962.00     -61596.00    84.00        .00          .00          .00          .00          .00          .00          NULL         NULL         NULL         NULL         NULL         NULL         NULL         NULL         NULL         NULL         NULL
11-11079             .00          .00          .00          .00          .00          .00          .00          .00          .00          .00          75244.90     466.00       .00          .00          .00          .00          .00          .00          .00          .00          .00          .00          855.00       .00          .00          .00          .00          .00          .00          .00          .00          .00          .00

Not exact but pretty darn close. :-)

Ron Savage
I forgot to add the month of May, but SO won't let me edit the post to put it in. :-)
Ron Savage
Holy crud, Ron. Why would you ever use that huge amount of procedural code when the problem can be solved with set-based code and existing SQL Server tools?
Emtucifor
It's generic and works for any query, without having to know the values of your pivot columns. Saved us metric tons of developer time across many "pivoted" style reports. I didn't write it just for this. :-)
Ron Savage
@Ron Savage: Ron, I have code like that too, where I invested a ton of time into some more general-purpose procedural code to solve a repeated problem. However... the SSRS Matrix report control is ALSO "generic and works for any query, without having to know the values of the pivot columns." Since you say you're doing it for reports, I can't imagine why anyone would want to use procedural code when SSRS can do the job out of the box.
Emtucifor
Not all of our applications have access or need to use the report server. We do have reports using the matrix as well, and we have found that for pivoting large result sets this proc is much faster - as only the summarized results are sent to the report server rather than all the raw data to be pivoted and summarized by the control.I'm not sure why you're calling it "procedural" either, it basically just formats the data and builds the PIVOT statement internally.
Ron Savage