views:

1066

answers:

4

Hi all,

I have a couple of tables which look like this Table 1

user_id   |  name
-------------------------   
x111      |   Smith, James
x112      |   Smith, Jane

etc..

Table 2

id    |   code    |    date       |  incident_code    | user_id
-----------------------------------------------------------------
1     |    102008 |   10/20/2008  |    1              | x111
2     |    113008 |   11/30/2008  |    3              | x111
3     |    102008 |   10/20/2008  |    2              | x112
4     |    113008 |   11/30/2008  |    5              | x112

What i'd like to display is something like this

user_id     |    user_name    |   INCIDENT IN OCT 2008   | INCIDENT IN NOV 2008
------------------------------------------------------------------------------ 
x111        |    Smith, John  |   1                      | 3
x112        |    Smith, Jane  |   2                      | 5

etc..

The incident_code would be replaced by the actual description of the incident which is located in another table, but i thought i'd see how this would work first.

Some of the column headers would be static while others would be created based on the date. Does anyone one know how i can do this using sql server 2005? Some examples would be very helpful.

Thanks in advance

A: 

You want to Pivot http://msdn.microsoft.com/en-us/library/ms177410.aspx

Tetraneutron
A: 

This sounds like a reporting task. Reporting, often referred to from a database perspective as OLAP, Online Aanalytical Processing, tends to differ quite frequently from "traditional" database access, OLTP (Online Transaction Processing) in that it is quite often made up of large aggregations of data spanning greater periods of time. Quite frequently, the kind of aggregation your looking for.

Use of a Pivot as Tetraneutron suggested will be sufficient for smaller data sets. However, as the volume of data you need to report on grows, you may need something more advanced. OLAP is provided for by SQL Server Analysis Services (SSAS), available in 2005 and 2008. Using SSAS you can create multidimensional data repositories that pre-aggregate data from either an OLTP database directly, or from an intermediary data warehouse database. Multidimensional data (usually referred to as cubes), provide a much faster way to access the kind of data you can get from a Pivot, without interfering with the performance of your standard transaction processing in your OLTP database.

If you have more than a small amount of data you need to report on, I recommend you check out SQL Server Analysis Services 2005, OLAP, Cubes, and MDX (Multidimensional Extensions for T-SQL.) There is a larger learnig curve to set up an OLAP Cube, but once it is set up, the benefits of having one can be huge if you have significant reporting needs.

jrista
A: 

A query like this would work:

select
    u.User_id,
    u.Name,
    Okt2008Sum = sum(case when i.date between 
        '2008-10-01' and '2008-11-01' then 1 else 0 end),
    Nov2008Sum = sum(case when i.date between 
        '2008-11-01' and '2008-12-01'then 1 else 0 end)
from #incidents i
inner join #users u on i.user_id = u.user_id
group by u.user_id, u.name

Depending on your client and how often you have to run it, you can generate this query. In SQL this would look like:

create table #months (
 MonthName varchar(25),
 StartDate datetime
)

insert into #months values ('Okt2008','2008-10-01')
insert into #months values ('Nov2008','2008-11-01')

declare @query varchar(8000)
select @query = 'select u.User_id, u.Name '

select @query = @query + ', ' + MonthName + 
    ' = sum(case when i.date between ''' + cast(StartDate as varchar) + 
    ''' and ''' + cast(dateadd(m,1,StartDate) as varchar) + 
    ''' then 1 else 0 end) '
from #Months

select @query = @query + '
    from #incidents i
    inner join #users u on i.user_id = u.user_id
    group by u.user_id, u.name'

exec (@query)
Andomar
+2  A: 

Here's a solution which generates and runs the dynamic SQL with a PIVOT:

DECLARE @pivot_list AS VARCHAR(MAX)

--
;
WITH    cols
          AS ( SELECT DISTINCT
                        'INCIDENT IN ' + LEFT(UPPER(CONVERT(VARCHAR, [date], 107)),
                                              3) + ' '
                        + SUBSTRING(UPPER(CONVERT(VARCHAR, [date], 107)), 9, 4) AS col
               FROM     so926209_2
             )
    SELECT  @pivot_list = COALESCE(@pivot_list + ', ', '') + '[' + col + ']'
    FROM    cols

--
DECLARE @template AS VARCHAR(MAX)
SET @template = 'WITH incidents AS (
SELECT  [user_id],
        incident_code,
        ''INCIDENT IN '' + LEFT(UPPER(CONVERT(VARCHAR, [date], 107)), 3)
        + '' '' + SUBSTRING(UPPER(CONVERT(VARCHAR, [date], 107)), 9, 4) AS col
FROM    so926209_2
)
,results AS (
SELECT * FROM incidents PIVOT (MAX(incident_code) FOR col IN ({@pivot_list})) AS pvt
)
SELECT results.[user_id]
    ,so926209_1.[name]
    ,{@select_list}
FROM results INNER JOIN so926209_1 ON so926209_1.[user_id] = results.[user_id]
'

DECLARE @sql AS VARCHAR(MAX)
SET @sql = REPLACE(REPLACE(@template, '{@pivot_list}', @pivot_list), '{@select_list}', @pivot_list)

--PRINT @sql
EXEC (@sql)

Where so926209_1, so926209_2 are your table 1 and table 2

Note that if you have multiple incidents in a month for the same person, your example doesn't show how you want that handled. This example only takes the last incident in the month.

Cade Roux
thanks cade. It shouldn't be that hard to change that so that it includes a date as well. There won't be more than one incident for each date.Do you happen to know how slow this might be for a million users in the user table and around 12 million incidents in the incident table?
zSysop
It wouldn't be the kind of thing I would run every second! I've got some pretty big PIVOT and UNPIVOT operations in some backend processing we do and we'll go from a few million rows and PIVOT up to a few hundred thousand or go from a few million rows and UNPIVOT to tens of millions, but some of these operations take minutes (8-way 32GB, SQL 2005 Enterprise). The string building before the PIVOT to make the headers exactly the way you specified is not the most efficient way to get them grouped - I would use an abbreviation like YYYYMM in an int which will be far fewer bytes.
Cade Roux
And then your {@pivot_list} will look a lot different from your {@select_list}, which will be generated a little differently to contain things like [200801] AS [INCIDENT IN JAN 2008]. The DISTINCT will be faster, too. And the column order will not be alphabetical. I didn't bother to do that - I think this makes the general PIVOT technique clear with code generation, either just in time or regularly generated in advance.
Cade Roux
Thanks for all your help!
zSysop