views:

75

answers:

4

Here is the scenario: I have a table that records the user_id, the module_id, and the date/time the module was viewed.

eg.

Table: Log
------------------------------
User_ID  Module_ID   Date
------------------------------
1       red         2001-01-01
1       green       2001-01-02
1       blue        2001-01-03
2       green       2001-01-04
2       blue        2001-01-05
1       red         2001-01-06
1       blue        2001-01-07
3       blue        2001-01-08
3       green       2001-01-09
3       red         2001-01-10
3       green       2001-01-11
4       white       2001-01-12

I need to get a result set that has the user_id as the 1st column, and then a column for each module. The row data is then the user_id and the count of the number of times that user viewed each module.

eg.

---------------------------------
User_ID  red green   blue    white
---------------------------------
1       2   1       2       0
2       0   1       1       0
3       1   2       1       0
4       0   0       0       1

I was initially thinking that I could do this with PIVOT, but no dice; the database is a converted SQL Server 2000 DB that is running in SQL Server 2005. I'm not able to change the compatibility level, so pivot is out.

The other catch is that the modules will vary, and it isn't feasible to re-write the query every time a module is added or removed. This means that I can't hard-code in the modules because I don't know in advance which will and will not be installed.

How can I accomplish this?

A: 
SELECT User_ID, MAX(red) AS red, MAX(green) AS green, MAX(blue) AS blue,
  MAX(white) AS white FROM
((SELECT User_ID, COUNT(Module_ID) AS red, 0 AS green, 0 AS blue,
  0 AS white
FROM log
WHERE Module_ID = 'red'
GROUP BY User_ID)
UNION
(SELECT User_ID, 0 AS red, COUNT(Module_ID) AS green, 0 AS blue,
  0 AS white
FROM log
WHERE Module_ID = 'green'
GROUP BY User_ID)
UNION
(SELECT User_ID, 0 AS red, 0 AS green, COUNT(Module_ID) AS blue,
  0 AS white
FROM log
WHERE Module_ID = 'blue'
GROUP BY User_ID)
UNION
(SELECT User_ID, 0 AS red, 0 AS green, 0 AS blue,
  COUNT(Module_ID) AS white
FROM log
WHERE Module_ID = 'white'
GROUP BY User_ID))
GROUP BY User_ID
ORDER BY User_ID
Marcus Adams
The installed modules are not fixed, so a discrete enumeration of the modules is not feasible. Is there a dynamic way to do this?
cdeszaq
+5  A: 

PIVOT can be simulated with CASE and GROUP BY

select
    [user_id],
    sum(case when [Module_ID] = 'red' then 1 else 0 end) as red,
    sum(case when [Module_ID] = 'green' then 1 else 0 end) as green,
    sum(case when [Module_ID] = 'blue' then 1 else 0 end) as blue,
    sum(case when [Module_ID] = 'white' then 1 else 0 end) as white
from [log]
group by
    [user_id]

Of course this doesn't work if the modules vary (as stated in the question) but then, PIVOT has the same problem.

Dynamically generating some sql overcomes this problem but this solution smells a bit!

declare @sql nvarchar(max)

set @sql = '
select
    [user_id],'

select @sql = @sql + '
    sum(case when [Module_ID] = ''' + replace([Module_ID], '''','''''') + ''' then 1 else 0 end) as [' + replace([Module_ID], '''','') + '],'
from (select distinct [Module_ID] from [log]) as moduleids

set @sql = substring(@sql,1,len(@sql)-1) + '
from [log]
group by
    [user_id]
'
print @sql
exec sp_executesql @sql

Note that this may be vulnerable to sql-injection if the module id data can't be trusted.

Daniel Renshaw
Is there a way to make that block of sum/case statements dynamic? The installed modules is not fixed, so a discrete enumeration of the modules is not feasible.
cdeszaq
Spotted that just before you posted. PIVOT has the same problem - you have to hard-code the columns you want up front.
Daniel Renshaw
@Daniel Renshaw - There is a dynamic way to get the columns, illustrated here: http://stackoverflow.com/questions/2344590/how-do-i-transform-rows-into-columns-in-sql-server-2005/2344668#2344668 but that doesn't work for me due to not being able to use pivot. Is there a way to blend the two?
cdeszaq
Yeah, can dynamically gen sql as in my edited answer, but I don't like it!
Daniel Renshaw
A: 

I believe characteristic functions are what you want.

Carl Manaster
A: 

Using MySQL I did this:

  1. Copied your data into Log_Table.sql

  2. create table Log (User_ID mediumint, Module_ID CHAR(5), dte CHAR(10));

  3. load data infile 'Log_Table.sql' INTO TABLE Log FIELDS TERMINATED BY ',';

  4. Pivot:

select User_ID AS 'USER', sum(case Module_ID WHEN 'red' then 1 else 0 END) AS 'red',

sum(case Module_ID WHEN 'green' then 1 else 0 END) AS 'green',

sum(case Module_ID WHEN 'blue' then 1 else 0 END) AS 'blue',

sum(case Module_ID WHEN 'white' then 1 else 0 END) AS 'white'

from Log

Group By User_ID;

+------+------+-------+------+-------+

| USER | red | green | blue | white |

+------+------+-------+------+-------+

| 1 | 2 | 1 | 2 | 0 |

| 2 | 0 | 1 | 1 | 0 |

| 3 | 1 | 2 | 1 | 0 |

| 4 | 0 | 0 | 0 | 1 |

+------+------+-------+------+-------+

4 rows in set (0.00 sec)

Hope this helps.

okaygo