views:

4711

answers:

4

I have a couple of tables which are used to log user activity for an application. The tables looks something like this (pseudo code from memory, may not be syntactically correct):

create table activity (
  sessionid uniqueidentifier not null,
  created smalldatetime not null default getutcdate()
);

create table activity_details (
  sessionid uniqueidentifier not null,
  activity_description varchar(100) not null,
  created smalldatetime not null default getutcdate()
);

My goal is to populate a summary table for reporting purposes that looks something like this:

create table activity_summary (
  sessionid uniqueidentifier not null,
  first_activity_desc varchar(100) not null,
  last_activity_desc varchar(100) not null
);

First and last activity descriptions would be determined chronologically. My initial thought is to update the summary table like so:

truncate table activity_summary;

insert into activity_summary (sessionid)
select sessionid from activity;

update table activity_summary set
  first_activity_desc = (select top 1 activity_desc from activity_detail where sessionid = as.sessionid order by created asc),
  last_activity_summary = (select top 1 activity_desc from activity_detail where sessionid = as.sessionid order by created desc)
from activity_summary as;

However, this seems incredibly verbose and unnecessary to me. I'm just not sure how to shrink it down. My gut feel is that I could do it somehow all within the insert statement, but I'm stumped. Any suggestions?

+1  A: 

There's probably more efficient ways to do this as well, but this is closest to your original:

truncate table activity_summary;

insert into activity_summary (sessionid, first_activity_desc, last_activity_summary)
select a.sessionid
,(select top 1 ad.activity_desc from activity_detail AS ad where ad.sessionid = a.sessionid order by ad.created asc) AS first_activity_desc
,(select top 1 ad.activity_desc from activity_detail AS ad where ad.sessionid = a.sessionid order by ad.created desc) AS last_activity_summary
from activity AS a;

Something like this might be more efficient:

truncate table activity_summary;

WITH firsts AS (
    SELECT ad.sessionid
        ,ad.activity_desc
        ,ROW_NUMBER() OVER (ORDER BY ad.created ASC) as RowNumber
    FROM activity_detail AS ad
)
,lasts AS (
    SELECT ad.sessionid
        ,ad.activity_desc
        ,ROW_NUMBER() OVER (ORDER BY ad.created DESC) as RowNumber
    FROM activity_detail AS ad
)
insert into activity_summary (sessionid, first_activity_desc, last_activity_summary)
select a.sessionid
    ,firsts.activity_desc
    ,lasts.activity_desc
from activity AS a
INNER JOIN firsts ON firsts.sessionid = a.sessionid AND firsts.RowNumber = 1
INNER JOIN lasts ON lasts.sessionid = a.sessionid AND lasts.RowNumber = 1
Cade Roux
Thanks. I don't know why I didn't think of that - I guess I just hate wrapping select statements as column values like that too. In either case, the number of potential reads generated by the query is enormous - that's what I'm looking to reduce.
Chris
I will do one without the nested queries for you.
Cade Roux
+1  A: 
insert into activity_summary
    (sessionid, first_activity_desc, last_activity_desc)
select
    agg.sessionid,
    adf.activity_description,
    adl.activity_description
from
    (SELECT
         sessionid, MIN(created) as firstcreated, MAX(created) as lastcreated
    from
         activity_detail group by sessionid
    ) agg
    JOIN
    activity_details adf ON agg.sessionid = adf.sessionid AND agg.firstcreated = adf.created
    JOIN
    activity_details adl ON agg.sessionid = adl.sessionid AND agg.lastcreated = adl.created
gbn
Almost what I want. I see a big hole though in that it is entirely possible that two activity descriptions may have the same timestamp.
Chris
The same issue applies to the top and subclause approach.Unless there is another column (eg identity) or the created is datetime (still not foolproof), then there is no information to tie-break equal top and tail rows
gbn
You're right. Perhaps I should add an identity column to the detail table and min/max off that.
Chris
There is no tiebreaker, but you will not get the extra rows.
Cade Roux
ah yes, good point to note...Min/max/top all give one row. One of equal top or equal tail rows will be chosen (effectively random)
gbn
A: 

Roughly,

INSERT etc.

SELECT a.sessionid, d1.activity_description, d2.activity_description

FROM activity a

JOIN detail d1 ON a.sessionid = d1.sessionid JOIN detail d2 ON a.sessionid = d2.sessionid

WHERE NOT EXISTS
(SELECT 1 FROM detail WHERE sessionid = a.sessionid AND created < d1.created)

AND NOT EXISTS
(SELECT 1 FROM detail WHERE sessionid = a.sessionid AND created > d2.created)

le dorfier
A: 

Or,

INSERT etc.

(SELECT description FROM detail d1 WHERE d1.sessionid = a.sessionid
AND NOT EXISTS (SELECT 1 FROM detail WHERE created < d1.created)) AS desc1,

(SELECT description FROM detail d2 WHERE d2.sessionid = a.sessionid
AND NOT EXISTS (SELECT 1 FROM detail d2 WHERE created > d1.created)) AS desc2

FROM activity a

(I prefer this one myself.)

le dorfier