I have a table as follows

> RowID   SessionID       EventID  RequestedURL   Date
> 1       m2jqyc45g       1        Start          24/03/2010 19:52
> 2       m2jqyc45g       1        ProductPage    24/03/2010 19:52
> 3       m2jqyc45g       28       BuyNow         24/03/2010 19:52
> 4       tjmvb55dc4dg    1        ProductPage    24/03/2010 19:52
> 5       tjmvb55dc4dg    1        BuyNow         24/03/2010 19:56
> 6       tjmvb55dc4dg    1        Clicked OK     24/03/2010 19:56
> 7       m2jqyc45g       1        Clicked OK     24/03/2010 19:56
> 8       tjmvb55dc4dg    28       Help           24/03/2010 19:56
> 9       m2jqyc45g       1        Home           24/03/2010 19:56
> 6       m2jqyc45g       1        ProductPage    24/03/2010 19:56
> 7       tjmvb55dc4dg    1        BuyNow         24/03/2010 19:56
> 8       tjmvb55dc4dg    28       Clicked OK     24/03/2010 19:56
> 9       tjmvb55dc4dg    1        Home           24/03/2010 19:56

How do I write a query that counts how many times a BuyNow row is immediately followed by a Clicked OK row within a particular session? For example, the dataset above shows 2 users on the site, each in a separate session. The returned count should be 3.

I'm running SQL Server 2008.

EDIT: Just to clarify what I mean by consecutive rows and why the count should be 3. In session m2jqyc45g, BuyNow and Clicked OK occur consecutively within the session, but by the time they are written to the table, rows from other concurrent sessions are interleaved and break the sequence. If you order the result set by SessionID and then by Date, you will get them one after the other. Also, there was a typo in RowID 7: ProductPage should be BuyNow. Sorry about that.

+3  A: 
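-- Count each BuyNow row whose next row (by RowID) in the same session is Clicked OK
SELECT COUNT(*)
FROM yourTable y
WHERE y.RequestedURL = 'BuyNow'
    AND EXISTS (SELECT *
                FROM yourTable x
                -- next row in the same session: smallest RowID above the current one
                WHERE x.RowID = (SELECT MIN(z.RowID)
                                 FROM yourTable z
                                 WHERE z.SessionID = y.SessionID
                                   AND z.RowID > y.RowID)
                  AND x.RequestedURL = 'Clicked OK')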

This should do it. There might be a faster way, but I am not sure what else you can do. Also, you will probably want to put indexes on RowID and SessionID.
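
As a rough sketch of that index suggestion: the statement below assumes the placeholder table name yourTable from the query, and the index name is hypothetical. A single composite index keyed on SessionID and then RowID is one way to support the correlated MIN(RowID) lookup; whether it actually helps depends on your data and existing indexes.

```sql
-- Hypothetical index name; yourTable is the placeholder table name used in the query.
-- SessionID first, then RowID, so the "next RowID within this session" lookup can seek.
CREATE NONCLUSTERED INDEX IX_yourTable_SessionID_RowID
    ON yourTable (SessionID, RowID)
    INCLUDE (RequestedURL);  -- lets the RequestedURL check be answered from the index
```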

Your other option is to use a Common Table Expression to generate row numbers ordered by session and time. The code below should do that for you.

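-- Number the rows in session/time order so consecutive events get adjacent numbers
WITH temp AS (
SELECT
    SessionID,
    RequestedURL,
    [Date],
    -- RowID breaks ties between rows that share the same [Date] value
    RN = ROW_NUMBER() OVER (ORDER BY SessionID, [Date], RowID)
FROM yourTable)
-- ORDER BY is not allowed inside a CTE without TOP, so the ordering lives in ROW_NUMBER()
SELECT
    COUNT(*)
FROM temp x
    JOIN temp y ON x.SessionID = y.SessionID
        AND x.RN = y.RN - 1
WHERE x.RequestedURL = 'BuyNow'
    AND y.RequestedURL = 'Clicked OK'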
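
To try the CTE idea without touching a real table, here is a self-contained sketch that loads the sample rows into a table variable and numbers the rows per session with PARTITION BY. The table variable @Events, the column sizes, and the output column name are all made up for illustration, and it assumes the second run of RowIDs 6 to 9 in the sample was meant to be 10 to 13, so that RowID can break ties between rows sharing the same minute.

```sql
-- Hypothetical table variable holding the sample data from the question.
-- Assumption: the repeated RowIDs 6-9 are renumbered 10-13 so RowID is unique.
DECLARE @Events TABLE (
    RowID        INT,
    SessionID    VARCHAR(20),
    EventID      INT,
    RequestedURL VARCHAR(50),
    [Date]       DATETIME
);

INSERT INTO @Events (RowID, SessionID, EventID, RequestedURL, [Date]) VALUES
    ( 1, 'm2jqyc45g',     1, 'Start',       '20100324 19:52'),
    ( 2, 'm2jqyc45g',     1, 'ProductPage', '20100324 19:52'),
    ( 3, 'm2jqyc45g',    28, 'BuyNow',      '20100324 19:52'),
    ( 4, 'tjmvb55dc4dg',  1, 'ProductPage', '20100324 19:52'),
    ( 5, 'tjmvb55dc4dg',  1, 'BuyNow',      '20100324 19:56'),
    ( 6, 'tjmvb55dc4dg',  1, 'Clicked OK',  '20100324 19:56'),
    ( 7, 'm2jqyc45g',     1, 'Clicked OK',  '20100324 19:56'),
    ( 8, 'tjmvb55dc4dg', 28, 'Help',        '20100324 19:56'),
    ( 9, 'm2jqyc45g',     1, 'Home',        '20100324 19:56'),
    (10, 'm2jqyc45g',     1, 'ProductPage', '20100324 19:56'),
    (11, 'tjmvb55dc4dg',  1, 'BuyNow',      '20100324 19:56'),
    (12, 'tjmvb55dc4dg', 28, 'Clicked OK',  '20100324 19:56'),
    (13, 'tjmvb55dc4dg',  1, 'Home',        '20100324 19:56');

-- Number rows separately within each session, in time order (RowID breaks ties).
WITH ordered AS (
    SELECT
        SessionID,
        RequestedURL,
        RN = ROW_NUMBER() OVER (PARTITION BY SessionID ORDER BY [Date], RowID)
    FROM @Events
)
-- Pair every row with the row that directly follows it in the same session.
SELECT COUNT(*) AS BuyNowThenClickedOK
FROM ordered x
    JOIN ordered y ON y.SessionID = x.SessionID
        AND y.RN = x.RN + 1
WHERE x.RequestedURL = 'BuyNow'
    AND y.RequestedURL = 'Clicked OK';
```

Under those assumptions the pairs are RowIDs 3 and 7 in session m2jqyc45g, plus 5 and 6 and 11 and 12 in session tjmvb55dc4dg, which gives the count of 3 the question asks for.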
RandomBen
I don't think this will meet the requirement - see session `m2jqyc45g` in the sample data
Ed Harper
Agreed, I fixed it. It should do exactly what he is looking for now.
RandomBen
Doesn't work, I'm afraid. What's up with the " 1 = ..... "?
Nai
You are correct, it doesn't work. I fixed it by adding the AND z.RowID > y.RowID condition. There is definitely another way to do this with Common Table Expressions. I will look into the syntax for that later and get back to you.
RandomBen
CTE added, check that out. Also, I figured out the problem with my earlier code: the 'Clicked' value has a space in it ('Clicked OK'). That could have caused your issue.
RandomBen
I prefer your CTE method :) Sweet thanks!
Nai