views:

16

answers:

0

I have a MASTER object table and an ACTION_HISTORY table that chronicles when a user has interacted with the object.

I need to know when the last action was for each object. I am stuck between two approaches.

Here are some simplified test tables

CREATE TABLE MasterTable (ID INT IDENTITY(1,1) NOT NULL, someData varchar(20) NOT NULL
 CONSTRAINT [PK_MASTER] PRIMARY KEY CLUSTERED ([ID] ASC))

CREATE TABLE HistoryTable (ID int identity(1,1), MasterTable_ID int, DT DATETIME
    CONSTRAINT [PK_HISTORY] PRIMARY KEY CLUSTERED ([ID] ASC))

Add test data

--insert test data
TRUNCATE TABLE MasterTable
TRUNCATE TABLE HistoryTable
GO

DECLARE @baseDate AS SMALLDATETIME
SET @baseDate = '1/1/1950'

DECLARE @ctr AS INT SET @ctr = 1
DECLARE @hCtr AS INT
WHILE @ctr < 10000
BEGIN
    INSERT INTO MasterTable VALUES ( 'blah blah blah')
    SET @hCtr = 1
    --only every third master has a history
    IF @ctr%3 = 0
    BEGIN
        WHILE @hCtr < 12
        BEGIN
            INSERT INTO HistoryTable VALUES  (@ctr, DATEADD(d, @ctr, @baseDate))
            SET @hCtr = @hCtr + 1
        END
    END

    SET @ctr = @ctr+1
END

The first approach joins the history table to itself to get the most recent

DBCC FREEPROCCACHE
DBCC DROPCLEANBUFFERS

-- using JOINS
SELECT m.ID, m.someData, ISNULL(h.DT, GETDATE()) LAST_HISTORY
FROM MasterTable m 
 LEFT JOIN HistoryTable h ON m.ID=h.MasterTable_ID
 LEFT JOIN HistoryTable h2 ON h.MasterTable_ID = h2.MasterTable_ID AND h.ID < h2.ID
 WHERE h2.ID IS null

The second uses ROW_NUMBER

DBCC FREEPROCCACHE
DBCC DROPCLEANBUFFERS
--  using ROW_NUMBER
select ID, someData, ISNULL(DT, GETDATE()) LAST_HISTORY
from(
    select m.ID, m.someData, h.DT, ROW_NUMBER() over(partition by m.ID order by h.DT desc) selector
    from MasterTable as m
    left outer join HistoryTable as h
    on m.ID = h.MasterTable_ID) as x
where x.selector=1

For reasons beyond this post, I am having challenges testing these against my actual data. I'm basically trying to understand whether the sort on the history date in the second approach is better than the cost of the join on the first. I admit that I more often than not find myself squinting at Execution Plans with the sound of crickets in my head. The join approach seems to be more common, but I don't know if that's just convention.