Hi everyone,

I have a table (AU_EMPLOYEE) with two columns named EmployeeID (int) and LastModifiedDate (DateTime). Along with those columns are others containing additional employee data. This is an audit table and every time an employee's data changes in some way a new row is added.

So it's quite likely a given employee will have multiple rows in this table. I would like to retrieve the most recent record for each employee as determined by the LastModifiedDate. What is a good approach to doing this? Nested query or something along those lines?

Thanks for the suggestions.

+3  A: 

Assuming you're on at least SQL Server 2005, you can use a CTE:

EDIT: As I've pointed out here and here in the past, be sure to test performance. The CTE version with MAX will often outperform a ROW_NUMBER based solution.

;with cteMaxDate as (
    select EmployeeID, max(LastModifiedDate) as MaxDate
        from AU_EMPLOYEE
        group by EmployeeID
)
select e.EmployeeID, e.Column1, e.Column2, ...
    from cteMaxDate md
        inner join AU_EMPLOYEE e
            on md.EmployeeID = e.EmployeeID
                and md.MaxDate = e.LastModifiedDate
Joe Stefanelli
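On the advice to test performance: a minimal sketch of how the two approaches could be compared side by side in SQL Server, using the session settings SET STATISTICS IO and SET STATISTICS TIME. Only the two columns named in the question are selected. This is an illustration rather than a benchmark harness, and the results will depend on the indexes and data volume of the actual table.

```sql
-- Report logical reads and CPU/elapsed time for each statement in this session.
set statistics io on;
set statistics time on;

-- Approach A: MAX per employee, joined back to the audit table.
;with cteMaxDate as (
    select EmployeeID, max(LastModifiedDate) as MaxDate
        from AU_EMPLOYEE
        group by EmployeeID
)
select e.EmployeeID, e.LastModifiedDate
    from cteMaxDate md
        inner join AU_EMPLOYEE e
            on md.EmployeeID = e.EmployeeID
                and md.MaxDate = e.LastModifiedDate;

-- Approach B: ROW_NUMBER per employee, newest row first.
;with ranking as (
    select EmployeeID, LastModifiedDate,
           row_number() over (partition by EmployeeID order by LastModifiedDate desc) as rn
        from AU_EMPLOYEE
)
select EmployeeID, LastModifiedDate
    from ranking
    where rn = 1;

set statistics io off;
set statistics time off;
```

The figures appear on the Messages tab in Management Studio. Comparing the actual execution plans of the two statements is a useful complement.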
With frequent modifications, you can get more than one row per employee.
AlexKuznetsov
Frequent being less than a second apart? I'm assuming you're referring to the rounding of [DateTime](http://msdn.microsoft.com/en-us/library/ms187819.aspx) precision to .000, .003, .007? If not, please explain further.
Joe Stefanelli
Yes, updates may happen "less than a second apart". We even had some collisions on DB2, where the precision is in microseconds, not 3 milliseconds.
AlexKuznetsov
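To illustrate the tie issue raised in these comments: if two audit rows for the same employee share the same LastModifiedDate, the MAX-and-join query returns both of them. One possible way around it, sketched below, is to rank the rows and break ties on a unique column. The column AuditID is hypothetical (the question does not mention one); substitute whatever unique, increasing column the table actually has.

```sql
-- AuditID is a hypothetical unique column (e.g. an identity) used only as a tie-breaker.
;with ranked as (
    select e.*,
           row_number() over (
               partition by e.EmployeeID
               order by e.LastModifiedDate desc,  -- newest change first
                        e.AuditID desc            -- decides between identical timestamps
           ) as rn
        from AU_EMPLOYEE e
)
select *
    from ranked
    where rn = 1;
```

Without a tie-breaker, ROW_NUMBER still returns exactly one row per employee, but which of the tied rows it picks is not deterministic.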
+6  A: 

You could use something like this to show the most recent row for each employee. This is a good use for the ROW_NUMBER function.

    with ranking as 
    (
        select *, ROW_NUMBER() over(partition by EmployeeID order by LastModifiedDate desc) as rn
        from AU_EMPLOYEE
    )
    select * from ranking where rn = 1
Mike Forman
+1, how I'd do it
KM
Thanks for all your replies, I appreciate it greatly.
larryq
+3  A: 
SELECT <your columns>
FROM (
    SELECT <your columns>,
        ROW_NUMBER() OVER(PARTITION BY EmployeeID ORDER BY LastModifiedDate DESC) AS rn
    FROM AU_EMPLOYEE
) AS t
WHERE rn = 1
AlexKuznetsov
+2  A: 

Chris Pebble's answer is correct; however, a more general solution is:

SELECT * FROM
(SELECT EmployeeID, LastModifiedDate
FROM AU_EMPLOYEE
WHERE LastModifiedDate<='X' ORDER BY LastModifiedDate Desc) A
GROUP BY A.EmployeeID

where X is the date you want to go back in time to.

mna
I think this will result in an error because `LastModifiedDate` is not part of the `group by` or used in an aggregate function. Not to mention that the derived table does not have an alias assigned to it.
KM
Added the alias, thanks. It does work, since you want to group by the employee ID and prune the dates that are later than 'X'. The descending ordering makes it so that the `LastModifiedDate` kept after the grouping is the largest one. It works, I use it.
mna
I'm afraid not. `Msg 1033, Level 15, State 1, Line 4 The ORDER BY clause is invalid in views, inline functions, derived tables, subqueries, and common table expressions, unless TOP or FOR XML is also specified.` After fixing that, I get `Msg 8120, Level 16, State 1, Line 1 Column 'A.LastModifiedDate' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.` In SQL Server, when using a GROUP BY, you cannot have an item in the select list unless it is in the GROUP BY or in an aggregate function like MAX(), MIN(), SUM(), etc.
KM
Didn't realize that the sql-server tag refers to MSSQL. It works with MySQL, but at this point that's irrelevant.
mna
MSSQL = Microsoft SQL = SQL Server
KM
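For the "as of date X" idea discussed in this last answer, a form that SQL Server accepts can be sketched with ROW_NUMBER instead of ORDER BY inside a derived table. The variable name @AsOf and its value are illustrative assumptions only.

```sql
-- @AsOf is a hypothetical cut-off; the value below is just an example.
declare @AsOf datetime;
set @AsOf = '20100101';

;with ranked as (
    select *,
           row_number() over (partition by EmployeeID order by LastModifiedDate desc) as rn
        from AU_EMPLOYEE
        where LastModifiedDate <= @AsOf   -- ignore changes made after the cut-off
)
select *
    from ranked
    where rn = 1;   -- latest row per employee as of @AsOf
```

The result includes the helper column rn. List the wanted columns explicitly in the outer select to leave it out.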