views:

29

answers:

2

I have a table that looks like this:

Id    PageId    Key    Content    LastUpdated
---------------------------------------------

 1     12       key1    content1   02-21-2010
 2     12       key1    content2   02-25-2010
 3     12       key2    content1   02-21-2010
 4     12       key2    content2   02-25-2010
 5     12       key3    content1   02-21-2010

What I need to be able to do is query out a distinct list by PageId ordered by LastUpdated so my result would look like this:

2     12       key1    content2
4     12       key2    content2
5     12       key3    content1

I am using Linq-2-Sql but if I needed to I could use an SP. Any help on how to accomplish this would be great, thanks!

+3  A: 

Here's one idea using just tsql that you could put in an SP. (Assumes that LastUpdated is a datetime field and not a string in the format you've listed and assumes at least SQL 2005 to use the CTE. Either of these assumptions can be overcome if they're incorrect.):

;with cteMaxUpdated as (
    select Key, max(LastUpdated) as MaxUpdated
        from YourTable
        group by Key
)
select yt.Id, yt.PageId, yt.Key, yt.Content
    from cteMaxUpdated mu
        inner join YourTable yt
            on mu.Key = yt.Key
                and mu.MaxUpdated = yt.LastUpdated
Joe Stefanelli
@Joe- Is there any performance difference between your solution and Frank's?
Paul
I don't think so, but can't be 100% sure. Try each and compare execution plans, IO stats, etc.
Joe Stefanelli
I did just remember that in [another question](http://stackoverflow.com/questions/3354290/are-these-two-queries-the-same-group-by-vs-distinct/3354449#3354449) using MAX outperformed ROWNUMBER. Again, test each for your specific case.
Joe Stefanelli
+1  A: 

In SQL, this would work as follows:

WITH i as (
  SELECT Id, PageId, Key, Content, dense_rank() over(PARTITION BY Key ORDER BY LastUpdated desc) as pos
    FROM myTable
)
SELECT Id, PageId, Key, Content
  FROM i
 WHERE pos = 1

I assume here that you meant "distinct list by Key" instead of "distinct list by PageId".

You should be aware that in the case that the date per key is not unique, this would return all the records for the maximum date of a key. If you just want to have one of them, you could add additional fields to the ORDER BY.

And if you do not care which one of several non-unique records per key to return, you could use ROW_NUMBER() instead of DENSE_RANK().

Frank
@Frank - This works great. Is there any performance difference between your solution and Joe's?
Paul
@Paul I would assume so as there are no joins. But that depends on many factors, so it would be best to test your specific situation.
Frank