How can I select rows that are distinct on Forename and Surname and, where there are duplicates, keep the one with the latest SomeDate, falling back to the highest Id if the dates are also equal?

For example, this input:

| Id | Forename | Surname | SomeDate   |
----------------------------------------
| 1  | Bill     | Power   | 2011-01-01 |
| 2  | James    | Joyce   | 2011-02-01 |
| 3  | Peter    | Lennon  | 2011-03-01 |
| 4  | John     | Sellers | 2011-04-01 |
| 5  | James    | Joyce   | 2011-05-01 |
| 6  | Peter    | Lennon  | 2011-03-01 |

Results in:

| Id | Forename | Surname | SomeDate   |
----------------------------------------
| 1  | Bill     | Power   | 2011-01-01 |
| 4  | John     | Sellers | 2011-04-01 |
| 5  | James    | Joyce   | 2011-05-01 |
| 6  | Peter    | Lennon  | 2011-03-01 |

How could I achieve this in

  1. T-SQL
  2. From a DataTable using C#
+2  A: 

Assuming SQL Server 2005+, use:

SELECT x.id,
       x.forename,
       x.surname,
       x.somedate
  FROM (SELECT t.id,
               t.forename,
               t.surname,
               t.somedate,
               ROW_NUMBER() OVER (PARTITION BY t.forename, t.surname 
                                      ORDER BY t.somedate DESC, t.id DESC) AS rank
          FROM TABLE t) x
WHERE x.rank = 1

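A riskier approach, which can pair the Id from one row with the SomeDate from another because each MAX is evaluated independently, would be: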

  SELECT MAX(t.id) AS id,
         t.forename,
         t.surname,
         MAX(t.somedate) AS somedate
    FROM TABLE t
GROUP BY t.forename, t.surname
OMG Ponies
+1  A: 

I'd tend to use subselects for the non-grouped values.

SELECT Forename, Surname,
    (SELECT TOP 1 Id FROM myTable mt WHERE mt.Forename = m.Forename AND mt.Surname = m.Surname
     ORDER BY mt.SomeDate DESC, mt.Id DESC) AS Id,
    (SELECT TOP 1 SomeDate FROM myTable mt WHERE mt.Forename = m.Forename AND mt.Surname = m.Surname
     ORDER BY mt.SomeDate DESC) AS SomeDate
FROM myTable m
GROUP BY Forename, Surname

Or you can filter it in the WHERE clause:

SELECT Id, Forename, Surname, SomeDate
FROM myTable m
WHERE m.Id = (SELECT TOP 1 Id FROM myTable mt WHERE mt.Forename = m.Forename AND mt.Surname = m.Surname
    ORDER BY mt.SomeDate DESC, mt.Id DESC)

I'm afraid neither will be terribly efficient, but index tuning will ameliorate that if needed.

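For a DataTable, you'd do essentially the same thing with LINQ (the column types here are assumed to be int, string and DateTime):

// Needs System.Linq and System.Data (AsEnumerable and Field<T> are DataTable/DataRow extension methods)
var rows = dataTable.AsEnumerable();
var recs = from record in rows
           where record.Field<int>("Id") ==
               (from rec in rows
                where rec.Field<string>("Forename") == record.Field<string>("Forename")
                   && rec.Field<string>("Surname") == record.Field<string>("Surname")
                orderby rec.Field<DateTime>("SomeDate") descending, rec.Field<int>("Id") descending
                select rec.Field<int>("Id")).First()
           select record;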
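
If the nested lookup per row turns out to be slow, grouping first is another option. The following is only a sketch, not a tested production method: it assumes the columns are typed int, string and DateTime, and the names `LatestPerName` and `BuildSample` are made up for illustration. It groups on (Forename, Surname) and keeps the row with the latest SomeDate, then the highest Id, in each group.

```csharp
using System;
using System.Data;
using System.Linq;

public static class Program
{
    // Hypothetical helper: one row per (Forename, Surname),
    // preferring the latest SomeDate, then the highest Id.
    public static DataRow[] LatestPerName(DataTable table)
    {
        return table.AsEnumerable()
            // Anonymous types compare by value, so they work as a composite key.
            .GroupBy(r => new
            {
                Forename = r.Field<string>("Forename"),
                Surname = r.Field<string>("Surname")
            })
            .Select(g => g.OrderByDescending(r => r.Field<DateTime>("SomeDate"))
                          .ThenByDescending(r => r.Field<int>("Id"))
                          .First())
            .OrderBy(r => r.Field<int>("Id"))
            .ToArray();
    }

    // Hypothetical helper: builds the sample data from the question.
    public static DataTable BuildSample()
    {
        var table = new DataTable();
        table.Columns.Add("Id", typeof(int));
        table.Columns.Add("Forename", typeof(string));
        table.Columns.Add("Surname", typeof(string));
        table.Columns.Add("SomeDate", typeof(DateTime));
        table.Rows.Add(1, "Bill", "Power", new DateTime(2011, 1, 1));
        table.Rows.Add(2, "James", "Joyce", new DateTime(2011, 2, 1));
        table.Rows.Add(3, "Peter", "Lennon", new DateTime(2011, 3, 1));
        table.Rows.Add(4, "John", "Sellers", new DateTime(2011, 4, 1));
        table.Rows.Add(5, "James", "Joyce", new DateTime(2011, 5, 1));
        table.Rows.Add(6, "Peter", "Lennon", new DateTime(2011, 3, 1));
        return table;
    }

    public static void Main()
    {
        foreach (var row in LatestPerName(BuildSample()))
        {
            Console.WriteLine("{0} {1} {2} {3:yyyy-MM-dd}",
                row["Id"], row["Forename"], row["Surname"], row["SomeDate"]);
        }
    }
}
```

With the sample data from the question this should leave the rows with Id 1, 4, 5 and 6. The final `OrderBy` on Id is only there to make the output order predictable; drop it if order doesn't matter.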
Jacob Proffitt