tags:

views:

43

answers:

3

I have a table with many IDs and many dates associated with each ID, and even a few IDs with no date. For each ID and date combination, I want to select the ID, date, and the next largest date also associated with that same ID, or null as next date if none exists.

Sample Table:

ID      Date
1       5/1/10
1       6/1/10
1       7/1/10
2       6/15/10
3       8/15/10
3       8/15/10
4       4/1/10
4       4/15/10
4       

Desired Output:

ID       Date       Next_Date
1        5/1/10     6/1/10
1        6/1/10     7/1/10
1        7/1/10     
2        6/15/10    
3        8/15/10    
3        8/15/10    
4        4/1/10     4/15/10
4        4/15/10    
+1  A: 
SELECT id, date, (SELECT date FROM table where date>t1.date order by date limit 1) FROM table t1
Андрей Костенко
+1  A: 

If your db is oracle, you can use lead() and lag() functions.

SELECT id, date, 
LEAD(date, 1, 0) OVER (PARTITION BY ID ORDER BY Date DESC NULLS LAST) NEXT_DATE,
FROM Your_table
ORDER BY ID;
Bharat
+2  A: 
SELECT
    mytable.id,
    mytable.date,
    (
        SELECT
            MIN(mytablemin.date)
        FROM mytable AS mytablemin
        WHERE mytablemin.date > mytable.date
    ) AS NextDate
FROM mytable

This has been tested on SQL Server 2008 R2 (but it should work on other DBMSs) and produces the following output:

id          date                    NextDate
----------- ----------------------- -----------------------
1           2010-05-01 00:00:00.000 2010-06-01 00:00:00.000
1           2010-06-01 00:00:00.000 2010-06-15 00:00:00.000
1           2010-07-01 00:00:00.000 2010-08-15 00:00:00.000
2           2010-06-15 00:00:00.000 2010-07-01 00:00:00.000
3           2010-08-15 00:00:00.000 NULL
3           2010-08-15 00:00:00.000 NULL
4           2010-04-01 00:00:00.000 2010-04-15 00:00:00.000
4           2010-04-15 00:00:00.000 2010-05-01 00:00:00.000
4           NULL                    NULL

Update 1: For those that are interested, I've compared the performance of the two variants in SQL Server 2008 R2 (one uses MIN aggregate and the other uses TOP 1 with an ORDER BY):

Without an index on the date column, the MIN version had a cost of 0.0187916 and the TOP/ORDER BY version had a cost of 0.115073 so the MIN version was "better".

With an index on the date column, they performed identically.

Note that this was testing with just these 9 records so the results could be (very) spurious...

Update 2: The results hold for 10,000 uniformly distributed random records. The TOP/ORDER BY query takes so long to run at 100,000 records I had to cancel it and give up.

Daniel Renshaw
it is better to use ordering than aggregative functions. It you have a big table especially
Андрей Костенко
@Andrii: I can't speak for other DBs but on SQL Server it shouldn't make a difference. If there's an index it's clever enough to know it can just read the first row; if there's no index it has to scan the whole table either way. In fact, with an ORDER BY it might be slower since it would have to do an O(n*lg(n)) sort instead of an O(n) scan.
Daniel Renshaw
This is a mssql db with the query being passed through via access which means the LIMIT query wouldn't work anyways. The min query above worked perfectly with the slight addition of adding mytable.id = mytablemin.id in the WHERE statement. The query is a little sluggish, but there isn't an index on the date field I'm using right now.Thanks all for the help.
John