ansaurus

Question

Finding Start and End Dates from Date Numbers Table (Date Durations)

Answer 1

A:

I haven't tested, but try:

select [Employee ID], [Project ID], start + ' to ' + end
from (
    select s.[Employee ID], s.[Project ID], min(d.Day) start, max(d.Day) end
    from [Employee Schedule] s
    inner join [Day Numbers] d on s.[Day ID] = d.[Day ID]
    group by s.[Employee ID], s.[Project ID]
) a

Edit: corrected some column names

For easier querying, I recommend you refactor the schema to:

[EmployeeSchedule]

ID 
EmployeeID 
ProjectID  
StartDate 
EndDate

and get rid of Day Numbers completely. That will make your queries simpler, more efficient, and will allow you to have records with NULL StartDates or EndDates if you wish.

RedFilter 2009-06-19 19:35:33

Does this deal with the 2 ranges for projectid = 2?

gbn 2009-06-19 19:36:55

@gbn. Yes it does.

Triptych 2009-06-19 19:38:23

That won't work. There may be gaps between the min and max that shouldn't be included in the duration.

hypoxide 2009-06-19 19:38:51

End of year will be interesting!

n8wrl 2009-06-19 19:39:08

@n8wrl - Luckily there's a numbers table so it won't matter. :)

hypoxide 2009-06-19 19:40:32

That's true, it will not handle multiple entries for same employee/project. Can I suggest you refactor your schema to have StartDay and EndDay columns in your [Employee Schedule] table? A query can be written to work with the existing schema, but it is more work and less efficient than if you change the schema.

RedFilter 2009-06-19 19:45:25

A schema refactor would hinder my ability to find gaps and overlaps in employee scheduling, which is why I'm using a numbers table for the dates.

hypoxide 2009-06-19 19:48:20

How so? You could do: /* gaps */select s.* from [Employee Schedule] s inner join [Day Numbers] d on d.[Day ID] not between s.StartDay and s.EndDay/* overlaps */select * from [Employee Schedule] s inner join [Employee Schedule] s2 on (s2.StartDay between s.StartDay and s.EndDay or s2.EndDay between s.StartDay and s.EndDay or s.StartDay between s2.StartDay and s2.EndDay or s.EndDay between s2.StartDay and s2.EndDay) and s.ID <> s2.ID

RedFilter 2009-06-19 20:36:02

Answer 2

+1 A:

Lets make a view to make things easier:

create view EmployeeProjectDates
as
select
    e.[Employee ID], e.[Project ID], d.Day
from
    [Employee Scchedule] e
    join [Day Numbers] d on e.[Day Id] = d.ID

You can do a query like this to get all the start dates:

select
    one.[Employee ID], one.[Project ID], one.Day as StartDate
from
    EmployeeProjectDays one
    left join EmployeeProjectDays two on one.[Employee ID] = two.[Employee ID] and one.[Project ID] = two.[Project ID] and one.Day = DATEADD(DAY, 1, two.Day)
where
    two.Day is null

And then do a similar query to get the end dates and match them up. I think that something like this would get you both.

select
    one.[Employee ID], one.[Project ID], one.Day as StartDate,
    (select
     min(two_end.Day)
 from
     EmployeeProjectDays one_end
     join EmployeeProjectDays two_end on one_end.[Employee ID] = two_end.[Employee ID] and one_end.[Project ID] = two_end.[Project ID] and one.Day = DATEADD(DAY, 1, two.Day)
     where
     one_end.Day is null
     and two_end.Day > one.Day) as EndDate
from
    EmployeeProjectDays one
    left join EmployeeProjectDays two on one.[Employee ID] = two.[Employee ID] and one.[Project ID] = two.[Project ID] and one.Day = DATEADD(DAY, 1, two.Day)
where
    two.Day is null

I haven't tested any of these queries, but something similar should work. I had to use a similar query before we implemented something in our application code to find the start and end dates.

Adam Hughes 2009-06-19 20:09:31

I tried your code and I got null values for all end dates. I don't quite understand the end-date nested join well enough to debug it.

hypoxide 2009-06-19 20:46:53

The subquery for the End Date is basically the same as the query for the start date, just reversed. It should give you a list of all the end dates, but for each row you only want the first end date that is after the start date.The subquery probably also needs to make sure the Employee ID and Project ID match the values from the main query, which I neglected.

Adam Hughes 2009-06-22 17:13:13

Answer 3

+3 A:

Assuming the Day IDs are always sequential for a partial solution...

select *
  from employee_schedule a                    
 where not exists( select *                          
                     from employee_schedule b        
                    where a.employeeid = b.employeeid
                      and a.projectid  = b.projectid 
                      and (a.dayid - 1) = b.dayid )

lists the start day IDs:

 ID      EMPLOYEEID       PROJECTID           DAYID 
 1              64               2             168 
 5              64               1             169 
 9              64               2             182 



select *
  from employee_schedule a                   
 where not exists( select *                         
                     from employee_schedule b       
                    where a.employeeid = b.employeei
                      and a.projectid  = b.projectid
                      and (a.dayid + 1) = b.dayid )

lists the end day IDs:

  ID      EMPLOYEEID       PROJECTID           DAYID 
  4              64               2             171 
  8              64               1             172 
 11              64               2             184

Paul Morgan 2009-06-19 20:39:01

even if they are not sequential, the query can be rewritten using actual dates with DATEADD function and the same comparison

van 2009-06-19 20:45:02

Answer 4

+1 A:

This one works with oracle, and starting from that it should be possible in SQL Server as well. (including testscript)

create table schedule (id number, employee_id number, project_id number, day_id number);

insert into schedule (id, employee_id, project_id, day_id)
values(1,64,2,168);
insert into schedule (id, employee_id, project_id, day_id)
values(2,64,2,169);
insert into schedule (id, employee_id, project_id, day_id)
values(3,64,2,170);
insert into schedule (id, employee_id, project_id, day_id)
values(4,64,2,171);
insert into schedule (id, employee_id, project_id, day_id)
values(5,64,1,169);
insert into schedule (id, employee_id, project_id, day_id)
values(6,64,1,170);
insert into schedule (id, employee_id, project_id, day_id)
values(7,64,1,171);
insert into schedule (id, employee_id, project_id, day_id)
values(8,64,1,172);
insert into schedule (id, employee_id, project_id, day_id)
values(9,64,2,182);
insert into schedule (id, employee_id, project_id, day_id)
values(10,64,2,183);
insert into schedule (id, employee_id, project_id, day_id)
values(11,64,2,184);
insert into schedule (id, employee_id, project_id, day_id)
values(11,65,3,184);

select * 
FROM (
    select  
        employee_id,
        project_id,
        first_day,
        nvl(last_day, 
            lead(last_day) over (
                partition by employee_id, project_id 
                order by nvl(first_day, last_day)
            )
        ) last_day
    from (
        select -- this identifies start and end rows of an interval
            employee_id,
            project_id,
            decode (day_id - prev_day, 1, null, day_id) first_day, -- uses day_id, if prev_day is not really the previous day, i.e. a gap or null
            decode (day_id - next_day, -1, null, day_id) last_day
        from (
            select -- this select adds columns for the previous and next day, in order to identify the boundaries of intervals 
                employee_id, 
                project_id, 
                day_id, 
                lead(day_id) over ( 
                    partition by employee_id, project_id 
                    order by day_id
                ) next_day,
                lag(day_id) over ( 
                    partition by employee_id, project_id 
                    order by day_id
                ) prev_day
            from schedule
        )
    )
    where first_day is not null 
    or last_day is not null-- just filter the rows, that represent start or end dates
) 
where first_day is not null

produces this output:

64  1 169 172
64  2 168 171
64  2 182 184
65  3 184 184

Jens Schauder 2009-06-19 21:29:58

That is a beautiful solution. If only there were lag and lead functions in T-SQL.

hypoxide 2009-06-22 20:43:36

ansaurus

tags:

views:

answers:

Finding Start and End Dates from Date Numbers Table (Date Durations)

related questions