views:

205

answers:

4

How to create a sql query to return the number of occurrences of sick days.

Where one occurrence is defined as consecutive days out sick. It would also be considered one occurrence if the a weekend fell within that time period or a holiday fell in that time period.

Examples of what is consider one occurrence:

  • A person is out sick Monday and Tuesday.
  • A person is out sick on a Friday and the following Monday.
  • A person is out sick Thursday, Friday is a Holiday, and Monday they are sick.

For these examples it would be considered three occurrences.

There is a table that contains the sick days (date) (one row for every sick day) and a table that contains the observed holiday dates.

To simplify the tables and fields:

tbl_emp
   empid
   empname

tbl_sick
   empid
   sickdate

tbl_holiday
   holiday
A: 

If i had a little more time, i would try to get a query out for you. But this really reminds me of a sql Challenge that is close enough to your issue.

Different Approaches

Explanation of Approaches

You are probably going to have to end up with a form of recursion or a tricky joining method to figure out if a sequence of days are what you prescribe. Hopes this helps you in your quest for the query.

Zielyn
+1  A: 

Logically, this should work, but it probably isn't the most elegant solution.

After creating a bit of sample data, I gather all of the 'suspect' days including known sick days, known holidays and weekends that are adjacent to either a sick day or a holiday. Then I identify the start and end of each group of consecutive days and for each employee count the start dates of ranges that contain a sick day.

/***** SAMPLE DATA *****/
declare @sick table (
    empid int,
    sick datetime
)

declare @holiday table (
    holiday datetime
)

/* Example 1 */
insert into @sick values (1,'2010/01/04'); /* Mon */
insert into @sick values (1,'2010/01/05'); /* Tue */

/* Example 2 */
insert into @sick values (1,'2010/01/15'); /* Fri */
insert into @sick values (1,'2010/01/18'); /* Mon */

/* Example 3 */
insert into @sick values (1,'2010/01/21'); /* Thu */
insert into @holiday values('2010/01/22'); /* Fri */
insert into @sick values (1,'2010/01/25'); /* Mon */

/* Extra Examples */
insert into @sick values (3,'2010/01/08'); 
insert into @sick values (2,'2010/01/08');
insert into @holiday values ('2010/01/11');

insert into @sick values (3,'2010/01/20');
insert into @sick values (3,'2010/01/21');

/* Extra Holiday */
insert into @holiday values ('2010/02/05');
/***** SAMPLE DATA *****/

/* First a CTE to gather all of the 'suspect' days together
   including known sick days, known holidays and weekends
   that are adjacent to either a sick day or a holiday */
with suspectdays as (

    /* Start with all Sick days */
    select
        empid,
        sick dt,
        'sick' [type]
    from
        @sick

    /* Add all Saturdays following a sick Friday */
    union
    select
        empid,
        DATEADD(day,1,sick) dt,
        'weekend' [type]
    from
        @sick
    where
        (DATEPART(WEEKDAY,sick) + @@DATEFIRST) % 7 = 6

    /* Add all Sundays following a sick Friday */
    union
    select
        empid,
        DATEADD(day,2,sick) dt,
        'weekend' [type]
    from
        @sick
    where
        (DATEPART(WEEKDAY,sick) + @@DATEFIRST) % 7 = 6

    /* Add all Sundays preceding a sick Monday */
    union
    select
        empid,
        DATEADD(day,-1,sick) dt,
        'weekend' [type]
    from
        @sick
    where
        (DATEPART(WEEKDAY,sick) + @@DATEFIRST) % 7 = 2

    /* Add all Saturdays preceding a sick Monday */
    union
    select
        empid,
        DATEADD(day,-2,sick) dt,
        'weekend' [type]
    from
        @sick
    where
        (DATEPART(WEEKDAY,sick) + @@DATEFIRST) % 7 = 2

    /* Add all Holidays */
    union
    select
        empid,
        holiday dt,
        'holiday' [type]
    from
        @holiday,
        (select distinct empid from @sick) as a

    /* Add all Saturdays following a holiday Friday */
    union
    select
        empid,
        DATEADD(day,1,holiday) dt,
        'weekend' [type]
    from
        @holiday,
        (select distinct empid from @sick) as a
    where
        (DATEPART(WEEKDAY,holiday) + @@DATEFIRST) % 7 = 6

    /* Add all Sundays following a holiday Friday */
    union
    select
        empid,
        DATEADD(day,2,holiday) dt,
        'weekend' [type]
    from
        @holiday,
        (select distinct empid from @sick) as a
    where
        (DATEPART(WEEKDAY,holiday) + @@DATEFIRST) % 7 = 6

    /* Add all Sundays preceding a holiday Monday */
    union
    select
        empid,
        DATEADD(day,-1,holiday) dt,
        'weekend' [type]
    from
        @holiday,
        (select distinct empid from @sick) as a
    where
        (DATEPART(WEEKDAY,holiday) + @@DATEFIRST) % 7 = 2

    /* Add all Saturdays preceding a holiday Monday */
    union
    select
        empid,
        DATEADD(day,-2,holiday) dt,
        'weekend' [type]
    from
        @holiday,
        (select distinct empid from @sick) as a
    where
        (DATEPART(WEEKDAY,holiday) + @@DATEFIRST) % 7 = 2

),

/* Now a CTE to identify the start and end of each 
   group of consecutive days for each employee */
suspectranges as (
    select distinct
        sd.empid,
        (   select
                max(dt)
            from
                suspectdays
            where
                empid = sd.empid and
                DATEADD(day,-1,dt) not in (select dt from suspectdays where empid = sd.empid) and
                dt <= sd.dt
        ) rangeStart,
        (   select
                min(dt)
            from
                suspectdays
            where
                empid = sd.empid and
                DATEADD(day,1,dt) not in (select dt from suspectdays where empid = sd.empid) and
                dt >= sd.dt
        ) rangeEnd
    from
        suspectdays sd
)

/* For each employee count the start dates of ranges that contain a sick day */
select
    empid,
    COUNT(rangeStart) SickIncidents
from
    suspectranges sr
where
    exists (select * from suspectdays where dt between sr.rangeStart and sr.rangeEnd and empid=sr.empid and type='sick')
group by
    empid

For the sample data I created, here's the result.

empid       SickIncidents
----------- -------------
1           3
2           1
3           2
JC
A: 

Here's my attempt. Quite a lot of this query would lend itself to precalculation though rather than doing it all every time. Particularly having a table of working days with an incrementing sequence number.

--Base Tables
WITH tbl_holiday AS
(
SELECT CAST(2010-05-27 AS DATETIME) AS holidate UNION ALL
SELECT CAST(2010-05-28 AS DATETIME)
),
tbl_emp AS
(
SELECT 1 AS empid, 'Bob' AS empname UNION ALL
SELECT 2 AS empid, 'Dave' AS empname
),
tbl_sick AS
(
SELECT 1 AS empid, '2010-04-01' as sickdate UNION ALL
SELECT 1, '2010-04-02' UNION ALL
SELECT 1, '2010-04-09' 
),

--Calculated Tables
tbl_WorkingDays AS
(
SELECT dateadd(day,number,'2010-01-01') AS workdate
FROM master.dbo.spt_values 
WHERE Type='P' AND number <= DATEDIFF(day,'2010-01-01',getdate())
AND (@@datefirst + datepart(weekday, dateadd(day,number,'2010-01-01'))) % 7 not in (0, 1)
EXCEPT
SELECT * FROM tbl_holiday
),
tbl_NumberedWorkingDays AS
(
SELECT ROW_NUMBER() OVER (ORDER BY workdate) AS N,
workdate
FROM tbl_WorkingDays 
)



SELECT e.empid, e.empname, COUNT(nwd.N) AS Absences
FROM tbl_emp e
LEFT JOIN tbl_sick s ON e.empid = s.empid
LEFT JOIN tbl_NumberedWorkingDays nwd ON nwd.workdate = s.SickDate
WHERE (s.empid IS NULL) OR (nwd.N = 1) OR 
NOT EXISTS (SELECT * FROM tbl_sick s2 WHERE s.empid = s2.empid AND sickdate = (SELECT 
                   workdate FROM tbl_NumberedWorkingDays WHERE N = nwd.N-1))
GROUP BY e.empid, e.empname
Martin Smith
A: 

First, the simplest way to do analysis against dates is to have a calendar table which is a sequential list of dates. The advantage is that it makes it easy to indicate which days are not workdays (for whatever reason) and which days are holidays.

Create Table dbo.Calendar   (
                            [Date] Date not null primary key clustered
                            , IsHoliday bit not null default(0)
                            , IsWorkday bit not null default(1)
                            , Constraint CK_Calendar Check ( Case 
                                                                When IsHoliday = 1 And IsWorkDay <> 1 Then 0
                                                                Else 1
                                                                End = 1 )
                            )

The only additional restriction I made here is that if something is a holiday, then it must also be marked as not being a workday. Now let's fill our calendar table:

;With Numbers As
    (
    Select Row_Number() Over( Order By C1.object_id ) As Value
    From sys.columns As C1
        Cross Join sys.columns As C2
    )
    , CalendarItems As
    (
    Select N.Value, DateAdd(d, N.Value, '2000-01-01')As [Date]
    From Numbers As N
    Where DateAdd(d, N.Value, '2000-01-01') <= '2100-01-01'
    )
Insert Calendar( [Date], IsWorkDay )
Select [Date], Case When DatePart(dw, [Date]) In(1,7) Then 0 Else 1 End As IsWorkDay
From CalendarItems

I'm using a CTE to generate a sequential list of integers which I can then use to populate my table. Often, it is useful to have this table be static. In addition, I've marked days that are on Sunday or Saturday as not being workdays. In the above query, I arbitrarily filled my calendar table with dates from the year 2000 to 2100 however you can easily expand the range if you wish.

We can even update the table to account for your tbl_holiday table:

Update Calendar
Set IsHoliday = 1
From Calendar   
    Join tbl_Holiday
        On tbl_Holiday.holiday = Calendar.[Date]

Finally, we have our query to get the number of occurrances:

;With WorkDayNums As
    (
    Select C1.[Date]
        , Row_Number() Over ( Order By C1.[Date] ) As Seq
    From dbo.Calendar As C1
    Where C1.IsWorkDay = 1
    )
Select S.empid, Count(*) As SickOccurances
From WorkDayNums As WDN
    Join    (
            Select Min(S1.sickdate) MinSickDate, Max(S1.sickdate) As MaxSickDate
            From tbl_sick As S1
            ) As MinMax
        On WDN.[Date] Between MinMax.MinSickDate And MinMax.MaxSickDate
    Join tbl_sick As S
        On S.sickdate = WDN.[Date]
Where Exists    (
                Select 1
                From WorkDayNums As WDN2
                    Join tbl_sick As S2
                        On S2.sickdate= WDN2.[Date]
                Where WDN2.Seq = WDN.Seq + 1
                )
Group By S.empid

What I did here is to create a sequence for each workday. So if Friday was 1, Monday would be 2 barring any holidays. With this, I can easily see if the next workday was a sick day.

You did not make it clear in your OP about whether more than two sequential days counted as a single occurrence or multiple occurrences. In this approach, each two day combination would count as a single occurrence. So if someone where out Monday, Tuesday and Wednesday, the above query should count that as two occurrences.

Thomas