views:

92

answers:

5

We have a table set up as follows:

|ID|EmployeeID|Date     |Category       |Hours|
|1 |1         |1/1/2010 |Vacation Earned|2.0  |
|2 |2         |2/12/2010|Vacation Earned|3.0  |
|3 |1         |2/4/2010 |Vacation Used  |1.0  |
|4 |2         |5/18/2010|Vacation Earned|2.0  |
|5 |2         |7/23/2010|Vacation Used  |4.0  |

The business rules are:

  • Vacation balance is calculated by vacation earned minus vacation used.
  • Vacation used is always applied against the oldest vacation earned amount first.

We need to return the rows for Vacation Earned that have not been offset by vacation used. If vacation used has only offset part of a vacation earned record, we need to return that record showing the difference. For example, using the above table, the result set would look like:

|ID|EmployeeID|Date     |Category       |Hours|
|1 |1         |1/1/2010 |Vacation Earned|1.0  |
|4 |2         |5/18/2010|Vacation Earned|1.0  |

Note that record 2 was eliminated because it was completely offset by used time, but records 1 and 4 were only partially used, so they were calculated and returned as such.

The only way we have thought of to do this is to get all of the vacation earned records in a temporary table. Then, get the total vacation used and loop through the temporary table, deleting the oldest record and subtracting that value from the total vacation used until the total vacation used is zero. We could clean it up for when the remaining vacation used is only part of the oldest vacation earned record. This would leave us with just the outstanding vacation earned records.

This works, but it is very inefficient and performs poorly. Also, the performance will just degrade over time as more and more records are added.

Are there any suggestions for a better solution, preferable set based? If not, we'll just have to go with this.

EDIT: This is a vendor database. We cannot modify the table structure in any way.

A: 

As time passes and records are added, performance will get worse and worse unless you do something about it, such as:

  • Purge old rows once they're "cancelled out" (e.g. vacation earned has had equivalent vacation used rows added and accounted for; vacation used has been used set "expire" vacation earned as "expended")
  • Add a column that flags if a a row has been "cancelled out", and incorporate this column into your indexes

Tracking how the data changes in this fashion seems an argument to modify your table sturctures (have several, not just one), but that's outside the scope of your current problem.

As for the query itself, I'd build two aggregates, do some subtraction, make that a subquery, then join it on some clever use of one of the ranking functions. Smells like a correlated subquery in there somewhere, too. I may try and hash this out later (I'm short on time), but I bet someone beats me to it.

Philip Kelley
I should have mentioned this is a vendor database (see edit above). Therefore, no table modifications are possible.
NYSystemsAnalyst
+2  A: 

In thinking about the problem, it occurred to me that the only reason you need to care about when vacation is earned is if it expires. And if that's the case, the simplest solution is to add 'vacation expired' records to the table, such that the amount of vacation remaining for an employee is always just the sum(vacation earned) - (sum(vacation expired) + sum(vacatation used)). You can even show the exact records you want by using the last vacation expired record as a starting point for the query.

But I'm guessing that's not an option. To address the problem as asked, keep in mind that whenever you find yourself using a temporary table try putting that data into CTE (common table expression) instead. Unfortunately I have a meeting right now and so I don't have time to write the query (maybe later, it sounds like fun), but this should get you started.

Joel Coehoorn
A: 

I'd suggest modifying the table to keep track of Balance in its own column. That way, you only need to grab the most recent record to know where the employee stands.

That way, you can satisfy the simple case ("How much vacation time do I have"), while still being able to do the awkward rollup you're looking for in your "Which bits of vacation time don't line up with other bits" report, which I'd hope is something you don't need very often.

Jason Kester
I should have mentioned this is a vendor database (see edit above). Therefore, no table modifications are possible.
NYSystemsAnalyst
+1  A: 

I find your whole result set confusing and inaccurate and I can see employees sayng, "no I earned 2 hours on Jan 25th not 1." It is not true that they earned 1 hour on that date that was only partially offset, and you will have no end of problems if you choose to display this way. I'd look at a different way to present the information. Typically you either present a list of all leave actions (earned, expired and used) with a total at the bottom or you present a summary of available for use and used.

In over 30 years in the workforce and having been under many differnt timekeeping systems (as well as having studied even more when I was a managment analyst), I have never seen anyone want to display timekeeping information this way. I'm thinking there is a reason. If this is a requirement, I'd suggest pushing back on it and explaining how it will be confusing to read the data this was as well as being difficult to get a well-performing solution. I would not accept this as a requirement without trying to convince the client that it is a poor idea.

HLGEM
This is a vendor database, do I have no control over the structure of the data. This result set will not be displayed to most users. Some managers want to know how much vacation time a person has remaining and the exact dates of when it was earned.
NYSystemsAnalyst
+1  A: 

The following should do it..

(but as others mention, the best solution would be to adjust remaining vacations as they are spent..)

select 
    id, employeeid, date, category, 
    case 
    when  earned_so_far + hours - total_spent > hours then 
        hours 
    else 
        earned_so_far + hours - total_spent
    end as hours
from 
    (
                select 
                    id, employeeid, date, category, hours,
                    (
                        select 
                            isnull(sum(hours),0)
                        from 
                            vacations 
                        WHERE 
                            category = 'Vacation Earned' 
                            and 
                            date < v.date
                            and
                            employeeid = v.employeeid
                    ) as earned_so_far,
                    (
                        select
                            isnull(sum(hours),0)
                        from
                            vacations
                        where 
                            category = 'Vacation Used'
                            and 
                            employeeid = v.employeeid
                    ) as total_spent
                from 
                    vacations V
                where category = 'Vacation Earned'
    ) earned
where
    earned_so_far + hours > total_spent

The logic is

  1. calculate for each earned row, the hours earned so far
  2. calculate the total hours used for this user
  3. select the record if the total_hours_so_far + hours of this record - total_spent_hours > 0
Gaby
and of'course category should not be text, but some id from another table..
Gaby
It is, I just put in text in an effort to make the problem more clear.
NYSystemsAnalyst
+1 I tried working this problem, and my solution was so close to this one that it is not worth posting. I think this is the right approach.
Jeffrey L Whitledge
I modified this query a bit, but the idea of having a running total is what helped the most. Basically, I got the current balance, then used the running total idea to select records (going backward in time) until the running total exceeded the current balance. There are still come technicalities to work out, but this helped a lot.
NYSystemsAnalyst
@NYC, glad it helped point you in the right direction :) For the specific question, though, i think it should work out of the box.. did it not ?
Gaby