views:

104

answers:

3

Hello,

I have the following tables:

PROJECTS - project_id, name
EMPLOYEES - employee_id, name
SALARY - employee_id, date, per_hour
HOURS - log_id, project_id, employee_id, date, num_hours

I need to query how much a project is costing. Problem is that Salary can vary. For example, a person can get a raise.

The SALARY table logs the per_hour charge for an employee. With every change in cost being recorded with its date.

How can I query this information to make sure that the the log from the HOURS table is always matched to the right entry from the SALARY table. Right match being.. depending on the date of the hours log, get the row from the salary table with the highest date before the log's date.

ie.. if the work was performed on Feb 14th. Get the row for this employee from the Salary table with the highest date.. but still before the 14th.

Thank you,

A: 

In the 'Hours' table actually log the value of the salary that you use (don't link it based on ID). This will give you more flexibility in the future.

Noon Silk
A: 

I have found the easiest way to handle queries spanning dates like this is to store a StartDate and an EndDate, where the EndDate is NULL for the current salary. I use a trigger to make sure there is only ever one NULL value for EndDate, and that there are no overlapping date ranges, or gaps between the ranges. StartDate is made not nullable, since that is never a valid value.

Then your join is pretty simple:

select h.num_hours, s.per_hour
from hours h
inner join salary s on h.employee_id = s.employee_id 
    and h.date >= s.StartDate and (h.date <= s.EndDate or s.EndDate is null)
RedFilter
@cletus correctness is more important than scalability
Phil Wallach
@cletus This is a payroll system, which means it calculates how much to pay people. It is the money they live on. I would suspect that correctness is fairly important, and that any company that had a scalability issue with their payroll would be able to afford the hardware to process their payroll correctly. It is was a game or Facebook or similar, sure, the tradeoffs would be different.
Phil Wallach
@cletus: *"Premature optimization is the root of all evil"* is another mantra; you are recasting the issue to be about performance rather than providing a solution to the question, which is how to query against historical salaries.
RedFilter
@cletus: It is disappointing that you chose to delete your answer and comments (along with other users' comments on your answer), and your comments on my answer, rather than either standing behind your words or acknowledging that your solution had some flaws. Healthy debate is a valuable part of SO (and the learning process), and there is merit in the discussion regardless of which perspective is ultimately preferred by the SO community.
RedFilter
+2  A: 

What you need is an end date on SALARY. When a new record is inserted into SALARY for an employee, the previous record with the highest date (or better yet, a current flag set to 'Y' as recommended by cletus) should have its end date column set to the same date as the start date for the new record.

This should work with your current schema but be aware that it may be slow.

SELECT
  SUM(h.num_hours * s.per_hour) AS cost
FROM PROJECTS p
INNER JOIN HOURS h
  ON p.project_id = h.project_id
INNER JOIN (
    SELECT
      s1.employee_id,
      s1.date AS start_date,
      MIN(s2.date) AS end_date
    FROM SALARY s1
    INNER JOIN SALARY s2
      ON s1.employee_id = s2.employee_id
      AND s1.date < s2.date
    GROUP BY
      s1.employee_id,
      s1.date) s
  ON h.employee_id = s.employee_id
  AND h.date >= s.start_date
  AND h.date < s.end_date
lins314159