views:

582

answers:

4

Hopefully I can do the problem justice, because it was too difficult to summarise it in the title! (suggestions are welcome in the comments)

Right, so here's my table:

Tasks
  task_id        (number)
  job_id         (number)
  to_do_by_date  (date)
  task_name      (varchar / text)
  status         (number)
  completed_date (date)

for arguments sake let's make the values of status:

1 = New
2 = InProgress
3 = Done

and what I'm having trouble trying to do is create a query that pulls back all of the tasks:

  • where any of the tasks for a job_id have a status <> Done
    • except where all tasks for a job_id are are done, but one or more tasks have a completed_date of today
  • ordered by the to_be_done_by date, but grouping all of the job_id tasks together
    • so the job_id with the next `to_do_by_date' task is shown first

some information about the data:

  • a job_id can have an arbitrary number of tasks


Here's an example of the output I'm trying to get:

task_id   job_id   to_do_by_date   task_name   status   completed_date
   1        1        yesterday        -          3        yesterday
   2        1        today            -          3        today
   3        2        now              -          3        today
   4        2        2 hours time     -          2        {null}
   5        2        4 hours time     -          2        {null}
   6        2        tomorrow         -          1        {null}
   7        3        3 hours time     -          2        {null}
   8        3        tomorrow         -          1        {null}
   9        3        tomorrow         -          1        {null}


I'm using Oracle 10g, so answers for Oracle or ANSI SQL, or a hint for how to approach this would be ideal, and I can create Views or wrap this in a Stored Procedure to offload logic from the application if your solution calls for it.


here's a sql script that will create the example test data shown above:

create table tasks (task_id number, job_id number, to_do_by_date date, task_name varchar2(50), status number, completed_date date);
insert into tasks values (0,0,sysdate -2,    'Job 0, Task 1 - dont return!', 3, sysdate -2);
insert into tasks values (1,1,sysdate -1,    'Job 1, Task 1', 3, sysdate -1);
insert into tasks values (2,1,sysdate -2/24, 'Job 1, Task 2', 3, sysdate -2/24);
insert into tasks values (3,2,sysdate,       'Job 2, Task 1', 3, sysdate);
insert into tasks values (4,2,sysdate +2/24, 'Job 2, Task 2', 2, null);
insert into tasks values (5,2,sysdate +4/24, 'Job 2, Task 3', 2, null);
insert into tasks values (6,2,sysdate +1,    'Job 2, Task 4', 1, null);
insert into tasks values (7,3,sysdate +3/24, 'Job 3, Task 1', 2, null);
insert into tasks values (8,3,sysdate +1,    'Job 3, Task 2', 1, null);
insert into tasks values (9,3,sysdate +1,    'Job 3, Task 3', 1, null);
commit;


Many, many thanks for your help :o)

A: 

Given your requirements, it's not obvious to me why job_id 2 should be returned in your results. There is one task with a status of Done, so it fails the first criteria

all of the tasks for a job_id have a status <> Done

And there are tasks with a status other than Done, so it fails the second criteria

except where all tasks for a job_id are are done, but one or more tasks have a completed_date of today

Is there some other reason that job_id = 2 should be included?

SQL> ed
Wrote file afiedt.buf

  1  select task_id, job_id, to_do_by_date, task_name, status, completed_date
  2    from tasks t1
  3   where not exists( select 1
  4                       from tasks t2
  5                      where t1.job_id = t2.job_id
  6                        and t2.status  = 3)
  7      or ((not exists( select 1
  8                        from tasks t3
  9                       where t1.job_id  = t3.job_id
 10                         and t3.status != 3))
 11          and
 12          exists (select 1
 13                    from tasks t4
 14                   where t1.job_id = t4.job_id
 15                     and trunc(t4.completed_date) = trunc(sysdate)))
 16*   order by job_id, to_do_by_date
SQL> /

   TASK_ID     JOB_ID TO_DO_BY_ TASK_NAME           STATUS COMPLETED
---------- ---------- --------- --------------- ---------- ---------
         1          1 28-OCT-08 Job 1, Task 1            3 28-OCT-08
         2          1 29-OCT-08 Job 1, Task 2            3 29-OCT-08
         7          3 29-OCT-08 Job 3, Task 1            2
         8          3 30-OCT-08 Job 3, Task 2            1
         9          3 30-OCT-08 Job 3, Task 3            1
Justin Cave
great point, and sorry for the confusion caused, but that should have read 'any of the tasks for job_id are NOT Done' - i.e. if there are any tasks that are outstanding for the job_id, then all tasks for the job_id should be shown
Andrew
I've updated the question following your answer. Thanks for answering and please accept my apologies for not making my intention clear :o/
Andrew
A: 

I don't do Oracle, and I don't have a Sql Server handy - but this should get you fairly close.

SELECT Tasks.*
FROM Tasks
JOIN (
   --Undone
   SELECT Job_Id
   FROM Tasks
   WHERE
     Status <> 3
   UNION
   --Done today
   SELECT Job_Id
   FROM Tasks
   WHERE
     Status = 3
     AND Completed_Date = TODAY()
) as UndoneOrDoneToday ON
   Tasks.Job_Id = UndoneOrDoneToday.Job_Id
JOIN (
   SELECT Job_Id, MIN(to_do_by_date) as NextToDoByDate
   FROM Tasks
   GROUP BY Job_id
) as NextJob ON
   Tasks.Job_Id = NextJob.Job_id
ORDER BY
   NextJob.NextToDoByDate, 
   Tasks.Job_Id, --If NextToDoByDate isn't unique, this should order jobs together
   Tasks.to_do_by_date, --This may not be needed, but would put eg., task 7 due today higher than task 6 due tomorrow
   Tasks.Task_Id --this should be last

Edit: Most other answers seem to sort by job_id, to_do_by. That looks to work for the example data, but does not meet the requirements of:

ordered by the to_be_done_by date, but grouping all of the job_id tasks together so the job_id with the next to_do_by_date task is shown first

Mark Brackett
+2  A: 

Obviously you will have to fix this up a bit but I hope you get the idea.

SELECT 
    task_id, job_id, to_do_by_date, task_name, status, completed_date
FROM
    Tasks
WHERE
    job_id IN (
     SELECT job_id 
     FROM Tasks 
     WHERE status <> 'Done' 
     GROUP BY job_id)
    OR
    job_id IN (
     SELECT job_id 
     FROM Tasks 
     WHERE status = 'Done' AND completed_date = 'Today'
      AND job_id NOT IN (SELECT job_id FROM Tasks WHERE status <> 'Done' GROUP BY job_id)
     GROUP BY job_id)
ORDER BY
    job_id, to_do_by_date
DJ
Works! Thanks you! :oD
Andrew
+2  A: 

I agree with Justin -- I don't get why 2 is returned.

Here's a query using analytic functions to return the right rows according to the logic description.

select * from
(
select t.*,
       min(status) over (partition by job_id) min_status_over_job,
       max(status) over (partition by job_id) max_status_over_job,
       sum(case when trunc(completed_date) = trunc(sysdate)-1 then 1 else 0 end) 
                   over (partition by job_id) num_complete_yest
from   tasks t
)
where  max_status_over_job < 3
       or (min_status_over_job = 3 and num_complete_yest > 0)
/
David Aldridge
sorry, that was my fault where I didn't put my intention into words correctly. I've updated that part of the question :o/
Andrew