views:

236

answers:

3

Hey there,

I need help with a oracle query. Excuse me in advance for my english.

Here is my setup:

I have 2 tables called respectively "tasks" and "timesheets". The "tasks" table is a recursive one, that way each task can have multiple subtasks. Each timesheet is associated with a task (not necessarily the "root" task) and contains the number of hours worked on it.

Example:

Tasks

id:1 | name: Task A | parent_id: NULL

id:2 | name: Task A1 | parent_id: 1

id:3 | name: Task A1.1 | parent_id: 2

id:4 | name: Task B | parent_id: NULL

id:5 | name: Task B1 | parent_id: 4

Timesheets

id:1 | task_id: 1 | hours: 1

id:2 | task_id: 2 | hours: 3

id:3 | task_id:3 | hours: 1

id:5 | task_id:5 | hours:1 ...

What I want to do:

I want a query that will return the sum of all the hours worked on a "task hierarchy". If we take a look at the previous example, It means I would like to have the following results:

task A - 5 hour(s) | task B - 1 hour(s)

At first I tried this

SELECT TaskName, Sum(Hours) "TotalHours" 
FROM (
    SELECT replace(sys_connect_by_path(decode(level, 1, t.name), '~'), '~') As TaskName, 
    ts.hours as hours
    FROM tasks t INNER JOIN timesheets ts ON t.id=ts.task_id
    START WITH PARENTOID=-1
    CONNECT BY PRIOR t.id = t.parent_id
    )
GROUP BY TaskName Having Sum(Hours) > 0 ORDER BY TaskName

And it almost work. THe only problem is that if there are no timesheet for a root task, it will skip the whole hieararchy... but there might be timesheets for the child rows and it is exactly what happens with Task B1. I know it is the "inner join" part that is causing my problem but I'm not sure how can I get rid of it.

Any idea how to solve this problem?

Thank you

A: 

If you use left outer join instead of normal join, you may get the output.

SELECT TaskName, Sum(Hours) "TotalHours"  
FROM ( 
    SELECT replace(sys_connect_by_path(decode(level, 1, t.name), '~'), '~') As TaskName,  
    ts.hours as hours 
    FROM tasks t,timesheets ts where t.id=ts.task_id(+) 
    START WITH PARENTOID=-1 
    CONNECT BY PRIOR t.id = t.parent_id 
    ) 
GROUP BY TaskName Having Sum(Hours) > 0 ORDER BY TaskName 
Bharat
I think he would have to reverse the orders of the tables too, because he wants tasks which don't have corresponding timesheets. Currently, if he changes to left outer, he'd get timesheets which have no corresponding tasks.
FrustratedWithFormsDesigner
FrustratedWithFormsDesigner: You are correct or he can also use right outer join
Bharat
A: 

Would something like this work? I've had cases similar to yours, and I simply removed the join from the hierarchical query and applied it only afterward to avoid losing rows.

SELECT TaskName, Sum(ts.hours) "TotalHours" 
FROM (
    SELECT replace(sys_connect_by_path(decode(level, 1, t.name), '~'), '~') As TaskName, t.id
    FROM tasks t
    START WITH PARENTOID=-1
    CONNECT BY PRIOR t.id = t.parent_id
    ) tasks
INNER JOIN timesheets ts ON tasks.id=ts.task_id
GROUP BY TaskName Having Sum(ts.hours) > 0 ORDER BY TaskName
RenderIn
Thanks a million! I think this is doing the trick. Some more testing to do and I'll flag this as the official answer.
François
A: 

Have you tried this?

SELECT TaskName, Sum(Hours) "TotalHours" 
FROM (
    SELECT replace(sys_connect_by_path(decode(level, 1, t.name), '~'), '~') As TaskName, 
    ts.hours as hours
    FROM timesheets ts  LEFT OUTER JOIN tasks t  ON t.id=ts.task_id
    START WITH PARENTOID=-1
    CONNECT BY PRIOR t.id = t.parent_id
    )
GROUP BY TaskName Having Sum(Hours) > 0 ORDER BY TaskName
FrustratedWithFormsDesigner
Thanks for your help! I already tried the left outer join option but it slows down the query tremendously. Perhaps I should take another look at the indexes in those tables... maybe there is one missing somewhere. But for the moment the solution proposed by RenderLn is what I am looking for. Again, thanks for your help.
François
The index that I would think of (just by looking at this query) is on `timesheets`, index of `task_id` (Assuming that tasks.id is already an index). But yeah, in general, outer joins are slower than inner.
FrustratedWithFormsDesigner