tags:

views:

239

answers:

3

Hi,

We use an online project management system, and I'm trying to extend it somewhat.

It has the following tables of interest:

todo_itemStatus:
+--------------+-----------------------+------+-----+---------------------+----------------+
| Field        | Type                  | Null | Key | Default             | Extra          |
+--------------+-----------------------+------+-----+---------------------+----------------+
| itemStatusId | bigint(20) unsigned   | NO   | PRI | NULL                | auto_increment |
| itemId       | int(10) unsigned      | NO   | MUL | 0                   |                |
| statusDate   | datetime              | NO   |     | 0000-00-00 00:00:00 |                |
| statusKey    | tinyint(3) unsigned   | NO   |     | 0                   |                |
| memberId     | mediumint(8) unsigned | NO   |     | 0                   |                |
+--------------+-----------------------+------+-----+---------------------+----------------+

This table keeps track of when a task is complete, and also keeps the status of all task changes.

There's then a project table, and an 'item' (or task) table.

I basically want to be able to extract a list of projects, with details on the percentage of tasks complete. However, for now I'd be happy if I could just list each task in a project with details on whether they're complete.

As far as I'm aware, the best way to get the most recent status of a task is to choose an todo_itemStatus where the statusDate is the newest, or the itemStatusId is the largest whilst itemId equals the task I'm interested.

I tried a query like this:

<pre>
select todo_item.itemId, todo_item.title, todo_itemStatus.statusKey, todo_itemStatus.statusDate 
from todo_item, todo_project, todo_itemStatus 
where todo_item.projectId  = todo_project.projectId 
and todo_project.projectId = 13 
and todo_itemStatus.itemId = todo_item.itemId 
and todo_itemStatus.statusDate = (
    select MAX(todo_itemStatus.statusDate) 
    from todo_itemStatus key1 where todo_itemStatus.itemId  = key1.itemId);
</pre>

However, this yields all status updates with output like this:

+--------+-----------------------------------------------------------------------------+-----------+---------------------+
| itemId | title                                                                       | statusKey | statusDate          |
+--------+-----------------------------------------------------------------------------+-----------+---------------------+
|    579 | test complete item - delete me                                              |         1 | 2009-07-28 13:04:38 |
|    579 | test complete item - delete me                                              |         0 | 2009-07-28 14:12:12 |
+--------+-----------------------------------------------------------------------------+-----------+---------------------+

Which isn't what I want, as I only want one task entry returning with the statusKey / statusDate from the most recent entry in the todo_itemStatus table.

I know I've been a bit vague in my description, but I didn't want to write a massively long message. I can provide much more detail if necessary.

Please can someone suggest what I'm doing wrong? It's been a long time since I've done any real database stuff, so I'm a bit unsure what I'm doing wrong here...

Many thanks! Dave

A: 

I think a where clause of nested query is unnecessary.

adatapost
+1  A: 

You should look into using the DISTINCT keyword (Microsoft SQL Server)

EDIT: I've just re-read your question and I think that the GROUP BY clause is more suited in this situation. You should read http://www.xaprb.com/blog/2006/12/07/how-to-select-the-firstleastmax-row-per-group-in-sql/, however essentially what you need to do is first select the columns that you are interested in using a GROUP BY clause:

SELECT todo_itemStatus.itemStatusId, MAX(todo_itemStatus.statusDate)
FROM todo_item, todo_project, todo_itemStatus 
WHERE todo_item.projectId  = todo_project.projectId 
AND todo_itemStatus.itemId = todo_item.itemId 
AND todo_project.projectId = 13 
GROUP BY itemStatusId

We then self-join to this set of id's to get the rest of the columns we are interested in:

SELECT
    todo_item.itemId, 
    todo_item.title, 
    todo_itemStatus.statusKey, 
    todo_itemStatus.statusDate 
FROM todo_item
JOIN todo_itemStatus
ON todo_itemStatus.itemId = todo_item.itemId 
JOIN
    (SELECT todo_itemStatus.itemStatusId, MAX(todo_itemStatus.statusDate)
    FROM todo_item, todo_project, todo_itemStatus 
    WHERE todo_item.projectId  = todo_project.projectId 
    AND todo_itemStatus.itemId = todo_item.itemId 
    AND todo_project.projectId = 13 
    GROUP BY itemStatusId) AS x
ON todo_itemStatus.itemStatusId = x.itemStatusId
Kragen
Hi, thanks, but I don't think DISTINCT will do what I want. The problem is that their are multiple todo_itemStatus entries per todo_item, and I just want to get the most recent. I could be wrong though... but a quick test just yielded an error...
DaveS
Yeah, just re-read your question and I think that actually a GROUP BY is whats needed - trying to come up with something for you atm.
Kragen
Thanks for that. I'll take a look at that article and then re-read your post and let you know how I get on. Thank you!
DaveS
Thanks for your help. Unfortunately your suggestion didn't work, but with your help and that webpage you referenced I managed to work it out. Thanks!
DaveS
A: 

I've experimented some more and the following query does what I want:

select todo_item.itemId, todo_item.title, todo_itemStatus.statusKey, todo_itemStatus.statusDate from todo_itemStatus, todo_item where todo_item.itemId = todo_itemStatus.itemId and todo_item.projectId = 13 and todo_itemStatus.statusDate = (select MAX(status.statusDate) from todo_itemStatus  as status where status.itemId = todo_item.itemId);

So I'm now happy. Thanks for all the help and the suggestions.

Dave.

DaveS