views:

129

answers:

3

Hi, I am trying to write a SQL Server query but have had no luck and was wondering if anyone may have any ideas on how to achieve my query.

What i'm trying to do:

I have a table with several columns naming the ones that i am dealing with TaskID, StatusCode, Timestamp. Now this table just holds tasks for one of our systems that run throughout the day and when something runs it gets a timestamp and the statuscode depending on the status for that task.

Sometimes what happens is the task table will be updated with a new timestamp but the statusCode will not have changed since the last update of the task so for two or more consecutive rows of a given task the statusCode can be the same. When i say consecutive rows i mean with regards to timestamp.

So example task 88 could have twenty rows at statusCode 2 after which the status code changes to something else.

Now what i am trying to do with no luck at the moment is to retrieve a list from this table of all the tasks and the statuscodes and the timestamps but in the case where i have more than one consecutive row for a task with the same statuscode i just want to take the first row with the lowest timestamp and ignore the rest of the row until the statuscode for that task changes.

To make it simpler in this case you can assume that i have a taskid which i am filtering on so i am just looking at a single task.

Does anyone have any ideas as to how i can do this or perhaps something that i coudl probably read to help me?

Thanks Irfan.

+1  A: 

Something like

select TaskID,StatusCode,Min(TimeStamp)
from table
group by TaskID,StatusCode
order by 1,2

Note that is statuscode can duplicate, you will need an additional field, but hopefully this can point you in the right direction...

Sparky
thanks for your reply Sparky. That is really close but the issue with this is in the table i only want to ignore the remaining row for a given status for a task if its consecutive rows following have the same statuscode. Eg. A task can go to statuscode 2 then to 3 then back to 2 and these are all valid rows i want to show as the statuscode 2 didn't happen consecutively. I think your query will give me the min timestamp for the given status on that task. This is why my query is so strange due to the consecutive rows issue.
Iffy
A: 

Something like the following should get you in the right direction....

CREATE TABLE #T
(
    TaskId INT
    ,StatusCode INT
    ,StatusTimeStamp DATETIME
)

INSERT INTO #T
SELECT 1, 1, '2009-12-01 14:20'
UNION SELECT 1, 2, '2009-12-01 16:20'
UNION SELECT 1, 2, '2009-12-02 09:15'
UNION SELECT 1, 2, '2009-12-02 12:15'
UNION SELECT 1, 3, '2009-12-02 18:15'

;WITH CTE AS
(
SELECT TaskId
     ,StatusCode
     ,StatusTimeStamp
     ,ROW_NUMBER() OVER (PARTITION BY TaskId, StatusCode ORDER BY TaskId, StatusTimeStamp DESC)  AS RNUM
FROM  #T
)
SELECT TaskId
     ,StatusCode
     ,StatusTimeStamp
FROM CTE
WHERE RNUM = 1

DROP TABLE #T
old account
Thanks Jay, this is a very smart query but again this has the same issue as the query above by Sparky where it groups the statuscode and taskid together and only returns one row for each status code. An example would be if i was to put the rowUNION SELECT 1, 2, '2009-12-02 19:15';into your query where this status code has already been inserted but this one is not consecutive to the others so should show. What the query above is group this row in with the other ones.
Iffy
I was thinking maybe of writing a cursor where i keep track of the previous statuscode and if its the same as the current i don't enter into my temp table else if its different then i do the entry into my temp table. At the end of this i will have a temp table with the data that i want be it with a slower query due to the cursor. This all depends though on how long the cursor takes to do this operation for a coupld hundred rows if i go that way?
Iffy
A: 

This are a couple ways of getting what you want:

SELECT
     T1.task_id,
     T1.status_code,
     T1.status_timestamp
FROM
     My_Table T1
LEFT OUTER JOIN My_Table T2 ON
     T2.task_id = T1.task_id AND
     T2.status_timestamp < T1.status_timestamp
LEFT OUTER JOIN My_Table T3 ON
     T3.task_id = T1.task_id AND
     T3.status_timestamp < T1.status_timestamp AND
     T3.status_timestamp > T2.status_timestamp
WHERE
     T3.task_id IS NULL AND
     (T2.status_code IS NULL OR T2.status_code <> T1.status_code)
ORDER BY
     T1.status_timestamp

or

SELECT
     T1.task_id,
     T1.status_code,
     T1.status_timestamp
FROM
     My_Table T1
LEFT OUTER JOIN My_Table T2 ON
     T2.task_id = T1.task_id AND
     T2.status_timestamp = (
          SELECT
               MAX(status_timestamp)
          FROM
               My_Table T3
          WHERE
               T3.task_id = T1.task_id AND
               T3.status_timestamp < T1.status_timestamp)
WHERE
     (T2.status_code IS NULL OR T2.status_code <> T1.status_code)
ORDER BY
     T1.status_timestamp

Both methods rely on there being no exact matches of the status_timestamp values (two rows can't have the same exact status_timestamp for a given task_id.)

Tom H.
Thanks Tom, i used the second query with some mods to work for my purpose but it did the trick and was really quick too.Thanks a lot for you help.
Iffy
you're welcome!
Tom H.