Here is the situation:

I have a database of 'tickets', and we track changes to the tickets each time they are saved. I am specifically looking at status changes, which are recorded in the following format:

STATUS:{FROM}:{TO}

with {FROM} and {TO} replaced by the respective statuses. What I need to do is generate weekly counts of the tickets that were 'open' (meaning in draft status) at the end of any given week, say for the past 12 weeks. However, nothing prevents a ticket from being 'closed' and then reopened, or from changing status several times in a single week.

So, what I need to do is modify the SQL below to ONLY consider the most recent "action" for any given entry. This way we avoid the problem of having entries that were 'closed' appear in the open count because they had been opened earlier.

SELECT track.historyID
  FROM RS_HistoryTracker track
 WHERE (track.action = 'STATUS:INITIAL:DRAFT'
        OR track.action = 'STATUS:DELETED:DRAFT'
        OR track.action = 'STATUS:DRAFT:DRAFT')
   AND track.trackDateTime <= @endOfWeek

However, this statement is contained within another select statement, and is used to generate a complete list of history items:

SELECT COUNT(DISTINCT his.historyID) AS theCount
  FROM RS_History his
 WHERE his.historyID IN
       (SELECT track.historyID
          FROM RS_HistoryTracker track
         WHERE (track.action = 'STATUS:INITIAL:DRAFT'
                OR track.action = 'STATUS:DELETED:DRAFT'
                OR track.action = 'STATUS:DRAFT:DRAFT')
           AND track.trackDateTime <= @endOfWeek)

So how do I make the inner select consider only the most recent tracked 'action' that occurred on or before the endOfWeek date? HistoryTracker contains a datetime stamp column.

+1  A: 

As a starter you can find the last history item for each ticket by doing something like this

select * from
(
    --find max history id for each ticket
    select 
        T1.ticketId, 
        max(T1.historyId) As LastHistoryId
    from #Ticket T1
    --add WHERE clause to filter out dates
    group by 
       T1.ticketId
) MaxTicket

inner join 
#Ticket T2 --find the ticket so you can get the status
on MaxTicket.ticketId = T2.ticketId 
and MaxTicket.LastHistoryId=T2.Historyid

You may want to change how you find the latest ticket to be based on the date rather than the history id.

pjp
Added a 'LIKE' clause to filter out the updates that don't deal with status. Working on date. Thanks for the suggestion
Philip Harris
historyID is a foreign key - there's no value in doing aggregate functions on it, they will always be the same.
OMG Ponies
+1  A: 

There are many variations of this question floating around on Stack Overflow; here is the first I found :)

sql-query-to-get-most-recent-row-for-each-instance-of-a-given-key

Essentially you do need to do it in two parts.
- Use one query to find the most recent timestamp per item
- Use another query to do the work you set out to do

To find all items still open at a given date:

SELECT
  [data].*
FROM
  track AS [data]
WHERE
  [data].trackDateTime =
    (
       -- most recent tracked change for this ticket before the cutoff
       SELECT
          MAX(track.trackDateTime)
       FROM
          track
       WHERE
          track.ticketID = [data].ticketID
          AND track.trackDateTime < @endOfWeek
    )
  AND [data].action IN ('STATUS:INITIAL:DRAFT','STATUS:DELETED:DRAFT','STATUS:DRAFT:DRAFT')

This assumes that ticketID is the unique identifier for each ticket. (Based on one of your comments)
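
A minimal runnable sketch of the same idea, using Python's built-in sqlite3 module instead of SQL Server so it can be tried without a server. The table layout, the sample rows, and the use of historyID as the per-ticket key are assumptions made for illustration; @endOfWeek becomes the :end_of_week parameter.

```python
import sqlite3

# Hypothetical miniature of RS_HistoryTracker. Column names follow the thread;
# the rows are invented sample data.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE RS_HistoryTracker (
        trackID       INTEGER PRIMARY KEY,
        historyID     INTEGER NOT NULL,  -- assumed to identify the ticket
        action        TEXT NOT NULL,
        trackDateTime TEXT NOT NULL      -- ISO-8601 text sorts chronologically
    )
""")
conn.executemany(
    "INSERT INTO RS_HistoryTracker VALUES (?, ?, ?, ?)",
    [
        (1, 100, "STATUS:INITIAL:DRAFT", "2009-08-03 09:00:00"),
        (2, 100, "STATUS:DRAFT:FINAL",   "2009-08-04 10:00:00"),  # closed
        (3, 200, "STATUS:INITIAL:DRAFT", "2009-08-03 11:00:00"),
        (4, 200, "STATUS:DRAFT:DELETED", "2009-08-04 12:00:00"),  # deleted
        (5, 200, "STATUS:DELETED:DRAFT", "2009-08-05 13:00:00"),  # restored
        (6, 300, "STATUS:INITIAL:DRAFT", "2009-08-08 08:00:00"),  # after cutoff
    ],
)

# Keep only each ticket's latest row before the cutoff, then test its action.
OPEN_SQL = """
    SELECT d.historyID
      FROM RS_HistoryTracker AS d
     WHERE d.trackDateTime = (SELECT MAX(t.trackDateTime)
                                FROM RS_HistoryTracker AS t
                               WHERE t.historyID = d.historyID
                                 AND t.trackDateTime < :end_of_week)
       AND d.action IN ('STATUS:INITIAL:DRAFT',
                        'STATUS:DELETED:DRAFT',
                        'STATUS:DRAFT:DRAFT')
     ORDER BY d.historyID
"""

def open_tickets(end_of_week):
    """Return historyIDs whose latest action before end_of_week is a draft one."""
    cursor = conn.execute(OPEN_SQL, {"end_of_week": end_of_week})
    return [row[0] for row in cursor]

print(open_tickets("2009-08-07 00:00:00"))
```

With this sample data, ticket 100 drops out because its latest change closed it, and ticket 200 counts as open because it was restored to draft after being deleted. If two changes to one ticket can share a timestamp, a unique column such as trackID would be a safer tie-breaker than the date alone.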

Dems
+1  A: 

Will work with SQL Server 2005+:

WITH history AS (
  SELECT rh.historyID,
         MAX(rh.action) AS [action]
    FROM RS_HISTORYTRACKER rh 
   WHERE rh.action IN ('STATUS:INITIAL:DRAFT', 'STATUS:DELETED:DRAFT', 'STATUS:DRAFT:DRAFT')
     AND rh.trackDateTime <= @endOfWeek
   GROUP BY rh.historyID)  -- required because of the MAX() aggregate
SELECT COUNT(DISTINCT t.historyID) AS theCount
  FROM RS_HISTORY t
  JOIN history h ON h.historyID = t.historyID

Alternative query that does not use a CTE:

SELECT COUNT(DISTINCT t.historyID) AS theCount
  FROM RS_HISTORY t
  JOIN (SELECT rh.historyID,
               MAX(rh.action) AS [action]
          FROM RS_HISTORYTRACKER rh 
         WHERE rh.action IN ('STATUS:INITIAL:DRAFT', 'STATUS:DELETED:DRAFT', 'STATUS:DRAFT:DRAFT')
           AND rh.trackDateTime <= @endOfWeek
         GROUP BY rh.historyID) h ON h.historyID = t.historyID
OMG Ponies
I still have the issue that this doesn't seem to pick up if the most recent status change was STATUS:%:FINAL, which would not include it. This is a good start though, and has me thinking.
Philip Harris
How can it be including FINAL if the IN clause only gets INITIAL:DRAFT, DELETED:DRAFT or DRAFT:DRAFT? Or is final a column/attribute on the RS_HISTORY table?
OMG Ponies
Sorry, you're right. I was looking at the wrong code in my window.
Philip Harris
A: 

Looks like this works:

  SELECT query.historyID
    FROM
    (SELECT MAX(track.trackID) AS maxTrackID, track.historyID
    FROM RS_HistoryTracker track
    WHERE track.trackDateTime <= '2009-08-06 23:59:59'
      AND track.action LIKE 'STATUS:%'
    GROUP BY historyID) AS query, RS_HistoryTracker track
    WHERE track.historyID = query.historyID
      AND track.trackID = query.maxTrackID
      AND track.action LIKE 'STATUS:%:DRAFT'
Philip Harris
Thanks for all of the suggestions - I took bits and pieces of each and integrated them into this.
Philip Harris
Although the LIKE operator can be a shortcut to shorter code, seriously, don't use it if you can avoid it. The reason is that it can prevent the optimiser from using indexes (particularly when the pattern starts with a wildcard), which generally kills performance.
Dems
Also, just a general comment when using Date Ranges, it's better to use ">= @start" and "< @end". So if you want everything up to the end of 6th August, rephrase that as everything before the 7th August. The reason being that the logic holds true no matter what data type you're using. In your "<= @end" example, what if the time stamp is a few milliseconds after 23:59:59? And if TrackDateTime is a SmallDateTime you only need to use 23:59 without the seconds. Using "< @end" makes all those considerations go away, plus much much more in other queries.
Dems
Thanks for the comments. We are using the day after as our end date in production. The LIKE is an unfortunate need, as each set of ticket collections uses different statuses, but always uses draft. (the database is horribly not normalized, but I'm not the developer for it).
Philip Harris
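
For reference, a rough sketch of how the pieces discussed in this thread might fit together for the 12-week report: exclusive "< @end" bounds, the latest STATUS row per ticket picked by MAX(trackID), and a final check that the latest row ends in DRAFT. It uses Python's built-in sqlite3 rather than SQL Server so it can be run as-is. The sample rows, the helper names week_end_bounds and open_count, and the millisecond text timestamps are assumptions for illustration only.

```python
import sqlite3
from datetime import datetime, timedelta

def week_end_bounds(last_bound, weeks=12):
    """Exclusive upper bounds, oldest first, formatted to match the stored text."""
    return [
        (last_bound - timedelta(weeks=i)).isoformat(sep=" ", timespec="milliseconds")
        for i in reversed(range(weeks))
    ]

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE RS_HistoryTracker (
        trackID INTEGER PRIMARY KEY,
        historyID INTEGER NOT NULL,
        action TEXT NOT NULL,
        trackDateTime TEXT NOT NULL
    )
""")
conn.executemany(
    "INSERT INTO RS_HistoryTracker VALUES (?, ?, ?, ?)",
    [
        # Half a second before midnight: "<= 23:59:59" would miss this row,
        # "< the next day" does not.
        (1, 100, "STATUS:INITIAL:DRAFT", "2009-08-06 23:59:59.500"),
        (2, 200, "STATUS:INITIAL:DRAFT", "2009-07-28 10:00:00.000"),
        (3, 200, "STATUS:DRAFT:FINAL",   "2009-08-03 10:00:00.000"),
    ],
)

COUNT_SQL = """
    SELECT COUNT(*)
      FROM RS_HistoryTracker AS t
      JOIN (SELECT historyID, MAX(trackID) AS maxTrackID
              FROM RS_HistoryTracker
             WHERE trackDateTime < :end_exclusive
               AND action LIKE 'STATUS:%'
             GROUP BY historyID) AS q
        ON q.historyID = t.historyID
       AND q.maxTrackID = t.trackID
     WHERE t.action LIKE 'STATUS:%:DRAFT'
"""

def open_count(end_exclusive):
    """Count tickets whose latest status change before the bound ends in DRAFT."""
    return conn.execute(COUNT_SQL, {"end_exclusive": end_exclusive}).fetchone()[0]

bounds = week_end_bounds(datetime(2009, 8, 7))
counts = [open_count(b) for b in bounds]
for bound, count in zip(bounds, counts):
    print(bound, count)
```

Each bound is midnight of the day after the week ends, so no particular timestamp precision has to be assumed. Picking the latest row by MAX(trackID) relies on trackID increasing over time, as the query above already does.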