tags:

views:

132

answers:

2

I have a trace table in the following format.

CREATE TABLE [dbo].[trace](
    [trcId] [bigint] IDENTITY(1,1) NOT NULL,
    [trcDateTime] [datetime] NULL,
    [trcProgram] [nvarchar](150) NULL,
    [trcCode] [nvarchar](8) NULL,
    [trcText] [nvarchar](max) NULL,
    [trcXML] [nvarchar](max) NULL,
    [trcCorrGuid] [nvarchar](36) NULL, ...

Now I'm realizing that I can use selected rows from the trace to analyze response time. I'd like to create another table as follows:

CREATE TABLE [dbo].[executionHistory](
    trcCorrId [nvarchar](36) NOT NULL,
    startHIP datetime NOT NULL, 
    stopHIP datetime NOT NULL, 
    startOrch1 datetime NOT NULL, 
    stopOrch1 datetime NOT NULL, 
    startOrch2 datetime NOT NULL, 
    stopOrch2 datetime NOT NULL, 
    startWebMethoddatetime NOT NULL, 
    stopWebMethod  datetime NOT NULL,

Or alternatively could create a more generic:

  CREATE TABLE [dbo].[executionHistory](
    trcCorrId [nvarchar](36) NOT NULL,
    eventName [nvarchar](36) NOT NULL,
    eventStart datetime NOT NULL, 
    eventStop datetime NOT NULL

Baiscally I have a C# (HIP) program that calls a BizTalk orchestration (orch1 -published as a WCF web service). That orchestration calls orch2 which callas another WCF web service. I'd like to record/analyze/summarize response times from the above table instead of my trace table.

The start of the HIP program can be determined as follows: where trcProgram = 'HIP' and trcCode = '0250' and the end of the HIP program where trcProgram = 'HIP' and trcCode = '0299'.

I have similar pattern for each of the four milestones, for example: start Orch1: where trcProgram = 'Orch1' and trcCode = '0010' and end Orch1: where trcProgram = 'Orch1' and trcCode = '9999'.

Each execution is uniquely identified with the trcCorrGuid. This is a GUID that is stamped on all rows related to the same execution.

How can I do an elegant and efficient SQL query to load the executionHistory table from the trace table? I toyed around for a while but realized that I what I was writing could be really sloppy - and take several passes - and I needed advice from someone who might have done something like this.

I was going to begin by getting a distinct list of Guids. I was then going to do something like a CASE statement, but need a CASE that can handle two values, not one (trcProgram and trcCode). Maybe nested CASE statements would work?

Thanks in advance,

Neal Walters

Update:

Here's what I'm working on so far:

select trace.trcDateTime,
  EventName =
      CASE trace.trcCode  
         -- HIP 
         WHEN '0250' THEN 
             CASE trace.trcProgram 
                WHEN 'HIP:RCT.HIP.Components:Push' THEN 'Start:HIP'
                ELSE 'NA' 
             END 
         WHEN '0299' THEN
             CASE trace.trcProgram 
                WHEN 'HIP:RCT.HIP.Components:Push' THEN 'Stop:HIP'
                ELSE 'NA' 
             END 

         -- Orch 1
         WHEN '0010' THEN 
             CASE trace.trcProgram 
                WHEN 'Orch:WCFSubmitPolicyAction' THEN 'Start:Orch:WCFSubmitPolicyAction'
                WHEN 'Orch:CallRCTWebService' THEN 'Start:Orch:CallRCTWebService'
                ELSE 'NA' 
             END 
         WHEN '9999' THEN
             CASE trace.trcProgram 
                WHEN 'Orch:WCFSubmitPolicyAction' THEN 'Stop:Orch:WCFSubmitPolicyAction'
                WHEN 'Orch:CallRCTWebService' THEN 'Stop:Orch:CallRCTWebService'
                ELSE 'NA' 
             END 


         -- WebMethod Push 
         WHEN '1210' THEN 
             CASE trace.trcProgram 
                WHEN 'WebMethod:CreateValuationMinimal' THEN 'Start:WebMethod:CreateValuationMinimal'
                ELSE 'NA' 
             END 
         WHEN '1289' THEN
             CASE trace.trcProgram 
                WHEN 'Orch:CallRCTWebService' THEN 'Stop:WebMethod:CreateValuationMinimal'
                ELSE 'NA' 
             END 

         -- WebMethod Pull 
         WHEN '1950' THEN 
             CASE trace.trcProgram 
                WHEN 'WebMethod:ExportValuationRecordIdCustom' THEN 'Start:WebMethod:ExportValuationRecordIdCustom'
                ELSE 'NA' 
             END 
         WHEN '1951' THEN
             CASE trace.trcProgram 
                WHEN 'WebMethod:ExportValuationRecordIdCustom' THEN 'Stop:WebMethod:ExportValuationRecordIdCustom'
                ELSE 'NA' 
             END 

         ELSE 'NA' 
      END
from trace

I could store this to a temp table, and then work forward from there.

Based on Aaron's post, here's what I got working so far. I had to add the semi-colon after the "use" statement, and had to add "AS PivotTable" at the bottom.

use ESBSupport;
WITH trace_CTE AS
(
    SELECT trcCorrId, 
           trcProgram + trcCode AS trcUniqueCode, 
           trcDateTime
    FROM trace
    WHERE (trcProgram = 'HIP:RCT.HIP.Components:Push' AND trcCode IN ('0250', '0299'))
       OR (trcProgram = 'Orch:WCFSubmitPolicyAction'  AND trcCode IN ('0010', '9999'))
)

SELECT
    trcCorrId,
    [HIP:RCT.HIP.Components:Push0250] AS startHIP, 
    [HIP:RCT.HIP.Components:Push0299] AS stopHIP,
    [Orch:WCFSubmitPolicyAction0010] AS startOrch1, 
    [Orch:WCFSubmitPolicyAction9999] AS stopOrch1
    -- // etc., continue this for the other events
FROM trace_CTE 
PIVOT
(
    MIN(trcDateTime)

    FOR trcUniqueCode IN (
       [HIP:RCT.HIP.Components:Push0250], 
       [HIP:RCT.HIP.Components:Push0299],
       [Orch:WCFSubmitPolicyAction0010],
       [Orch:WCFSubmitPolicyAction9999]
    )
) as PivotTable
+1  A: 

I don't know if you would call this elegant or efficient, but it is probably the best you can do:

WITH trace_CTE AS
(
    SELECT trcCorrId, trcProgram + trcCode AS trcUniqueCode, trcDateTime
    FROM trace
    WHERE (trcProgram = 'HIP' AND trcCode IN ('0250', '0299'))
    OR (trcProgram = 'Orch1' AND trcCode IN ('0010', '9999'))
    OR ([more conditions here])
)
SELECT
    trcCorrId,
    [HIP0250] AS startHIP, [HIP0299] AS stopHIP,
    [Orch10010] AS startOrch1, [Orch19999] AS stopOrch1
    -- // etc., continue this for the other events
FROM trace_CTE
PIVOT
(
    MIN(trcDateTime)
    FOR trcUniqueCode IN
    (
        [HIP0250], [HIP0299],
        [Orch10010], [Orch19999],
        [(continue with other codes)]
    )
)

I'm using a CTE just to clean up the syntax, it won't actually create another pass. PIVOT is pretty efficient, although if your trace table is massive, this is still going to be slow.

In my experience, these problems are best handled with triggers whenever possible (context accumulation). Especially for what is probably set up as a write-only table, the overhead is minimal to do a few checks on INSERT, and it's going to be a lot more painful to do the same thing on-the-fly. If you don't do this, you'll probably end up with a nightly batch process crunching the numbers and putting them into an analysis table so users don't have to wait for the query to run.

Final note: In order to get any kind of performance whatsoever on this query, you are definitely going to need an index on (trcProgram, trcCode) that covers (trcCorrId, trcDateTime).

Aaronaught
Thanks. There's no requirement to do the stats. I'm just anticipating they might ask for it and it might come in handy during QA. I'll check out your code and the Pivot; I was thinking I might need a pivot even with the direction I was headed.
NealWalters
Incorrect syntax near the keyword 'with'. If this statement is a common table expression, an xmlnamespaces clause or a change tracking context clause, the previous statement must be terminated with a semicolon. - A brief google search seemed to indicate this might be related to "compatbility level". I'm using MS/SQL 2008. I've neer used the with statement or CTE, so I need to read up more on this.
NealWalters
As the error message says, if you have any T-SQL statements before this, you need to put a semicolon at the end. Better yet, just stick a semicolon right before the "WITH", as in `;WITH trace_CTE AS ...`.
Aaronaught
How does column [HIP0250] get created?
NealWalters
The `PIVOT` statement creates them. The name itself is a result of the concatenation in the CTE.
Aaronaught
Yes, I would say this was elegant. I basically got it working now, including computing responses times.
NealWalters
+1  A: 

Food for thought: NVARCHAR(36) takes 72 bytes of storage + length info (variable length guids?). Uniqueidentifier takes 16 bytes.

And you want to create an eventName nvarchar(36) based on... what? The trcProgram? That is nvarchar(150).

SELECT coalesce(start.trcCorrGuid, end.trcCorrGuid),
 coalesce(start.trcProgram, end.trcProgram) as eventName,
 start.trcDateTime as eventStart,
 end.trcDateTime as eventEnd
FROM (
  SELECT * FROM trace 
  WHERE trcCode IN ('0250', '0010', ...)) 
  as start
FULL JOIN (
  SELECT * FROM trace 
  WHERE trcCode IN ('0299', '9999', ...)) 
  as end ON start.trcCorrGuid = end.trcCorrGuid 
     AND start.trcProgram = end.trcProgram;

This query uses a full join between start and end events to allow for any trace inaccuracies (correlations that miss either a stop either a start). The performance of the query will depend on what indexes are on [trace]. If all you have is a primary key of trcID then no query can do miracles, is better you write a cursor loop then. The query in my example would need an index on trcCorrGuid at least, and preferably one on (trcCorrGuid) include (trcCode, trcProgram)

Remus Rusanu
Interesting. I could add the indexes, but I think the join would result in a Cartesian product - there are dozens of traces for the same trcProgram. The trcCodes are not unique at all, and set by the programmer. So I know that for one specific trcProgram, 0250 is the start, and 0299 is the end. But another program might use 0250 or 0299 for something totally different. If I had written a "start" or "stop" column into my trace, then I think I could make this work. I was thinking about a cursor...
NealWalters
For each pair {trcCorrGuid, trcProgram} there is only one start and one stop record, right? The the full join will do just fine. As for filtering out start/stop codes I'd recommend creating a lookup table of (trcProgram, trcCode, isStart, isStop) and then using this table as a join in the inner queries to filter start and stop events.
Remus Rusanu