A: 

This is just off the top of my head and is untested, so it may need some tweaking:

SELECT DISTINCT
     T.TransactionID,
     T.ItemID,
     T.Code,
     T.EffectiveDate,
     T.CreateDate
FROM
     My_Table T
INNER JOIN (
     SELECT
          T1.TransactionID,
          T2.TransactionID,
          T3.TransactionID
     FROM
          My_Table T1
     INNER JOIN My_Table T2 ON
          T2.ItemID = T1.ItemID AND
          T2.Code = 61 AND
          T2.EffectiveDate > T1.EffectiveDate
     INNER JOIN My_Table T3 ON
          T3.ItemID = T1.ItemID AND
          T3.Code = 9 AND
          T3.EffectiveDate > T2.EffectiveDate
     WHERE
          T1.Code = 51
     ) SQ ON
     SQ.TransactionID = T1.TransactionID OR
     SQ.TransactionID = T2.TransactionID OR
     SQ.TransactionID = T3.TransactionID
Tom H.
Obviously this would be limited in the sense that you are hard-coding the number of sequences you are looking for and would require you to continue to append subqueries/joins as you scaled the number of sequences you want to pull out.
chadhoc
+1  A: 

A DB approach is ideal if it's simple (i.e. no cursors or overly complicated stored procedure)

I don't believe a pure DB approach ("pure" meaning only using SQL SELECT) is practical because the type of SQL I envision would require very convoluted self-joins, field concatenation, MAX() functions, etc. This type of SQL might be a fun academic answer to a puzzle in Joe Celko's "SQL for Smarties" book but I don't think that's appropriate for production code.

I think the realistic approach is to write some kind of loop that keeps track of state. Your problem in the general sense is very similar to writing code for stateful inspection of TCPIP packets for spam filtering or scanning credit-card transactions for fraudulent patterns. All these problems share similar characteristics: the actions you take on the current row(record) depends on what records you saw previously (the context)... and that aspect requires holding state variables.

If you want to avoid round-tripping the data for analysis, it looks like Transact-SQL is the best way for performance. Or use hosted CLR to take advantage of C# syntax while still keeping the processing within the database engine.

JasDev
+1  A: 

Actually, you could get a couple of fairly simple solutions leveraging ranking/windowing functions and/or CTEs and recursive CTEs.

Create a procedure that accepts a character-based comma-separated list of Code values you are looking for in the sequence you want them in - use any of a dozen possible ways to split this list into a table/set that is made up of the sequence and Code value, resulting in a table with a structure like this:

declare @sequence table (sequence int not null, Code int not null);

Once you have this, it's simply a matter of sequencing the source set based on joining the sequenced table to the source table on the same Code values for a given ItemId - once you have the source set filtered and sequenced, you can simply join again based on the matching sequence values - this is sounding complex, but in reality it would be a single query like this:

with srcData as (
 select row_number() over(order by t.EffectiveDate) as rn,
   t.TransactionId, t.ItemId, t.Code, t.EffectiveDate, t.CreateDate
 from #TableName t
 join @sequence s
 on  t.Code = s.Code
 where t.ItemId = @item_id
)
select d.TransactionId, d.ItemId, d.Code, d.EffectiveDate, d.CreateDate
from srcData d
join @sequence s
on  d.rn = s.sequence
and  d.Code = s.Code
order by d.rn;

This alone won't guarantee that you get a result-set that is identical to what you are looking for, but staging the data into a temp table and adding a few simple checks around the code would do the trick (for example, add a checksum validation and a sum of the code values)

declare @tempData table (rn int, TransactionId smallint, ItemId smallint, Code smallint, EffectiveDate datetime, CreateDate datetime);

with srcData as (
 select row_number() over(order by t.EffectiveDate) as rn,
   t.TransactionId, t.ItemId, t.Code, t.EffectiveDate, t.CreateDate
 from #TableName t
 join @sequence s
 on  t.Code = s.Code
 where t.ItemId = @item_id
)
insert @tempData
  (rn, TransactionId, ItemId, Code, EffectiveDate, CreateDate)
select d.rn, d.TransactionId, d.ItemId, d.Code, d.EffectiveDate, d.CreateDate
from srcData d
join @sequence s
on  d.rn = s.sequence
and  d.Code = s.Code;

-- Verify we have matching hash/sums 
if
(
 ( (select sum(Code) from @sequence) = (select sum(Code) from @tempData) )
 and
 ( (select checksum_agg(checksum(sequence, Code)) from @sequence) = (select checksum_agg(checksum(rn, Code)) from @tempData) )
)
begin;
 -- Match - return the resultset
 select d.TransactionId, d.ItemId, d.Code, d.EffectiveDate, d.CreateDate
 from @tempData d
 order by d.rn;

end;

If you want to do it all inline, you could use a different approach leveraging CTEs and recursion to perform a running sum/total and OrdPath-like comparison as well (though you'd still need to parse the sequence character data out into a dataset)

-- Sequence data with running total
with sequenceWithRunningTotal as
(
 -- Anchor
 select s.sequence, s.Code, s.Code as runningTotal, cast(s.Code as varchar(8000)) as pth,
   sum(s.Code) over(partition by 1) as sumCode
 from @sequence s
 where s.sequence = 1
 -- Recurse
 union all
 select s.sequence, s.Code, b.runningTotal + s.Code as runningTotal,
   b.pth + '.' + cast(s.Code as varchar(8000)) as pth,
   b.sumCode as sumCode
 from @sequence s
 join sequenceWithRunningTotal b
 on  s.sequence = b.sequence + 1
),
-- Source data with sequence value
srcData as 
(
 select row_number() over(order by t.EffectiveDate) as rn,
   t.TransactionId, t.ItemId, t.Code, t.EffectiveDate, t.CreateDate,
   sum(t.Code) over(partition by 1) as sumCode
 from #TableName t
 join @sequence s
 on  t.Code = s.Code
 where t.ItemId = @item_id
),
-- Source data with running sum
sourceWithRunningSum as
(
 -- Anchor
 select t.rn, t.TransactionId, t.ItemId, t.Code, t.EffectiveDate, t.CreateDate,
   t.Code as runningTotal, cast(t.Code as varchar(8000)) as pth,
   t.sumCode
 from srcData t
 where t.rn = 1
 -- Recurse
 union all
 select t.rn, t.TransactionId, t.ItemId, t.Code, t.EffectiveDate, t.CreateDate,
   s.runningTotal + t.Code as runningTotal,
   s.pth + '.' + cast(t.Code as varchar(8000)) as pth,
   t.sumCode
 from srcData t
 join sourceWithRunningSum s
 on  t.rn  = s.rn + 1
)
select d.TransactionId, d.ItemId, d.Code, d.EffectiveDate, d.CreateDate
from sourceWithRunningSum d
join sequenceWithRunningTotal s
on  d.rn = s.sequence
and  d.Code = s.Code
and  d.runningTotal = s.runningTotal
and  d.pth = s.pth
and  d.sumCode = s.sumCode
order by d.rn;
chadhoc