views:

41

answers:

1

Hi I am in need of a data storage type and algorithm for keeping track of the status of the last N items I have seen. Each item has a status of Pass or Fail, but the system I am monitoring will deemed to have failed if M items in a row have failed. Once the system is deemed to have failed I then need to scan back through the history of data and find the last window of width W in which all items had a "good" status.

For example, with a M=4 and W = 3:

    1 Good
    2 Good
    3 Good
    4 Good
    5 Good |
    6 Good |- Window of size 3 where all are good.
    7 Good |
    8 Bad
    9 Bad
    10 Good
    11 Good
    12 Bad
    13 Good
    14 Bad
    15 Bad
    16 Bad
    17 Bad  <== System is deemed bad at this point  So scan backwards to find "Good" window.

I know that this is going to end up in something like a regular expression search and have vague recollections of Knuth floating up out the dark recesses of my memory, so could anyone point me towards a simple introduction on how to do this? Also for what it is worth I will be implementing this in C# .Net 3.5 on a Windows XP system seeing 3GB of Ram (and an i7 processor - sniff the machine used to have Windows 7 and it does have 8GB of memory - but that was a story for TDWTF)

Finally I will be scanning numbers of items in the 100,000's to millions in any given run of this system. I won't need to keep track of the entire run, just the subset of all items until a system failure occurs. When that happens I can dump all my collected data and start the process all over again. However for each item I am tracking, I will have to keep at least the pass/fail status, and a 10 char string. So I am looking for suggestions on how to collect and maintain this data in the system as well. Although I am tempted to say - "screw it, it will all fit in memory even if the entire run pass with 100%, so its off to an array for you!"

Thanks for your help.

+5  A: 

I know that this is going to end up in something like a regular expression search
The problem is, actually, much simpler. We can take advantage of the fact that we're searching for subsequences consisting only of bad results (or only good results).

Something like this should work

// how many consecutive bad results we have at this point
int consecutiveFailures = 0;
// same for good results
int consecutivePasses = 0;
for each result
    if result == 'pass' then
        consecutiveFailures = 0;
        ++consecutivePasses;
    else if result == 'fail' then
        consecutivePasses = 0;
        ++consecutiveFailures;        
    end

    if consecutiveFailures == M
        // M consecutive failures, stop processing
        ...
    end
    if consecutivePasses >= W
        // record last set of W consecutive passes for later use
        ...
    end
end
Nikita Rybak
Thats what I get for being sleep deprived and working 14-16 hours a day for the last 2 weeks on this stupid project. D'oh it is simple. and don't get me started on the fact that I shouldn't even be solving this part myself /rant.
Peter M
@Peter Happens to everybody (More often than I care to admit)
Nikita Rybak