For a particular project, we acquire data for a number of events and collect variables about those events at the same time. After the data has been collected, we perform a user-customizable analysis on said data to determine whatever it is that the user is interested in.
The data is collected in a form similar to this:
Timestamp Event 0 x = 0 0 y = 1 3 Event A occurred 3 x = 1 4 Event A occurred 4 x = 2 9 Event B occurred 9 y = 2 9 x = 0
To understand the entire state at any time, the most straightforward approach is to walk over the entire set of data. For example, if I start at time 0, and "analyze" until timestamp 5, I know that at that point x = 2, y = 1, and Event A has occurred twice. That's a really simple example. The user might be (and often is) interested in the time between events, say from A to B, and they might specify the first occurrence of A, then B, or the last occurrence of A, then B (respectively, 9-3 = 6 or 9-4 = 5). Like I said, this is easy to analyze when you're walking over the entire set.
Now, we need to adapt the model to analyze an arbitrary window of time. If we look at 0-N, that's the easy case. But if I look at 1-5, for instance, I have no notion of y unless I begin at 0 and know that y was initially 1 and did not change in the window 1-5.
Our approach is to essentially create a dictionary of variables, and run callbacks on events. If one analysis was "What is x when Event A occurs and time is > 3" then we would run that callback on the first Event A, and it would immediately return because time is not greater than 3. It would run again at 4, and and it would report that x was 1 at t=4.
To adapt to the "time-windowing", I think I am going to (in the background) tack on additional conditions to the analysis. If their analysis is just "What is x when Event A occurs", and the current window is 1-5, then I will change it to "What is x when Event A occurs and time >= 1 and time <= 5". Then if the next window is 6-10, I can readjust the condition as necessary.
My main question is: what pattern does this fit? We are obviously not the first people to approach a problem like this, but I have not been able to find how others have approached it. I probably just don't know what exactly to search on Google. Is there any other approach besides keeping a dictionary of the entire global state for looking up a single state at a given time? Note also that the data could have several, maybe tens of thousands of records, so the fewer iterations over the data set, the better.