views:

159

answers:

2

Hi everyone,

I have a list of session start and stop times, in format like so:

23/11/09 15:18:32, 23/11/09 15:18:40
23/11/09 15:20:02, 23/11/09 15:20:32
23/11/09 15:20:10, 23/11/09 15:20:40

This is in excel at the moment, start datetime in one column end datetime in the other.

Basically what I want to do is some post processing on this data. Im not sure what to use yet, whether to develop my own application, or just use excel, which is where I was hoping for help.

Basically what I want to do is work out say, for a time period x, say for simplicity, the hour 15:00 - 16:00. Based on these sessions, I want to work out how many simultaneous connections during that hour.

Any ideas suggestions, how, what to use? excel? vs2008?

I guess the one nice thing about keeping it in excel is then I can easily produce charts, etc... and i have always struggled working with datetime objects in vs.

A: 

I guess a really inefficient method would be to step through the call start value, checking with EACH call end value which is greater, incrementing the counter if it is greater (being simultaneous call)

then at a particular time i could say how many simultaneous connections their are...

dont think it helps me so much if i want to say something like how many simultaneous between 14:00 and 15:00

ah

baron
+2  A: 

I believe this is not an easy problem at all. Let me re-state your problem: you can consider each connection as a segment on a line, with a start value and end value. What you are trying to find is the largest subset of segments which have a non-null intersection; the number of elements in that subset is the number of concurrent connection. If you can resolve that, finding the answer for a limited time-span should be trivial (just "truncate" each connection so that if the start value is before the period start, the connection starts at the period start).
Now the reason I think this is difficult is that if you take three segments, say, 1:00 to 2:00, 1:00 to 1:15 and 1:45 to 2:00, all 3 overlap, but you only have 2 concurrent connections. So simply checking how many other connections overlap each connection won't be enough, you will have to actually consider how the overlaps overlap, so to speak.
Most things can be done in Excel, at least if you use VBA, so I don't think that's really your issue - the issue is the algorithm itself.
Edit: I did some quick digging, and it seems that Marzullo's algorithm is close to what you are trying to do. I'll dig some more, but that might give you a starting point.
Edit 2 One "cheap" way to get a decent answer / approximation would be to break down the period you are considering in small successive intervals (ex: break one hour in 5-minutes segments) and for each segment, count the connexions that overlap. That should be way faster than explicitly trying to identify the segment where the most segments intersect!

Mathias
Breaking into time segments will be fine!Say so I can report on how many concurrent in a 5, 15, 30, 45, 1hr slots, thats perfect!I can't say im a hundred % with you why this makes the algorithm easier. I am confusing myself stupid this afternoon!thanks.
baron
or maybe even, one way would be to go case by case.very rough idea but for example, list all call start - call end times in a listbox. user clicks call - now I check how many concurrent connections are during this call.then I step through each CallStart>CallEnd value, checking concurrent connections during this call period.Does that make sense?
baron
Just to clarify, my "easy" suggestion is to 1) select what period you want to cover (say, 2pm to 3pm), and then break down the period in smaller intervals (say, 5 mins) - and simply count how many connections intersect each of the 12 5 min intervals. These are not guaranteed to overlap, but this should give you a close estimation. And your number of overlapping connection in the hour is the max observed over the smaller intervals.
Mathias