views:

462

answers:

4

I need to come up with an analysis of simultaneus events, when having only starttime and duration of each event.

Details

I've a standard CDR call detail record, that contains among others:

  • calldate (timedate of each call start
  • duration (int, seconds of call duration)
  • channel (a string)

What I need to come up with is some sort of analysys of simultaneus calls on each second, for a given timedate period. For example, a graph of simultaneous calls we had yesterday.

(The problem is the same if we have visitors logs with duration on a website and wish to obtain simultaneous clients for a group of web-pages)

What would your algoritm be?

I can iterate over records in the given period, and fill an array, where each bucket of the array corresponds to 1 second in the overall period. This works and seems to be fast, but if the timeperiod is big (say..1 year), I would need lots of memory (3600x24x365x4 bytes ~ 120MB aprox).

This is for a web-based, interactive app, so my memory footprint should be small enough.

Edit

By simultaneous, I mean all calls on a given second. Second would be my minimum unit. I cannot use something bigger (hour for example) becuse all calls during an hour do not need to be held at the same time.

+1  A: 

I would implement this on the database. Using a GROUP BY clause with DATEPART, you could get a list of simultaneous calls for whatever time period you wanted, by second, minute, hour, whatever.

On the web side, you would only have to display the histogram that is returned by the query.

Eric Z Beard
A: 

@eric-z-beard: I would really like to be able to implement this on the database. I like your proposal, and while it seems to lead to something, I dont quite fully understand it. Could you elaborate? Please recall that each call will span over several seconds, and each second need to count. If using DATEPART (or something like it on MySQL), what second should be used for the GROUP BY. See note on simultaneus.

Elaborating over this, I found a way to solve it using a temporary table. Assuming temp holds all seconds from tStart to tEnd, I could do

SELECT temp.second, count(call.id)
FROM call, temp
WHERE temp.second between (call.start and call.start + call.duration)
GROUP BY temp.second

Then, as suggested, the web app should use this as a histogram.

Pablo Alsina
A: 

You can use a static Numbers table for lots of SQL tricks like this. The Numbers table simply contains integers from 0 to n for n like 10000.

Then your temp table never needs to be created, and instead is a subquery like:

SELECT StartTime + Numbers.Number AS Second
FROM Numbers
Cade Roux
A: 

You can create table 'simultaneous_calls' with 3 fields:

yyyymmdd  Char(8),
day_second Number,  -- second of the day,
count          Number   -- count of simultaneous calls
Your web service can take 'count' value from this table and make some statistics.

Simultaneous_calls table will be filled by some batch program which will be started every day after end of the day.

Assuming that you use Oracle, the batch may start a PL/SQL procedure which does the following:

  1. Appends table with 24 * 3600 = 86400 records for each second of the day, with default 'count' value = 0.
  2. Defines the 'day_cdrs' cursor for the query:
Select to_char(calldate, 'yyyymmdd')              yyyymmdd,
         (calldate - trunc(calldate)) * 24 * 3600   starting_second,
         duration                                              duration
From cdrs
Where cdrs.calldate >= Trunc(Sysdate -1)
    And cdrs.calldate 
  1. Iterates the cursor to increment 'count' field for the seconds of the call:

For cdr in day_cdrs
Loop 
   Update simultaneos_calls
   Set      count = count + 1
   Where yyyymmdd = cdr.yyyymmdd
       And day_second Between cdr.starting_second And cdr.starting_second + cdr.duration;
End Loop;
Aleksey Otrubennikov