views:

65

answers:

2

I have couple of data, which includes the beginning of a transaction and its end in a DateTime format. I want to figure out how many transaction are running at the same time. Is there any algorithm that could solve my problem?

+2  A: 

If you have a list of "begins" and "ends", and you want to know the max number of concurrently open connections (or at each point how many connections are open), I would do the following (in pseudo code):

  1. Put each of the "begins" and "ends" into a data structure with timestamp and another field whose value will be +1 for begin and -1 for end
  2. Put all these elements into an array, and sort it by timestamp
  3. Create another array that in each entry has the sum of +1s and -1s from the previous array until that index
  4. The max value of that array is your answer

Example:

Input: 

Connection A opened at 1 closed at 3
Connection B opened at 2 closed at 6
Connection C opened at 4 closed at 7
Connection D opened at 5 closed at 8

Create the following structure:

Timestamp:           1  2  3  4  5  6  7  8
First array sorted: +1 +1 -1 +1 +1 -1 -1 -1 
Second array:        1  2  1  2  3  2  1  0

Max open connections = 3
Number of connections open at timestamp 6 = 2

The second array counts the number of concurrently open connections in each timestamp, and is calculated as follows (pseudo-code):

second_array[i] = sum(first_array[1..i])
Roee Adler
It means that you'll only count how many transaction were opened and how much are closed?
furtelwart
@guerda: not sure what you mean... would you please elaborate?
Roee Adler
You ignore the fact, which transaction is closed at which time. You only count that _one_ transaction is closed. So you simplify the problem very good, imho.
furtelwart
A: 

Full solution given here

Paddy3118