views:

228

answers:

3

I have got a series of time intervals (t_start,t_end), that cannot overlap, i.e.: t_end(i) > t_start(i+1). I want to do the following operations:

1) Add new (Union of) intervals [ {(1,4),(8,10)} U (3,7) = {(1,7),(8,10)} ]
2) Take intervals out [ (1,7) - (3,5) = {(1,3),(5,7)}
3) Checking whether a point or a interval overlaps with an interval in my series (intersection)
4) Finding the first "non-interval" of a minimum length after some point [ {(1,4),(7,8)}: there is a "non-interval" of length 3 between 4 and 7 ].

I want to know good ways of implementing this, with low complexities (log n for all operations would do it).

Related question: http://stackoverflow.com/questions/1580185/data-structure-for-quick-time-interval-look-up

+4  A: 

Without knowing anymore specifics, I'd suggest reading about Interval Trees. Interval trees are a special 1 dimensional case of more generic kd-trees, and have a O(n log n) construction time, and O(log n) typical operation times. Exact algorithm implementations you'd need to find yourself, but you can start by looking at CGAL.

Kornel Kisielewicz
A: 

I would start with an array, in the following manner:

  • use some conversion to convert time time a number (unix epoch style?!)
  • use the interval start as the key, and the interval end as the value of the elements in the (one dimensional) array.

Don't know about complexities, there, but you should be able to easily set, unset, and sort the array (by key and value) for this to work. Also, working with plain int's rules, speed-wise.

Ralf MC
+6  A: 

It sounds like you could just use a balanced binary tree of all the boundary times.

For example, represent {(1,4), (8,10), (12,15)} as a tree containing 1, 4, 8, 10, 12, and 15.

Each node needs to say whether it's the start or end of an interval. So:

                          8 (start)
                         /        \
                1 (start)         12 (start)
                      \             /      \
                     4 (end)   10 (end)   15 (end)

(Here all the "end" nodes ended up at the bottom by coincidence.)

Then I think you can have all your operations in O(log n) time. To add an interval:

  • Find the start time. If it's already in the tree as a start time, you can leave it there. If it's already in the tree as an end time, you'll want to remove it. If it's not in the tree and it doesn't fall during an existing interval, you'll want to add it. Otherwise you don't want to add it.

  • Find the stop time, using the same method to find out if you need to add it, remove it, or neither.

  • Now you just want to add or remove the abovementioned start and stop nodes and, at the same time, delete all the existing nodes in between. To do this you only need to rebuild the tree nodes at or directly above those two places in the tree. If the height of the tree is O(log n), which you can guarantee by using a balanced tree, this takes O(log n) time.

(Disclaimer: If you're in C++ and doing explicit memory management, you might end up freeing more than O(log n) pieces of memory as you do this, but really the time it takes to free a node should be billed to whoever added it, I think.)

Removing an interval is largely the same.

Checking a point or interval is straightforward.

Finding the first gap of at least a given size after a given time can be done in O(log n) too, if you also cache two more pieces of information per node:

  • In each start node (other than the leftmost), the size of the gap immediately to the left.

  • In every node, the size of the largest gap that appears in that subtree.

To find the first gap of a given size that appears after a given time, first find that time in the tree. Then walk up until you reach a node that claims to contain a large enough gap. If you came up from the right, you know this gap is to the left, so you ignore it and keep walking up. Otherwise you came from the left. If the node is a start node, check to see if the gap to its left is large enough. If so, you're done. Otherwise, the large-enough gap must be somewhere to the right. Walk down to the right and continue down until you find the gap. Again, because the height of the tree is O(log n), walking it three times (down, up, and possibly down again) is O(log n).

Jason Orendorff
Um, what about interval union?
Kornel Kisielewicz
To add two general interval-sets is O(n) with this data structure. Adding a single interval to an interval-set is O(log n).
Jason Orendorff
I have yet to check for correctness, but this is a great answer anyway. It's not a simple algorithm, but if well encapsulated, it would be really easy to link.
Luís Guilherme