views:

141

answers:

1

I need an idea for an efficient index/search algorithm or datastructure for a particular problem. So far I could not come up with something fast and elegant...

Consider a huge collection of objects with every object having a date period (start / end). Huge as in a few million elements. The element periods can be large or small, they can overlap each other partially or not at all. In Java notation, something like this:

interface Period 
{
  long getStart(); // millis since the epoch
  long getEnd();
  boolean intersects(Period p); // trivial intersection check with another period
}

Collection<Period> c = new ArrayList<Period>(); // assume a lot of elements

Now, the problem is to efficiently get to those objects which intersect a given input. For c as an ArrayList this could look like...

Collection<Period> getIntersectingPeriods(Period p)
{
  // how to implement this without full iteration?
  Collection<Period> result = new ArrayList<Period>();
  for (Period element : c)
    if (element.intersects(p))
      result.add(element);
  return result;
}

But full iteration is not the way because there are too many elements and this is an essential operation conducted by many users. Instead of ArrayList something better is needed with some built-in period indexing capabilities.

My best solution involves maintaining two sorted lists internally and conducting 4 binary searches and some list iteration for every request. Any better ideas?

+8  A: 

Interval trees will do:

http://en.wikipedia.org/wiki/Interval_tree

Jules
That was quick and to the point. Thanks a lot!
deepc