Hi, and first of all, thank you for taking the time to read my question.

I am trying to write a script, and I've come across an issue which I am finding hard to solve. I am working with a pair of numbers (for example, 1000 and 2000), and I have an array of pairs of numbers:

$pairs = array(
    array(800, 1100),
    array(1500, 1600),
    array(1900, 2100)
);

What I am trying to find is how to get the ranges between 1000 and 2000 that are not covered by the number pairs. In this example, 1000-1100 is covered by array(800, 1100), 1500-1600 is covered by array(1500, 1600) and 1900-2000 is covered by array(1900, 2100), which leaves 1101-1499 and 1601-1899 uncovered. I hope I am being clear enough.

What I am wondering is how I would make PHP return to me an array of the ranges not covered by the $pairs variable. In this example it would return:

array(
    array(1101, 1499),
    array(1601, 1899)
)

Do you have any idea what would be the best way to do this?

Thank you in advance.

A: 

I would do something like this:

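$begin = 1000;
$end   = 2000;
$uncovered = array();
// assumes $pairs is sorted by start value
foreach ($pairs as $pair) {
  if ($pair[0] > $begin && $begin <= $end) {
    // hole between the current position and the start of this pair
    $uncovered[] = array($begin, min($pair[0] - 1, $end));
  }
  // move past this pair whether or not there was a hole
  $begin = max($begin, $pair[1] + 1);
}
// anything left after the last pair is also uncovered
if ($begin <= $end) {
  $uncovered[] = array($begin, $end);
}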

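This is only an idea, but here is the point: you have one big segment (from 1000 to 2000) and several small ones, and you want every part of the big segment that is not covered by a small one. Imagine tracing along the big segment with a pen!

Start the pen at the beginning of the big segment (begin) and iterate over the small segments in order of their start values. If a small segment starts strictly after the pen, there is a "hole", so you record the range from the pen position to just before that segment. Then move the pen to just past the end of the segment. Whatever is left between the pen and the end of the big segment when you finish is the last hole.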

Hope this helps, and that this is correct!

Aif
+3  A: 

Well, firstly you have to define the problem:

  1. Are the pairs sorted?
  2. Do pairs overlap?
  3. Do you only want the missing ranges within one particular range (this seems to be the case)?
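
The first two questions can be checked mechanically before choosing a strategy. Here is a minimal sketch, assuming each pair is array(low, high) with low <= high. The helper names pairs_are_sorted and pairs_overlap are made up for illustration and are not part of the answer's code:

```php
<?php
// Hypothetical helper: true if pairs are ordered by start, then by end.
function pairs_are_sorted(array $pairs) {
    for ($i = 1; $i < count($pairs); $i++) {
        $prev = $pairs[$i - 1];
        $cur  = $pairs[$i];
        if ($cur[0] < $prev[0] || ($cur[0] == $prev[0] && $cur[1] < $prev[1])) {
            return false;
        }
    }
    return true;
}

// Hypothetical helper: true if any two pairs share at least one integer.
// Assumes $pairs is already sorted (see pairs_are_sorted above).
function pairs_overlap(array $pairs) {
    $maxEnd = null;
    foreach ($pairs as $pair) {
        if ($maxEnd !== null && $pair[0] <= $maxEnd) {
            return true;
        }
        $maxEnd = $maxEnd === null ? $pair[1] : max($maxEnd, $pair[1]);
    }
    return false;
}

$pairs = array(array(800, 1100), array(1500, 1600), array(1900, 2100));
var_dump(pairs_are_sorted($pairs)); // bool(true)
var_dump(pairs_overlap($pairs));    // bool(false)
```

For the data in the question both checks pass, so the sorting and merging steps could be skipped, but they are cheap insurance if the input can vary.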

If pairs aren't sorted, first sort them:

usort($pairs, 'cmp_pair');

function cmp_pair($a, $b) {
  if ($a[0] == $b[0]) {
    if ($a[1] == $b[1]) {
      return 0;
    } else {
      return $a[1] < $b[1] ? -1 : 1;
    }
  } else {
    return $a[0] < $b[0] ? -1 : 1;
  }
}
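
On PHP 7 or later, the same ordering can be written more compactly with the spaceship operator. This is a sketch of an alternative and not part of the answer's original code. The sample data is based on the question, with one extra made-up pair to show the tie-break on the end value:

```php
<?php
$pairs = array(
    array(1900, 2100),
    array(800, 1100),
    array(1500, 1600),
    array(800, 900)   // made-up pair with the same start as another
);

// Order by start value first, then by end value when the starts are equal.
usort($pairs, function ($a, $b) {
    return ($a[0] <=> $b[0]) ?: ($a[1] <=> $b[1]);
});

// $pairs is now ordered: 800-900, 800-1100, 1500-1600, 1900-2100
```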

If overlapping ranges are allowed, transform the list of pairs to a non-overlapping set. Here's one suggestion on how to do that:

$prev = false;
$newpairs = array();
foreach ($pairs as $pair) {
  if ($prev && $prev[1] >= $pair[0]-1) {
    // overlapping or adjacent: extend the previous range
    // eg 100-199 with 200-250 becomes 100-250
    $prev = array($prev[0], max($prev[1], $pair[1]));
  } else {
    if ($prev) {
      $newpairs[] = $prev;
    }
    $prev = $pair;
  }
}
// the last range still has to be stored
if ($prev) {
  $newpairs[] = $prev;
}
$pairs = $newpairs;

Now there shouldn't be any overlapping pairs so the problem becomes a little simpler as you've also got a sorted array.

function missing($start, $end, $pairs) {
  $missing = array();
  $prev = false;
  foreach ($pairs as $pair) {
    // if the current pair starts above the end, we're done
    if ($pair[0] > $end) {
      break;
    }

    // we can ignore any pairs that end before the start
    if ($pair[1] < $start) {
      continue;
    }

    // if the pair encompasses the whole range, nothing is missing
    if ($pair[0] <= $start && $pair[1] >= $end) {
      return array();
    }

    // if this is our first overlapping pair and it starts above
    // the start we can backfill the missing range
    if (!$prev && $pair[0] > $start) {
      $missing[] = array($start, $pair[0]-1);
    }

    // compare this pair to the previous one (if there is one) and
    // fill in the missing range
    if ($prev) {
      $missing[] = array($prev[1]+1, $pair[0]-1);
    }

    // set the previous
    $prev = $pair;
  }

  // if this never got set the whole range is missing
  if (!$prev) {
    $missing[] = array($start, $end);

  // if the last overlapping range ended before the end then
  // we are missing a range from the end of it to the end
  // of the relevant range
  } else if ($prev[1] < $end) {
    $missing[] = array($prev[1]+1, $end);
  }

  // done!
  return $missing;
}

Hope that helps.

cletus
Bruno De Barros
A: 
// your original arrays of integers
$pairs = array(
    array(800, 1100),
    array(1500, 1600),
    array(1900, 2100)
);

// first, normalize the whole thing into a giant list of integers that
// are included in the array pairs, combine and sort numerically
$numbers_in_pairs = array();
foreach($pairs as $set) {
    $numbers_in_pairs = array_merge($numbers_in_pairs, range($set[0], $set[1]));
}
sort($numbers_in_pairs);

// find the min
$min = $numbers_in_pairs[0];

// find the max
$max = $numbers_in_pairs[count($numbers_in_pairs)-1];

Find the array difference

// create an array of all numbers inclusive between the min and max
$all_numbers = range($min, $max);

// the numbers NOT included in the set can be found by doing array_diff
// between the two arrays; array_diff keeps the original keys, so we sort
// the result to reindex it before we iterate over it to get the maxes and mins
$not_in_set = array_diff($all_numbers, $numbers_in_pairs);
sort($not_in_set);
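
A small illustration of why the sort() call matters here: array_diff() keeps the keys of its first argument, so the result has gaps in its indexes until sort() reindexes it from 0. The numbers and variable names below are made up for the demonstration:

```php
<?php
$all  = range(1, 6);           // values 1..6 at keys 0..5
$used = array(2, 3, 5);

$diff = array_diff($all, $used);
$keys_before = array_keys($diff);
var_export($keys_before);      // keys 0, 3, 5: gaps where values were removed
echo "\n";

sort($diff);                   // sort() discards the old keys
var_export($diff);             // values 1, 4, 6 at keys 0, 1, 2
```

Without the reindexing, a loop that reads $not_in_set[$i+1] or $not_in_set[$i-1] would hit missing keys.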

Metadata about the sets we'll use later:

// gather metadata about the numbers that are not inside the set
// $not_in_set_meta['min'] = lowest integer
// $not_in_set_meta['max'] = highest integer
// $not_in_set_meta['mins'] = min boundary integer
// $not_in_set_meta['maxes'] = max boundary integer
$not_in_set_meta = array('mins' => array(), 'maxes' => array());
$last = count($not_in_set) - 1;
if ($last >= 0) {
    $not_in_set_meta['min'] = $not_in_set[0];
    $not_in_set_meta['max'] = $not_in_set[$last];
}
for ($i = 0; $i <= $last; $i++) {
    // a number that stands alone can be BOTH a min and a max,
    // so the two checks are independent of each other

    // a number starts a range if it is the first one or follows a gap
    if ($i == 0 || ($not_in_set[$i] - $not_in_set[$i-1]) > 1) {
        $not_in_set_meta['mins'][] = $not_in_set[$i];
    }
    // a number ends a range if it is the last one or precedes a gap
    if ($i == $last || ($not_in_set[$i+1] - $not_in_set[$i]) > 1) {
        $not_in_set_meta['maxes'][] = $not_in_set[$i];
    }
}

Final output:

// The final variable which we'll dump the ranges not covered into:
$non_sets = array();

while(count($not_in_set_meta['mins']) > 0 && count($not_in_set_meta['maxes'])) {
    $non_sets[] = array(array_shift($not_in_set_meta['mins']), 
                        array_shift($not_in_set_meta['maxes']));
}
// print it out:
var_export($non_sets);

Result:

array (
  0 => 
  array (
    0 => 1101,
    1 => 1499,
  ),
  1 => 
  array (
    0 => 1601,
    1 => 1899,
  ),
)
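
The boundary detection and the pairing of mins with maxes can also be collapsed into a single pass over the sorted numbers. The following is only a sketch of that idea: group_consecutive is a hypothetical helper name, and it assumes its input is a sorted list of distinct integers such as $not_in_set:

```php
<?php
// Hypothetical helper: collapse a sorted list of distinct integers into
// inclusive array(low, high) ranges of consecutive values.
function group_consecutive(array $numbers) {
    $ranges = array();
    foreach ($numbers as $n) {
        $last = count($ranges) - 1;
        if ($last >= 0 && $n == $ranges[$last][1] + 1) {
            $ranges[$last][1] = $n;      // extends the current run
        } else {
            $ranges[] = array($n, $n);   // starts a new run
        }
    }
    return $ranges;
}

var_export(group_consecutive(array(3, 4, 5, 9, 11, 12)));
// ranges: 3-5, 9-9, 11-12
```

A number that stands alone comes out as a range whose low and high values are equal.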

artlung