views:

609

answers:

4

Given an array A which holds a permutation of 1,2,...,n. A sub-block A[i..j] of an array A is called a valid block if all the numbers appearing in A[i..j] are consecutive numbers (may not be in order.

Given an array A= [ 7 3 4 1 2 6 5 8] the valid blocks are [3 4], [1,2], [6,5], [3 4 1 2], [3 4 1 2 6 5], [7 3 4 1 2 6 5], [7 3 4 1 2 6 5 8]

Give an O( n log n) algorithm to count the number of valid blocks.

-- Archana

A: 

I don't think you need an algorithm. Here is what you need. Summation (i =0 to i = n-1 ) (n - i) * (i + 1)!

bhups
Do you imply that for a given n, the number of valid blocks is constant ???
mjv
I guess I read the problem incorrectly. What I was thinking is all the possible valid blocks from [1, n]. thanks for pointing that out.
bhups
+1  A: 

This is not quite a full solution, but a starting point for more thought.

The trick appears to lie in the fact that the set is always 1,2,...,n from which it is clear that the entire set is always the first obviously valid block. If you start from that full set and work your way down, it seems to be a little easier to grasp.

The set with the most valid blocks is M = [1 2 3 4 5 . . . n] because every subset [i..j] is guaranteed to be a valid block. M is a good test case because you can easily determine the actual number of valid blocks: ((n - 1) / 2) * n.

NickLarsen
+1  A: 

Imagine you had a function which, given a list of n integers could tell you if they were a valid block or not.

Imagine you modified this function so it instead returned a count of how many sub blocks were valid, so given [1 3 2 4] would find [1 3 2 4], [1 3 2], [3 2 4], [3 2].

To make that modification, you'd just loop through all the possible sub blocks, passing them to the original function until you had just 2 numbers:

1,3,2,4 is valid
1,3,2 is valid
1,3 is NOT valid
3,2,4 is valid
3,2 is valid
There were 4 valid sub blocks

The first function then:

def isValid(li):
    return (max(li)-min(li) == len(li)-1)

That is, assuming all values are different, the largest value minus the smallest should yield the length of the array minus 1. For [1 3 2 4], largest = 4, smallest = 1, max-min=3, length-1 = 3, job done.

The main checking function:

def countValidSubs(li):
    count = 0
    length = len(li)
    for start in range(0,length-2):
        for newlen in range(length-start,1,-1):
            newli = li[start:start+newlen]
            if isValid(newli):
                print(','.join(`i` for i in newli)+" is valid")
                count += 1
            else:
                print(','.join(`i` for i in newli)+" is NOT valid")
    return count

Then just call like:

countValidSubs([1, 3, 2, 4, 5, 7, 9, 8, 6])

(Answer there is 14, by the way)

The only problem with this as a homework answer is that it is O(n2/2)

Phil H
+1 even though it's not n log n
mdma
`isValid` should have an extra line before the return that reads `if len(li) <= 1: return false` for it to be correct.
Allen
+14  A: 

Here's a worst-case O(n log n) divide and conquer algorithm. Given a non-empty sublist of a permutation, divide it into a left half, a middle element, and a right half. Recursively compute the number of blocks contained in the left half and the number of blocks contained in the right half. Now in O(n) time, compute the number of blocks that include the middle element as follows.

Observe that the intersection of two valid blocks is either empty or a valid block and that the whole permutation is a valid block. Together, these facts imply the existence of closures: unique minimal valid blocks that contain a specified (non-contiguous) subsequence. Define a left extension to be a closure of the middle element and an element not in the right half. Define a right extension to be a closure of the middle element and an element not in the left half. The left extensions (respectively, the right extensions) are totally ordered with respect to the sublist relation. They can be computed in order in linear time by means of a simple work-list algorithm.

Observe now that the union of two overlapping valid blocks is itself a valid block. I claim that every valid block containing the middle element can be written as the union of the left extension generated by the leftmost element with the right extension generated by the rightmost element. To count unions of this form, iterate through the left extensions in increasing order. Maintain pointers to the least right extension whose rightmost element is not left of the left extension's rightmost, and to the least whose leftmost element is left of the left extension's leftmost. Because of monotonicity, these pointers can only move to larger extensions, so the total work is linear. They bound above and below the eligible partners for the current left extension.

C++ implementation:

#include <algorithm>
#include <cstdio>
#include <cstdlib>
#include <queue>
#include <stdexcept>
#include <vector>

namespace {
typedef std::vector<int> IntVector;

struct Interval {
  int left;
  int right;
};

Interval MakeInterval(int left, int right) {
  Interval i = {left, right};
  return i;
}

typedef std::vector<Interval> IntervalVector;

enum Direction {
  kLeft,
  kRight,
};

// Finds the valid intervals obtained by starting with [pi[mid],
// pi[mid]] and repeatedly extending in direction dir
//
// O(right_boundary - left_boundary)
void FindExtensions(const IntVector& pi, const IntVector& pi_inv,
                    int left_boundary, int right_boundary,
                    Direction dir, IntervalVector* extensions) {
  int mid = left_boundary + (right_boundary - left_boundary) / 2;
  int left = mid;
  int right = mid;
  int lower = pi[mid];
  int upper = pi[mid];
  std::queue<int> worklist;
  while (true) {
    if (worklist.empty()) {
      extensions->push_back(MakeInterval(left, right));
      if (dir == kLeft) {
        if (left == left_boundary) break;
        --left;
        worklist.push(left);
      } else {
        if (right == right_boundary) break;
        ++right;
        worklist.push(right);
      }
    } else {
      int i = worklist.front();
      worklist.pop();
      if (i < left) {
        if (i < left_boundary) break;
        for (int j = left - 1; j >= i; --j) worklist.push(j);
        left = i;
      } else if (right < i) {
        if (right_boundary < i) break;
        for (int j = right + 1; j <= i; ++j) worklist.push(j);
        right = i;
      }
      int x = pi[i];
      if (x < lower) {
        for (int y = lower - 1; y > x; --y) worklist.push(pi_inv[y]);
        lower = x;
      } else if (upper < x) {
        for (int y = upper + 1; y < x; ++y) worklist.push(pi_inv[y]);
        upper = x;
      }
    }
  }
}

int CountValidRecursive(const IntVector& pi, const IntVector& pi_inv,
                        int left, int right) {
  if (right < left) return 0;
  int mid = left + (right - left) / 2;
  int count = CountValidRecursive(pi, pi_inv, left, mid - 1) +
      CountValidRecursive(pi, pi_inv, mid + 1, right);
  IntervalVector left_exts;
  FindExtensions(pi, pi_inv, left, right, kLeft, &left_exts);
  IntervalVector right_exts;
  FindExtensions(pi, pi_inv, left, right, kRight, &right_exts);
  typedef IntervalVector::const_iterator IVci;
  IVci first_good = right_exts.begin();
  IVci first_bad = right_exts.begin();
  for (IVci ext = left_exts.begin(); ext != left_exts.end(); ++ext) {
    while (first_good != right_exts.end() && first_good->right < ext->right) {
      ++first_good;
    }
    if (first_good == right_exts.end()) break;
    while (first_bad != right_exts.end() && ext->left <= first_bad->left) {
      ++first_bad;
    }
    count += std::distance(first_good, first_bad);
  }
  return count;
}

// Counts the number of intervals in pi that consist of consecutive
// integers
//
// O(n log n)
int CountValid(const IntVector& pi) {
  int n = pi.size();
  IntVector pi_inv(n, -1);
  // Validate and invert pi
  for (int i = 0; i < n; ++i) {
    if (pi[i] < 0 || pi[i] >= n || pi_inv[pi[i]] != -1) {
      throw std::runtime_error("Invalid permutation of {0, ..., n - 1}");
    }
    pi_inv[pi[i]] = i;
  }
  return CountValidRecursive(pi, pi_inv, 0, n - 1);
}

// For testing purposes
int SlowCountValid(const IntVector& pi) {
  int count = 0;
  int n = pi.size();
  for (int left = 0; left < n; ++left) {
    for (int right = left; right < n; ++right) {
      int lower = pi[left];
      int upper = pi[left];
      for (int i = left + 1; i <= right; ++i) {
        if (pi[i] < lower) {
          lower = pi[i];
        } else if (pi[i] > upper) {
          upper = pi[i];
        }
      }
      if (upper - lower == right - left) ++count;
    }
  }
  return count;
}
}  // namespace

int main() {
  int n = 10;
  IntVector pi(n);
  for (int i = 0; i < n; ++i) pi[i] = i;
  do {
    if (SlowCountValid(pi) != CountValid(pi)) {
      fprintf(stderr, "Bad permutation:");
      for (IntVector::const_iterator x = pi.begin(); x != pi.end(); ++x) {
        fprintf(stderr, " %d", *x);
      }
      fputc('\n', stderr);
    }
  } while (std::next_permutation(pi.begin(), pi.end()));
}
That is really impressive. I'm speechless. Congratulations! Takes about 15 seconds on a quadcore to run for a million elements random permutation (even if the result overflows, it's an interesting test), which is very good considering how complex the code is and the STL containers used.
IVlad
One of those answers for which +1 seems utterly inadequate. I run some timings on the functions and the `CountValid` starts beating `SlowCountValid` at around n = 47.
Unreason
Here's the performance compared for n=3..99 with 100 samples (http://i32.tinypic.com/do6n3b.png) and with 1000 samples (http://i32.tinypic.com/33kfb7q.png)
Unreason
The intersection of two valid blocks is not necessarily non-empty and a valid block. Consider [1 2 3 4 5] and [5 6 7], sub blocks of [1..7]. Their intersection is non-empty, [5] and an invalid block. By the way, thanks to Ben Schwehn who pointed out that my "answer" was incorrect. So embarrassing.
Allen
Allen, +1 because you're technically right, but if we define one-element blocks to be valid, everything works out OK.