views:

165

answers:

6

Problem: Given an input array of integers of size n, and a query array of integers of size k, find the smallest window of input array that contains all the elements of query array and also in the same order.

I have tried below approach.

        int[] inputArray = new int[] { 2, 5, 2, 8, 0, 1, 4, 7 };
        int[] queryArray = new int[] { 2, 1, 7 };

Will find the position of all query array element in inputArray.

public static void SmallestWindow(int[] inputArray, int[] queryArray)
    {
        Dictionary<int, HashSet<int>> dict = new Dictionary<int, HashSet<int>>();

        int index = 0;
        foreach (int i in queryArray)
        {
            HashSet<int> hash = new HashSet<int>();
            foreach (int j in inputArray)
            {
                index++;
                if (i == j)
                    hash.Add(index); 
            }
            dict.Add(i, hash);
            index = 0;
        }
      // Need to perform action in above dictionary.??
    }

I got following dictionary

  1. int 2--> position {1, 3}
  2. int 1 --> position {6}
  3. int 7 --> position {8}

Now I want to perform following step to findout minimum window

  1. Compare int 2 position to int 1 position. As (6-3) < (6-1)..So I will store 3, 6 in a hashmap.

  2. Will compare the position of int 1 and int 7 same like above.

I cannot understand how I will compare two consecutive value of a dictionary. Please help.

A: 

I don't see how using HashSet and Dictionary will help you in this. Were I faced with this problem, I'd go about it quite differently.

One way to do it (not the most efficient way) is shown below. This code makes the assumption that queryArray contains at least two items.

int FindInArray(int[] a, int start, int value)
{
    for (int i = start; i < a.Length; ++i)
    {
        if (a[i] == value)
            return i;
    }
    return -1;
}

struct Pair
{
    int first;
    int last;
}

List<Pair> foundPairs = new List<Pair>();

int startPos = 0;
bool found = true;
while (found)
{
    found = false;
    // find next occurrence of queryArray[0] in inputArray
    startPos = FindInArray(inputArray, startPos, queryArray[0]);
    if (startPos == -1)
    {
        // no more occurrences of the first item
        break;
    }
    Pair p = new Pair();
    p.first = startPos;
    ++startPos;
    int nextPos = startPos;
    // now find occurrences of remaining items
    for (int i = 1; i < queryArray.Length; ++i)
    {
        nextPos = FindInArray(inputArray, nextPos, queryArray[i]);
        if (nextPos == -1)
        {
            break;  // didn't find it
        }
        else
        {
            p.last = nextPos++;
            found = (i == queryArray.Length-1);
        }
    }
    if (found)
    {
        foundPairs.Add(p);
    }
}

// At this point, the foundPairs list contains the (start, end) of all
// sublists that contain the items in order.
// You can then iterate through that list, subtract (last-first), and take
// the item that has the smallest value.  That will be the shortest sublist
// that matches the criteria.

With some work, this could be made more efficient. For example, if 'queryArray' contains [1, 2, 3] and inputArray contains [1, 7, 4, 9, 1, 3, 6, 4, 1, 8, 2, 3], the above code will find three matches (starting at positions 0, 4, and 8). Slightly smarter code could determine that when the 1 at position 4 is found, since no 2 was found prior to it, that any sequence starting at the first position would be longer than the sequence starting at position 4, and therefore short-circuit the first sequence and start over at the new position. That complicates the code a bit, though.

Jim Mischel
+2  A: 

The algorithm:
For each element in the query array, store in a map M (V → (I,P)), V is the element, I is an index into the input array, P is the position in the query array. (The index into the input array for some P is the largest such that query[0..P] is a subsequence of input[I..curr])

Iterate through the array.
If the value is the first term in the query array: Store the current index as I.
Else: Store the value of the index of the previous element in the query array, e.g. M[currVal].I = M[query[M[currVal].P-1]].I.
If the value is the last term: Check if [I..curr] is a new best.

Complexity
The complexity of this is O(N), where N is the size of the input array.

N.B.
This code expects that no elements are repeated in the query array. To cater for this, we can use a map M (V → listOf((I,P))). This is O(N*hC(Q)), where hC(Q) is the count of the mode for the query array..
Even better would be to use M (V → listOf((linkedList(I), P))). Where repeated elements occur consecutively in the query array, we use a linked list. Updating those values then becomes O(1). The complexity is then O(N*hC(D(Q))), where D(Q) is Q with consecutive terms merged.

Implementation
Sample java implementation is available here. This does not work for repeated elements in the query array, nor do error checking, etc.

Nabb
A: 

You want not a HashSet but a (sorted) tree or array as the value in the dictionary; the dictionary contains mappings from values you find in the input array to the (sorted) list of indices where that value appears.

Then you do the following

  • Look up the first entry in the query. Pick the lowest index where it appears.
  • Look up the second entry; pick the lowest entry greater than the index of the first.
  • Look up the third; pick the lowest greater than the second. (Etc.)
  • When you reach the last entry in the query, (1 + last index - first index) is the size of the smallest match.
  • Now pick the second index of the first query, repeat, etc.
  • Pick the smallest match found from any of the starting indices.

(Note that the "lowest entry greater" is an operation supplied with sorted trees, or can be found via binary search on a sorted array.)

The complexity of this is approximately O(M*n*log n) where M is the length of the query and n is the average number of indices at which a given value appears in the input array. You can modify the strategy by picking that query array value that appears least often for the starting point and going up and down from there; if there are k of those entries (k <= n) then the complexity is O(M*k*log n).

Rex Kerr
A: 

After you got all the positions(indexes) in the inputArray:

2 --> position {0,2}   // note: I change them to 0-based array
1 --> position {5,6}  // I suppose it's {5,6} to make it more complex, in your code it's only {5}
7 --> position {7}

I use a recursion to get all possible paths. [0->5->7] [0->6->7] [2->5->7] [2->6->7]. The total is 2*2*1=4 possible paths. Obviously the one who has Min(Last-First) is the shortest path(smallest window), those numbers in the middle of the path don't matter. Here comes the code.

 struct Pair
 {
     public int Number;  // the number in queryArray
     public int[] Indexes;  // the positions of the number
 }
 static List<int[]> results = new List<int[]>(); //store all possible paths
 static Stack<int> currResult = new Stack<int>(); // the container of current path
 static int[] inputArray, queryArray; 
 static Pair[] pairs;

After the data structures, here is the Main.

inputArray = new int[] { 2, 7, 1, 5, 2, 8, 0, 1, 4, 7 }; //my test case
queryArray = new int[] { 2, 1, 7 };
pairs = (from n in queryArray
      select new Pair { Number = n, Indexes = inputArray.FindAllIndexes(i => i == n) }).ToArray();
Go(0);

FindAllIndexes is an extension method to help find all the indexes.

public static int[] FindAllIndexes<T>(this IEnumerable<T> source, Func<T,bool> predicate)
{
     //do necessary check here, then
     Queue<int> indexes = new Queue<int>();
     for (int i = 0;i<source.Count();i++)
           if (predicate(source.ElementAt(i))) indexes.Enqueue(i);
     return indexes.ToArray();
}

The recursion method:

static void Go(int depth)
{
    if (depth == pairs.Length)
    {
        results.Add(currResult.Reverse().ToArray());
    }
    else
    {
        var indexes = pairs[depth].Indexes;
        for (int i = 0; i < indexes.Length; i++)
        {
            if (depth == 0 || indexes[i] > currResult.Last())
            {
                currResult.Push(indexes[i]);
                Go(depth + 1);
                currResult.Pop();
            }
        }
    }
}

At last, a loop of results can find the Min(Last-First) result(shortest window).

Danny Chen
A: 

Algorithm:

  1. get all indexes into the inputArray of all queryArray values
  2. order them ascending by index
  3. using each index (x) as a starting point find the first higher index (y) such that the segment inputArray[x-y] contains all queryArray values
  4. keep only those segments that have the queryArray items in order
  5. order the segments by their lengths, ascending

c# implementation:

First get all indexes into the inputArray of all queryArray values and order them ascending by index.

public static int[] SmallestWindow(int[] inputArray, int[] queryArray)
{
    var indexed = queryArray
        .SelectMany(x => inputArray
                             .Select((y, i) => new
                                 {
                                     Value = y,
                                     Index = i
                                 })
                             .Where(y => y.Value == x))
        .OrderBy(x => x.Index)
        .ToList();

Next, using each index (x) as a starting point find the first higher index (y) such that the segment inputArray[x-y] contains all queryArray values.

    var segments = indexed
        .Select(x =>
            {
                var unique = new HashSet<int>();
                return new
                    {
                        Item = x,
                        Followers = indexed
                            .Where(y => y.Index >= x.Index)
                            .TakeWhile(y => unique.Count != queryArray.Length)
                            .Select(y =>
                                {
                                    unique.Add(y.Value);
                                    return y;
                                })
                            .ToList(),
                        IsComplete = unique.Count == queryArray.Length
                    };
            })
        .Where(x => x.IsComplete);

Now keep only those segments that have the queryArray items in order.

    var queryIndexed = segments
        .Select(x => x.Followers.Select(y => new
            {
                QIndex = Array.IndexOf(queryArray, y.Value),
                y.Index,
                y.Value
            }).ToArray());

    var queryOrdered = queryIndexed
        .Where(item =>
            {
                var qindex = item.Select(x => x.QIndex).ToList();
                bool changed;
                do
                {
                    changed = false;
                    for (int i = 1; i < qindex.Count; i++)
                    {
                        if (qindex[i] <= qindex[i - 1])
                        {
                            qindex.RemoveAt(i);
                            changed = true;
                        }
                    }
                } while (changed);
                return qindex.Count == queryArray.Length;
            });

Finally, order the segments by their lengths, ascending. The first segment in the result is the smallest window into inputArray that contains all queryArray values in the order of queryArray.

    var result = queryOrdered
        .Select(x => new[]
            {
                x.First().Index,
                x.Last().Index
            })
        .OrderBy(x => x[1] - x[0]);

    var best = result.FirstOrDefault();
    return best;
}

test it with

public void Test()
{
    var inputArray = new[] { 2, 1, 5, 6, 8, 1, 8, 6, 2, 9, 2, 9, 1, 2 };
    var queryArray = new[] { 6, 1, 2 };

    var result = SmallestWindow(inputArray, queryArray);

    if (result == null)
    {
        Console.WriteLine("no matching window");
    }
    else
    {
        Console.WriteLine("Smallest window is indexes " + result[0] + " to " + result[1]);
    }
}

output:

Smallest window is indexes 3 to 8
Handcraftsman
A: 

Thank you everyone for your inputs. I have changed my code a bit and find it working. Though it might not be very efficient but I'm happy to solve using my head :). Please give your feedback

Here is my Pair class with having number and position as variable

    public class Pair
    {
    public int Number;
    public List<int> Position;
    }

Here is a method which will return the list of all Pairs.

     public static Pair[]  GetIndex(int[] inputArray, int[] query)
      {
        Pair[] pairList = new Pair[query.Length]; 
        int pairIndex = 0;
        foreach (int i in query)
        {
            Pair pair = new Pair();
            int index = 0;
            pair.Position = new List<int>();
            foreach (int j in inputArray)
            {                    
                if (i == j)
                {
                    pair.Position.Add(index);
                }
                index++;
            }
            pair.Number = i;
            pairList[pairIndex] = pair;
            pairIndex++;
        }
        return pairList;
    }

Here is the line of code in Main method

        Pair[] pairs = NewCollection.GetIndex(array, intQuery);

        List<int> minWindow = new List<int>();
        for (int i = 0; i <pairs.Length - 1; i++)
        {
            List<int> first = pairs[i].Position;
            List<int> second = pairs[i + 1].Position;
            int? temp = null;
            int? temp1 = null;
            foreach(int m in first)
            {
                foreach (int n in second)
                {
                    if (n > m)
                    {
                        temp = m;
                        temp1 = n;
                    }                        
                }                    
            }
            if (temp.HasValue && temp1.HasValue)
            {
                if (!minWindow.Contains((int)temp))
                    minWindow.Add((int)temp);
                if (!minWindow.Contains((int)temp1))
                    minWindow.Add((int)temp1);
            }
            else
            {
                Console.WriteLine(" Bad Query array");
                minWindow.Clear();
                break;                    
            }
        }

        if(minWindow.Count > 0)
        {
         Console.WriteLine("Minimum Window is :");
         foreach(int i in minWindow)
         {
             Console.WriteLine(i + " ");
         }
        }
Pritam Karmakar