views:

1640

answers:

14

This is one of an interview question which I had recently. I would like to know others perception of approach for this problem.

Question:

You are given a structure which holds an employee details with two elements as int Deptartment and string Name.

struct Employee
{ 
    string Name;
    int Dept;
}

You are given details of N Employees among which there are N/2 employees Dept are 0's and N/2 employees dept are 1's arranged in some random order. You need to sort the employee details based on their Dept value and it should be stable i.e., the order of 1s and 0s in the original record should be maintained.

For example, given the following sample data:

Name         Dept

X1           0
X2           1
X3           0
X4           1
X5           0

after sorting the result should be:

Name         Dept

X2           1
X4           1
X1           0
X3           0
X5           0

The algorithm should be stable and the complexity should be o(N), with constant space for additional variables (which means sorting should be done in-place).

+15  A: 

Allocate a second array (O(N)). Iterate through first array and move all 1's in the order they appear to the second array. Iterate again and move the 0's that are left in the same order to the second array. All operations O(N). This is NOT in situ (in place) solution. A non-stable in situ solution is obtained by running the Quicksort partitioning algorithm once.

After conducting some research, it seems that the known O(N) solutions without any extra memory are not stable. There is academic research on efficient 0-1 stable sorting in situ (in place), but the solutions require some extra memory. I wonder if the original problem statement was not reproduced in an exact fashion. Without stability requirement the problem is very easy; it is also easy without in situ requirement. With BOTH of the requirements (in situ, stable) the solution seems to be elusive.

Among the answers here there is an algorithm that works in O(N) and is in situ, but only if the key field is (1) mutable and (2) can contain an integer instead of a single bit. This works but is not in situ 0-1 stable sorting, because it is assumed that there is O(log N) writable memory available per array element.

antti.huima
Good call if you have the latitude to create a new data structure. Sometimes the requirement is 'in place' but that isn't mentioned, so this is the best option, IMO.
Adam Davis
Yes .. the requirement is in place ..
Ganesh M
Since the problem defines the exact number of zeroes and ones in the data (N/2), you should be able to do the transfer in one pass.
e.James
eJames: how does the known count help? You can always count the number of elements in an extra O(N) pass?
antti.huima
When the count is known to be N/2, you know that the first 0 belongs in slot N/2, the second one in slot N/2+1, etc. So you can handle the 0's in the first pass along with the 1's
Michael Borgwardt
@Michael What you say is true but it does not lead to an in situ sorting algorithm, at least not in any obvious manner. Say you find the first 1 and you know it belongs to cell N/2; now you want to position this 1 in the cell N/2, and you need to swap the object currently in the cell out---but to know where THAT object needs to go you need to count all objects between 0 and N/2!
antti.huima
+2  A: 

The original problem text didn't mention any other fields except the integer (has been edited since then).

In such case stability has no sense since two equal numbers are otherwise indistinguishable. The solution is to just traverse the array and put 1's n/2 times and then 0's n/2 times.

sharptooth
No it is ..Just think the input is a structure which holds a name and a int value. then the order should be maintained.
Ganesh M
There was no 'structure' in the problem.
sharptooth
Stability *only* makes sense when you have equal keys.
Bill the Lizard
@Bill: Yes, but not when you have *only* keys and no satellite data (values). This is a perfect answer to the question as asked.
ShreevatsaR
@ShreevatsaR: That's true, but I don't think that would get you out of solving the problem in an interview. An interviewer is free to say "Hypothetically assume there is sattelite data. How would you solve the problem?"
Bill the Lizard
What exactly is satellite data?
Organiccat
@Organiccat: It's the other data that need to be kept in order. Like if you have a bunch of Student{Name:Age:Gender} records in order by Name, and you want to order them by Gender, but preserve the Name order.
Bill the Lizard
The problem text has been changed to reflect that the 1s and 0s are indeed distinguishable. I propose to delete this answer.
Svante
I've edited the text to reflect that it's outdated but otherwise relevant.
sharptooth
A: 

In abstract terms, you could use a modified insertion sort, swapping all zero values to the right and all one values to the left. That would have complexity greater than O(n), however.

I'm slightly confused why ordering matters, as posted previously, bytes are indistinguishable.

EDIT: The new example helps. The benefit of my algorithm, even though it's slower than linear time in the worst case, is that it's only O(n) in memory usage. Since the ones and zeros are only keys for larger objects (which may be arbitrarily large), memory may be a concern.

perimosocordiae
I guess it's a question to test if an interviewee knows waht stability is. Interview questions often have some strange limitations just to see if the interviewee is knowledgeable enough to at least ask questions.
sharptooth
A: 

Easy:

function sort(array):
    new_array = new Array(array.length)
    x = 0
    for(i : array): if(i): new_array(x++) = 1
    return new_array

And if you actually want to have the same 1s and 0s:

function sort(array):
    new_array = new Array(array.length)
    x, y = 0, array.length / 2
    for(i : array):
        if(i): new_array(x++) = i
        else:  new_array(y++) = i
    return new_array
dsm
+6  A: 

using std::stable_partition together with std::equal_to and std::binder1st should do the trick in a nice, functional, STL-like way:

using namespace std
stable_partition(&array[0], &array[N], binder1st(equal_to(), 1));

Of course, this assumes that the elements of array have some comparison operator defined (i.e. you can say array[i]==1...). If they are just integers, it wouldn't make any sense to maintain the order ...

As to complexity: In order to be O(N), stable_partition needs extra memory. If the algorithm fails to allocate that extra memory, it performs in O(N log N).

MartinStettner
+5  A: 

Okey, here is my approach.

e.g a[] = { 1,0,0,0,1,1,1,0,0,1};

Pseudocode:

  1. Have two counters, count1 = 0 and count2 = (n/2)+1
  2. Traverse through the array,

    if(arr[ i ] == 1) 
    { 
        arr[ i ] = count1++;
    } else { 
        arr[ i ] = count2++ 
    };
    
  3. At the end of the traversal, you have array filled with numbers 0 to n-1 like:

    a[ ] = { 0, 5, 6, 7, 1, 2, 3, 8, 9 4}
    
  4. Now the problem comes to sort the above resultant array, this can be done in O(N) as below:

    for(j = 0; j <= 1; j++)  
    {
        for(i = 0; i<n; i++)  
        {  
            if(arr[ i ] != i)  
            {  
                swap(arr[ i ], arr[ arr[ i ] ]);  
            }  
        }  
    }
    

    Note: j loop runs only twice irrespective on 'n' and has constant complexity. The order of this whole loop is 2*n = O(n).

  5. After the array is sorted, Again traverse through the array and set elements arr[0] to arr[n/2] to '1' and arr[(n/2)+1] to arr[n] as '0'.

Space complexity is constant and time complexity is O(step2) + O(step4) + O(step5) = n + 2n +n = 4*n = O(n).

Ganesh M
Step 4 is not O(N). Consider the input { 1,2,3,4,5,6,7,8,9,0 }
Sanjaya R
STEP4 should be for i in range( len( arr ) ): while arr[i] != i: swap( arr[i], arr[arr[i]] )
Sanjaya R
Step 3 would be "a[] = {0, 6, 7, 8, 1, 2, 3, 9, 10, 4}"
James McMahon
@Sanjaya, Looking at it, I think the input { 1,2,3,4,5,6,7,8,9,0 } would be be impossible to get if you followed Step 2.
James McMahon
This is pretty clever Ganesh, I can't poke any holes in it.
James McMahon
This approach allocates extra memory and is not in situ sort, because the actual bits might be stored in 1-bit wide bitfields.
antti.huima
Except I think you want to assign count2 to (n/2), + 1 is going to walk you off the array bounds, if your array accessor counts from zero.
James McMahon
@anti.huima, I'm dense, could you elaborate, because I am not seeing it?
James McMahon
Oh your saying because the number could be a bit, you couldn't assign the place values listed here, and thus would need to allocate extra memory. What if the data with the bit fields had an additional field, i.e. name, dept, place. You could place the int in the place field and do the second sort
James McMahon
@nemo Yes, assuming that, but originally the problem was posed as sorting bits, and the author rewrote it because there were (useless) comments about "bits not being distinguishable". 0-1 sorting is well defined research topic, and constant space means NO extra O(N) space.
antti.huima
That makes sense with bits. But with ints I don't see any flaws in this implementation.
James McMahon
Can someone prove that the two-pass sorting works in every case that step 2 can produce?
Svante
@Ganesh: antti.huima is correct: although you are not using an entire copy of the original array of structures, you are nevertheless using O(n) additional space to hold arr[].
j_random_hacker
@Svante: I also stumbled upon this, but it's not difficult: After the first pass, all elements in A[0-N/2] are on their place, since all have been treated by the inner loop and therefore been swapped. After he second pass, the same is true for the remaining elements.
MartinStettner
Sort is done in-place, but space is not constant. -1
ashawley
+2  A: 

Used ints instead of bits for simplicity but the underlying concept is the same. Not that the order the different 1s and 0s end up in matters!

var emps = new[] 
           {
               new Employee(X1, 0),
               new Employee(X2, 1),
               new Employee(X3, 0),
               new Employee(X4, 1),
               new Employee(X5, 0),
               new Employee(X6, 1)
           };

var sortedEmps = new Employee[bits.Length];

var oneIndex = 0;
var zeroIndex = bits.Length/2;

foreach (var employee in employees)
{
    if (employee.Dept  == 1)
        sortedEmps[oneIndex++] = employee;
    else
        sortedEmps[zeroIndex++] = employee;
}

Updated to do the employees problem. Added an extra employee as the original question said there were N/2 of each so there has to be an even number for that to be true. Otherwise it's the same.

Not sure this compiles now so treat it as pseudo code!

Garry Shutler
Double memory but single pass I think. Makes it O(n) if my bad knowledge of O notation is correct.
Garry Shutler
A: 

What the heck, here is the whole solution: arr is list of items, item.id is 0 or 1, stored as int.
This code moves the 0s to the front.

count = { 0:0, 1:len(arr)/2 }
for ii in range(len( arr )):
  id = arr[ii].id
  arr[ii].id = count[id]
  count[id] += 1
for ii in range(len( arr )):
  while arr[ii].id != ii:
    T = arr[ii]
    arr[ii] = arr[arr[ii].id]
    arr[T.id] = T
for ii in range(len( arr )):
  arr[ii].id = (ii >= len(arr)/2)
Sanjaya R
I think this is correct an O(N), but is not in situ sort because the counters may require more memory than the actual bits.
antti.huima
Per the problem description, the field "Dept" is an int, so pretend arr[ii].id is arr[ii].Dept. Presto, no overhead.
Sanjaya R
Yes... per the current problem description. But originally the problem was presented as sorting a sequence of bits...
antti.huima
A: 

Here's my (complete) stab at it in C#. It generates the list and sticks random employees in, so make numberOfEmployees whatever you like. I realize there is probably a simpler way of doing this, because the original poster specified that there are an equal amount of 0-dept employees as 1-dept employees, but I couldn't help myself.

struct Employee
{
    public Employee(string name, int dept)
    {
        this.name = name;
        this.dept = dept;
    }

    public string name;
    public int dept;
}

class Program
{
    static void Main(string[] args)
    {
        int numberOfEmployees = 100;
        Random r = new Random();

        Employee[] emps = new Employee[numberOfEmployees];
        var empBuf = new Employee[numberOfEmployees];
        int nextAvail = 0;

        // Initialize array of employees with random data
        for (int i = 0; i < numberOfEmployees; i++)
        {
            emps[i] = new Employee("x" + i.ToString(), r.Next(0, 2));
        }

        Console.WriteLine("Old list:");
        foreach (var e in emps)
        {
            Console.WriteLine("Name: {0}, Dept: {1}", e.name, e.dept);
        }

        // throw employees with dept == 1 in first
        for (int i = 0; i < numberOfEmployees; i++)
        {
            if (emps[i].dept == 1)
            {
                empBuf[nextAvail] = emps[i];
                nextAvail++;
            }
        }

        // stick the employees with dept == 0 in last
        for (int i = 0; i < numberOfEmployees; i++)
        {
            if (emps[i].dept == 0)
            {
                empBuf[nextAvail] = emps[i];
                nextAvail++;
            }
        }


        Console.WriteLine("New list:");
        foreach (Employee e in empBuf)
        {
            Console.WriteLine("Name: {0}, Dept: {1}", e.name, e.dept);
        }

    }
}
Alex Fort
I think this is O(N), but feel free to correct me if I'm wrong.
Alex Fort
It is O(N) but does not do the sorting in-place.
antti.huima
A: 

My implementation in Perl using simple weighting.

use Modern::Perl;

my @data;
{
  use List::MoreUtils qw'natatime';
  my $iter = natatime 2, qw'
    X1  0
    X2  1
    X3  0
    X4  1
    X5  0
  ';

  while( my($a,$b) = $iter->() ){
    push @data, [$a,$b]
  }
}

my @sorted = sort {
  ($a->[1] <=> $b->[1]) * -2 + # gives more weight to second element
  ($a->[0] cmp $b->[0])
} @data;

say "Name\tDept\n";
say join "\t", @$_ for @sorted;
Name    Dept

X2      1
X4      1
X1      0
X3      0
X5      0

The output of <=> and cmp comparisons is a value of -1,0,1.
By multiplying one of the camparisons by two we get one of -2,0,2.
After adding the value of the other operator we get one of -3,-2,-1,0,1,2,3

I wanted to see how many comparisons it would do so I added some debugging information and came up with this.

 a           b        a1      b1     a0     b0
X1,0   ->   X2,1      0   --> 1      X1 <-  X2
X3,0   ->   X4,1      0   --> 1      X3 <-  X4
X2,1  <-    X4,1      1   -   1      X2 <-  X4
X4,1  <-    X1,0      1 <--   0      X4  -> X1
X1,0  <-    X3,0      0   -   0      X1 <-  X3
X2,1 <<-    X5,0      1 <--   0      X2 <-  X5
X5,0   ->>  X4,1      0   --> 1      X5  -> X4
X5,0   ->   X1,0      0   -   0      X5  -> X1
X5,0   ->   X3,0      0   -   0      X5  -> X3

The arrows point at the earlier value.
Brad Gilbert
Sorry, but this is not stable sort.
jpalecek
+1  A: 

That'll be first step of radix sort.

vartec
Right. Which proves you can do it in O(n) performance, but never in-place or with constant space complexity.
ashawley
+1  A: 

Here is solution that works for an array of int. You can modify it.

sort (int [] a) {
   int pos = 0;
   for (int i = 1; i < a.length; i++) {
       if (a[i] == 0 && a[pos] == 1) {
           swap(a, pos, i);   // this makes all 0's go to the left.
           pos++;
       }
   } 
}
fastcodejava
A: 
#include<stdio.h>
//#include<conio.h>

int main()
{
  int i,a[20]={0};
  int *ptr1,*ptr2;
  //clrscr();
  //a[2]=a[4]=a[6]=a[8]=a[16]=1;
  a[19]=a[18]=1;

  for(i=0;i<20;i++)
    printf("%d",a[i]);

  printf("\n\nafter");

  ptr1=ptr2=a;
  for(i=0;i<20;i++)
  {
    if(*ptr1==0&&*ptr2==1)
    {
      int temp=*ptr1;*ptr1=*ptr2;*ptr2=temp;
      ptr1++;ptr2++;
    }
    else if(*ptr1==1&&*ptr2==0)
    {
      ptr1++;ptr2++;
    }
    else if(*ptr1==0&&*ptr2==0)
    {
      ptr2++;
    }
    else
    {
      if(ptr1<ptr2)
        ptr1++;
      else
      {
        ptr1++;ptr2++;
      }
    }
  }

  for(i=0;i<20;i++)
  {
    printf("%d",a[i]);
  }

 // getch();

  return 0;
}
vijaikumar.m
A: 
Esh