tags:

views:

349

answers:

4

Given an array with N elements. We know that one of those elements repeats itself atleast N/2 times . We dont know anything about the other elements . They may repeat or may be unique .

Is there a way to find out the element that repeats atleast N/2 times in a single pass or may be O(N) ... ???

P.S : No extra space is to be used .

+10  A: 

The Boyer-Moore Majority Vote Algorithm

lst = [1,2,1,2,3,1,3,3,1,2,1,1]
currentCount = 0
currentValue = lst[0]
for val in lst:
   if val == currentValue:
      currentCount += 1
   else:
      currentCount -= 1

   if currentCount == 0:
      currentValue = val
      currentCount = 1


print(currentValue)
st0le
It's pretty simple, you could easily implement it.
st0le
_pretty simple_ Care to explain to the rest of us?
Nikita Rybak
Added an answer with a simple implementation.
David Titarenco
@st0le: Your sample is not correct. You use the index i instead of the list value lst[i] in comparision as well as in assignment. Could you fix it?
harper
@harper, `i` contains the value itself...i guess, `i` convention threw you off. i'll rename it.
st0le
@st0le: Apologize please. Your right, the 'for' notation reminded to C++ syntax, that screw me.
harper
A: 

It doesn't seem possible to count anything without using extra space. You have to store atleast one counter somewhere. If you mean to say you cannot use more than O(n) space then it should be fairly easy.

One way would be to create a second list of only unique objects from the original list. Then, create a third list the same length as the second with a counter for the number of occurrences of each item in the list.

Another way would be to sort the list then find the largest contiguous section.

puddingfox
+1 - probably not the optimal solution, but O(n log n) for the sort-based solution is a good trade-off versus complexity of other methods.
Will A
You aren't trying to count. You are trying to find a number that occurs at least half the time.
MSN
+9  A: 

st0le answered the question, but here's a 5minute implementation:

#include <stdio.h>

#define SIZE 13

int boyerMoore(int arr[]) {
    int current_candidate = arr[0], counter = 0, i;
    for (i = 0; i < SIZE; ++i) {
        if (current_candidate == arr[i]) {
            ++counter;
            printf("candidate: %i, counter: %i\n",current_candidate,counter);
        } else if (counter == 0) {
            current_candidate = arr[i];
            ++counter;
            printf("candidate: %i, counter: %i\n",current_candidate,counter);
        } else {
            --counter;
            printf("candidate: %i, counter: %i\n",current_candidate,counter);
        }
    }
    return current_candidate;
}

void main() {
    int numbers[SIZE] = {5,5,5,3,3,1,1,3,3,3,1,3,3};
    printf("majority: %i\n", boyerMoore(numbers));
}

And here's a fun explanation (more fun than reading the paper, at least): http://userweb.cs.utexas.edu/~moore/best-ideas/mjrty/index.html

David Titarenco
Thanks! Quite beautiful idea. (BTW, you would surely get some more upvotes if explained concept to everybody in plain English. It's not difficult at all.)
Nikita Rybak
This algorithm satisfies the conditions of the question. However, one should keep in mind that it returns the potential majority item. If there is no majority, then the result is meaningless. Thus, if you are unsure, you have to loop a second time and see how many times that element actually appears.
konforce
The question postulates that `we know that one of those elements repeats itself *at least* N/2 times` so if the data is well-formed, the algorithm will work every time.
David Titarenco
+1 for a link to an excellent explanation of a brilliant algorithm I've never seen before.
Mark Rushakoff
Actually your algorithm will output the incorrect value for inputs such as {1, 1, 2, 3}. This is trivial to fix though.
Nabb
@Nabb - Moore addresses the caveat: `Note that if you replaced the first C with an A, above, the algorithm would still end with C being chosen, but in fact C would not be the majority element: there is no majority in the modified list.In some situations you know, or assume, there is a majority element.But if you must confirm that the chosen element is indeed the majority element, you must take one more linear pass through the data to do it. `
David Titarenco
@David - Sorry, misinterpreted the question (specifically the "repeats itself" part).
Nabb
@David, it's possibly incorrect, when assigining a new candidate, your count should be assigned `=1`
st0le
@st0le, thanks for the catch, fixed :)
David Titarenco
+13  A: 

As the other users have already posted the algorithm, I won't repeat that. However, I provide a simple explanation as to why it works:

Consider the following diagram, which is actually a diagram of unpolarized light:

arrows radiating from the centre

Each arrow from the centre represents a different candidate. Imagine a point somewhere on an arrow representing the counter and candidate. Initially the counter is at zero, so it begins in the centre.
When the current candidate is found, it moves one step in the direction of that arrow. If a different value is found, the counter moves one step towards the centre.
If there is a majority value, more than half of the moves will be towards that arrow, and hence the algorithm will end with the current candidate being the majority value.

Nabb
+1 beautiful! '
Nikita Rybak
+1 Nicely done.
codaddict