views:

242

answers:

4

Hi all, I'm currently trying to verify whether or not, given an unsorted array A of length N and an integer k, whether there exists some element that occurs n/k times or more.

My thinking for this problem was to compute the mode and then compare this to n/k. However, I don't know how to compute this mode quickly. My final result needs to be n*log(k), but I have no idea really on how to do this. The quickest I could find was n*k...

+5  A: 

Use a hash table to count the frequency of each value:

uint[int] counts;
foreach(num; myArray) {
     counts[num]++;
}

int mostFrequent;
uint maxCount = 0;
foreach(num, count; counts) {
    if(count > maxCount) { 
        mostFrequent = num;
        maxCount = count;
    }
}
dsimcha
I was going to suggest this as well. It's a O(N) algorithm.
Welbog
if you only want to know if n/k is exceeded or only need the first element and not all elements that exceed n/k you can add a check in the first loop and terminate when the count exceeds n/k for any element, optionally saving that element if needed.
tvanfosson
This is the right type of answer for the question I asked. However, I forgot to mention I cannot assume the existence of a perfect hash question, but to the question I asked, this is the correct answer.
samoz
+1  A: 

Just walking the array and keeping counts in a hash/dictionary (and returning true once n/k is found, else false) would be O(n)

edit, something like:

counts = {}
for n in numbers:
    if ( counts.has_key( n ) ):
        counts[ n ] += 1
    else:
        counts[ n ] = 1
    if ( counts[ n ] >= n / k ):
        return true
return false
Chris J
A: 

Pseudocode:

 found = false
 value = null
 B = new hashtable
 for (i =0, j = A[i]; i < |A| and !found; ++i, j=A[i])
    if B contains key j
       B[j] = B[j] + 1
       if B[j] > |A|/k
          found = true
          value = j
       endif
    else 
       B[j] = 1
    endif
 end for

Assuming that your hashtable implementation has O(1) insert/lookup this should be O(n)

tvanfosson
+1  A: 

Set m = n/k rounded up. Do a quicksort, but discard sublists of length less than m.

Like quicksort, you can have bad luck and repeatedly choose pivots that close to the ends. But this has a small probability of happening, if you choose the pivots randomly.

There'll be O(log(k)) levels to the recursion, and each level takes O(n) time.

Derek Ledbetter