views:

123

answers:

7

I'm looking for an efficient algorithm for detecting equal values in an array of integers N size. It must return the indices of the matches.

Alas, I can't think of anything more clever then brute force with two loops.

Any help will be appreciated. Thanks!

+2  A: 

You could intersect the array. This finds all the values of array2 that are in array1

$array1 = array("a" => "green", "b" => "brown", "c" => "blue", "red");
$array2 = array("a" => "green", "yellow", "red");
$result_array = array_intersect_assoc($array1, $array2);
print_r($result_array);

Would return

Array
(
    [a] => green
)

It returns an array with all of the keys and values of the matches. Basically you can provide an infinite number of arguments to the array_insert_assoc:

array_intersect_assoc($base_array, $arr1, $arr2 ...);

It will search $base_array for the values that are in all the subsequent arrays. That means that the key and value will be taken from the $base_array

You could also compare the keys by using:

array_intersect_keys($base_array, $arr1, $arr2, $arr3);
Chacha102
not sure, i mentioned that. but regardless, do you know if that can return the indexes of the matches found?
Nona Urbiz
Yes. `array_intersect_assoc` will.
Chacha102
A: 

Perhaps this?

$arr = array_map('unserialize', array_unique(array_map('serialize', $arr)));

From the question: How to remove duplicated 2-dimension array in PHP?

if ($arr !== array_map('unserialize', array_unique(array_map('serialize', $arr))))
{
    // found duplicates
}
Alix Axel
A: 

You don't have to go through all the array again for each element. Only test an element with the subsequent element in the array:

$array = /* huge array */;
$size = count($array);
for($i = 0; $i < $size; $i++)
{
    for($j = $i + 1; $j < $size; $j++) // only test with the elements after $i
    {
        if($array[$i] == $array[$j])
            return true; // found a duplicate
    }
    return false; // found no duplicate
}

That's the most efficient way I can think of. Adapt it to your need as you will.

zneak
does cut down time a little time, but its still in O(n^2)
Anurag
+1  A: 

These loops are O(N^2). Is N big? If so, can you sort the array O(NlogN), then scan it O(N)? ... or am I missing something?

Mike Dunlavey
And if you pair each element with its original array index and sort them together, you can recover the starting array indices for any elements that match.
Jim Lewis
@Jim: that space stuff you work on sounds cool.
Mike Dunlavey
A: 

If one of your arrays is reasonably static (that is you are comparing to the same array several times ) you could invert it.

That is set up another array which is keyed by value and returns the index into the real array.

$invert = array();
foreach ($cmptoarray as $ix => $ival) {
   $invert[$ival] = $ix;
}

Then you simply need an if ( isset($invert[$compfrmarray[$i]) ) .... to check the number.

Note: this is only worth doing if you compare against the same array several times!

James Anderson
A: 

You can use a set to hold the recent values. For example,

results = empty list
set = empty set
foreach key, val in array:
   if val is not in set: add val to set
   else: add key to results
return results

Each look up of set is O(1), so this algo will results in O(n) instead of O(n^2) if nested-loop is used.

In case you want to keep track of multi-occurence like this array 1, 2, 3, 3, 2, 1 you can use a hash table with key is the value and value (of the corresponding key in table) is the list of indices. The result for the given array will look lik {1:0, 5; 2: 1, 4; 3: 2, 3}.

results = empty hashtable
for each key, val in array:
    if val is not in results:
        results[val] = new list()
    results[val].append(key)
return results
Set lookups and insertions are generally O(log N).
Jim Lewis
A: 

Just use an associative array mapping a value to its index:

foreach($array1 as $index => $value) {
    $aa[$value] = $index;
}

foreach($array2 as $index => $value) {
    if(isset($aa[$value])) {
        echo 'Duplicate:  .  Index 1:  '.$aa[$value].'  Index 2:  '.$index.'.';
    }
}
dsimcha