I got some ideas from earlier posts that talk about making a hash value for each $array[$i] and then comparing the hashes to get the unique arrays, but I don't know exactly what to do.

My sample array data:

$arr[] = array(0,1,2,3);
$arr[] = array(4,5,2,1);
$arr[] = array(0,0,0,0);
$arr[] = array(0,1,2,3);

I expect it to return:

$arr[] = array(0,1,2,3);
$arr[] = array(4,5,2,1);
$arr[] = array(0,0,0,0);

Can anyone post a function for this purpose? Many thanks!

A: 
foreach ($arr as $key => $value)
{
    foreach ($arr as $key2 => $value2)
    {
        if ($value2 == $value && $key != $key2)
        {
            // another entry holds the same inner array, so drop this one
            unset($arr[$key]);
        }
    }
}

It isn't the most elegant method, but it does what you need (note that it keeps the last occurrence of each duplicate and preserves the original keys). The problem is that you can't use array_unique() recursively.

Here is another way, taken from the PHP.net documentation comments (there are great code snippets in there):

function arrayUnique($myArray)
{
    if (!is_array($myArray))
        return $myArray;

    // serialize each inner array so it can be compared as a plain string
    foreach ($myArray as &$myvalue) {
        $myvalue = serialize($myvalue);
    }

    // array_unique() works fine on the serialized strings
    $myArray = array_unique($myArray);

    // turn the surviving strings back into arrays
    foreach ($myArray as &$myvalue) {
        $myvalue = unserialize($myvalue);
    }

    return $myArray;
}
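
For the sample data in the question, a call like this should return the three distinct inner arrays (array_unique() keeps the first occurrence, so the duplicate at key 3 is dropped):

$arr = arrayUnique($arr);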
Chacha102
A: 

Here's another idea. Again, not terribly elegant, but it might be pretty fast. It's similar to the second part of Chacha102's answer, although it would be faster, provided you only have integer values in the sub-arrays.

// implode the sub arrays
$tmpArray = array();
foreach ($arr as $key => $array) {
    $tmpArray[$key] = implode(',', $array);
}

// get only the unique values
$tmpArray = array_unique($tmpArray);

// explode the values back into arrays (note: explode() returns strings, so the numbers come back as strings)
$arr = array();
foreach ($tmpArray as $key => $string) {
    $arr[$key] = explode(',', $string);
}
Darryl Hein
A: 

Hashing is a good idea; it makes the solution O(n) on average.

Basically, you iterate through $arr, make a hash of each inner array, and compare it against the hashes you've already seen (this is O(1) using isset(), or O(m) to be precise, where m is the number of elements in the inner array). If there is a collision, you compare the actual array elements. Usually a collision means you've seen that array before and it's a duplicate, but that's not guaranteed. Here is some pseudo-PHP that implements this algorithm:

function mkhash($array = array()) {
    $hash = "";
    foreach ($array as $element) {
        $hash .= md5($element);
    }
    return $hash;
}

$seen = array();
$newArray = array();
foreach ($arr as $elementArray) {
    $hash = mkhash($elementArray);
    if (!isset($seen[$hash])) {
        // first time this hash appears: keep the array and remember it
        $newArray[] = $elementArray;
        $seen[$hash] = $elementArray;
    } else if (count(array_diff($elementArray, $seen[$hash])) > 0) {
        // two different arrays hashed to the same value (a genuine collision)
        $newArray[] = $elementArray;
    }
}

The hashing method is harder to implement, and dealing with collisions properly is tricky, so there is also an O(n log n) alternative.

The O(n log n) way of doing this would be to sort the array:

array_multisort($arr); // O(n log n); array_multisort() sorts $arr in place and returns a boolean

Then all you have to do is compare adjacent arrays to see if they are duplicates.
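
A minimal sketch of that comparison step, assuming the sort has already placed identical inner arrays next to each other, could look like this:

$unique = array();
$prev = null;
foreach ($arr as $inner) {
    // duplicates sit next to each other after sorting, so only the previous row needs checking
    if ($prev === null || $inner != $prev) {
        $unique[] = $inner;
    }
    $prev = $inner;
}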

Of course you can simply use the O(n^2) approach and compare each inner array with every other inner array...

EDIT: Oh, and here's another O(n) idea: you can recursively build a trie using array keys that map to other arrays, so you end up with an m-level-deep array, where m is the length of the longest inner array. Each branch of the trie represents a unique inner array. Of course, you would have to write some overhead code to convert the trie back into a 2D array, so you won't see the performance benefit until the cardinality of your input is very large.
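
A rough sketch of that trie idea (the function name uniqueViaTrie and the '#end' leaf marker are just made up for illustration, and it assumes the inner values are integers or strings so they can be used as array keys):

function uniqueViaTrie(array $arrays) {
    $trie = array();
    $unique = array();
    foreach ($arrays as $inner) {
        $node = &$trie;
        foreach ($inner as $value) {
            if (!isset($node[$value])) {
                $node[$value] = array();
            }
            $node = &$node[$value]; // walk one level deeper per element
        }
        if (!isset($node['#end'])) { // an unmarked leaf means this exact array hasn't been seen yet
            $node['#end'] = true;
            $unique[] = $inner;
        }
        unset($node); // drop the reference before the next iteration
    }
    return $unique;
}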

Charles Ma
A: 

It depends on whether you have the resources to keep the larger array in memory (basically, whether you want to allow only unique values in order to keep it from bloating during the loop, or whether you just need the final outcome to be an array of unique values).

For all examples, I assume you are getting the values to enter into the big array from some external source, like a MySQL query.

To prevent duplicates from being entered into the master array:

You could create two arrays, one with the values as a string, one with the values as actual array values.

while ($row = $results->fetch_assoc()) {
    $value_string = implode(",", $row);
    // only add the row if its string form hasn't been seen before
    if (!in_array($value_string, $check_array)) {
        $check_array[] = $value_string;
        $master_array[] = $row;
    }
}

In the above, it just checks whether the string version of the current data set is already among the string versions iterated through so far. You end up with a bit more overhead from keeping two arrays, but neither ever gets duplicate values.

Or, as I'm sure has already been mentioned, there is array_unique(), which you apply after all the data has been entered. Modifying the above example, you get:

while ($row = $results->fetch_assoc()) {
    $master_array[] = $row;
}
// SORT_REGULAR makes array_unique() compare the rows as arrays instead of casting them to strings
$master_array = array_unique($master_array, SORT_REGULAR);
Anthony
Vote downs are very rude without comments. Thanks everybody.
Anthony
+2  A: 

Quick and simple:

$arr = array_map('unserialize', array_unique(array_map('serialize', $arr)));
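
Applied to the sample data from the question, this should leave keys 0, 1 and 2 and drop the duplicate at key 3, since array_unique() keeps the first occurrence and preserves its key:

print_r($arr); // array(0,1,2,3), array(4,5,2,1) and array(0,0,0,0) remain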
Alix Axel
nice and clean, less code to break
Phill Pafford
What a nice one!!
Jay
Thank you, you're welcome. =)
Alix Axel