I got some ideas from earlier posts that talk about making a hash value for each $array[$i] and then comparing the hashes to get the unique arrays, but I don't know exactly what to do.

My sample array data:

$arr[] = array(0,1,2,3);
$arr[] = array(4,5,2,1);
$arr[] = array(0,0,0,0);
$arr[] = array(0,1,2,3);

I expect it to return:

$arr[] = array(0,1,2,3);
$arr[] = array(4,5,2,1);
$arr[] = array(0,0,0,0);

Can anyone post a function for this purpose? Many thanks!

A: 
foreach ($arr as $key => $value)
{
    foreach ($arr as $key2 => $value2)
    {
        if ($value2 == $value && $key != $key2)
        {
            // another entry holds the same inner array, so drop this one
            unset($arr[$key]);
        }
    }
}

It isn't the most elegant method, but it does what you need (note that it keeps the last occurrence of each duplicate and preserves the original keys). The problem is that you can't use array_unique() recursively.

Here is another way, taken from the PHP.net documentation comments (there are great code snippets in there):

function arrayUnique($myArray)
{
    if (!is_array($myArray))
        return $myArray;

    // serialize each inner array so it can be compared as a plain string
    foreach ($myArray as &$myvalue) {
        $myvalue = serialize($myvalue);
    }

    // array_unique() works fine on the serialized strings
    $myArray = array_unique($myArray);

    // turn the surviving strings back into arrays
    foreach ($myArray as &$myvalue) {
        $myvalue = unserialize($myvalue);
    }

    return $myArray;
}
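
For the sample data in the question, a call like this should return the three distinct inner arrays (array_unique() keeps the first occurrence, so the duplicate at key 3 is dropped):

$arr = arrayUnique($arr);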
Chacha102
A: 

Here's another idea. Again, not terribly elegant, but it might be pretty fast. It's similar to the second part of Chacha102's answer, although it would be faster, provided you only have integer values in the sub-arrays.

// implode the sub arrays
$tmpArray = array();
foreach ($arr as $key => $array) {
    $tmpArray[$key] = implode(',', $array);
}

// get only the unique values
$tmpArray = array_unique($tmpArray);

// explode the values back into arrays (note: explode() returns strings, so the numbers come back as strings)
$arr = array();
foreach ($tmpArray as $key => $string) {
    $arr[$key] = explode(',', $string);
}
Darryl Hein
A: 

Hashing is a good idea; it makes the solution O(n) on average.

Basically, you iterate through $arr, make a hash of each inner array, and compare it against the hashes you've already seen (this is O(1) using isset(), or O(m) to be precise, where m is the number of elements in the inner array). If there is a collision, you compare the actual array elements. Usually a collision means you've seen that array before and it's a duplicate, but that's not guaranteed. Here is some pseudo-PHP that implements this algorithm:

function mkhash($array = array()) {
    $hash = "";
    foreach ($array as $element) {
        $hash .= md5($element);
    }
    return $hash;
}

$seen = array();
$newArray = array();
foreach ($arr as $elementArray) {
    $hash = mkhash($elementArray);
    if (!isset($seen[$hash])) {
        // first time this hash appears: keep the array and remember it
        $newArray[] = $elementArray;
        $seen[$hash] = $elementArray;
    } else if (count(array_diff($elementArray, $seen[$hash])) > 0) {
        // two different arrays hashed to the same value (a genuine collision)
        $newArray[] = $elementArray;
    }
}

The hashing method is harder to implement, and dealing with collisions properly is tricky, so there is also an O(n log n) alternative.

The O(n log n) way of doing this would be to sort the array:

array_multisort($arr); // O(n log n); array_multisort() sorts $arr in place and returns a boolean

Then all you have to do is compare adjacent arrays to see if they are duplicates.
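
A minimal sketch of that comparison step, assuming the sort has already placed identical inner arrays next to each other, could look like this:

$unique = array();
$prev = null;
foreach ($arr as $inner) {
    // duplicates sit next to each other after sorting, so only the previous row needs checking
    if ($prev === null || $inner != $prev) {
        $unique[] = $inner;
    }
    $prev = $inner;
}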

Of course you can simply use the O(n^2) approach and compare each inner array with every other inner array...

EDIT: Oh, and here's another O(n) idea: you can recursively build a trie using array keys that map to other arrays, so you end up with an m-level-deep array, where m is the length of the longest inner array. Each branch of the trie represents a unique inner array. Of course, you would have to write some overhead code to convert the trie back into a 2D array, so you won't see the performance benefit until the cardinality of your input is very large.
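
A rough sketch of that trie idea (the function name uniqueViaTrie and the '#end' leaf marker are just made up for illustration, and it assumes the inner values are integers or strings so they can be used as array keys):

function uniqueViaTrie(array $arrays) {
    $trie = array();
    $unique = array();
    foreach ($arrays as $inner) {
        $node = &$trie;
        foreach ($inner as $value) {
            if (!isset($node[$value])) {
                $node[$value] = array();
            }
            $node = &$node[$value]; // walk one level deeper per element
        }
        if (!isset($node['#end'])) { // an unmarked leaf means this exact array hasn't been seen yet
            $node['#end'] = true;
            $unique[] = $inner;
        }
        unset($node); // drop the reference before the next iteration
    }
    return $unique;
}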

Charles Ma
A: 

It depends on whether you have the resources to keep the larger array in memory (basically, whether you want to allow only unique values in order to keep it from bloating during the loop, or whether you just need the final outcome to be an array of unique values).

For all examples, I assume you are getting the values to enter into the big array from some external source, like a MySQL query.

To prevent duplicates from being entered into the master array:

You could create two arrays, one with the values as a string, one with the values as actual array values.

while ($row = $results->fetch_assoc()) {
    $value_string = implode(",", $row);
    // only add the row if its string form hasn't been seen before
    if (!in_array($value_string, $check_array)) {
        $check_array[] = $value_string;
        $master_array[] = $row;
    }
}

In the above, it just checks whether the string version of the current data set is already among the string versions iterated through so far. You end up with a bit more overhead from keeping two arrays, but neither ever gets duplicate values.

Or, as I'm sure has already been mentioned, there is array_unique(), which you apply after all the data has been entered. Modifying the above example, you get:

while ($row = $results->fetch_assoc()) {
    $master_array[] = $row;
}
// SORT_REGULAR makes array_unique() compare the rows as arrays instead of casting them to strings
$master_array = array_unique($master_array, SORT_REGULAR);
Anthony
Vote downs are very rude without comments. Thanks everybody.
Anthony
+2  A: 

Quick and simple:

$arr = array_map('unserialize', array_unique(array_map('serialize', $arr)));
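
Applied to the sample data from the question, this should leave keys 0, 1 and 2 and drop the duplicate at key 3, since array_unique() keeps the first occurrence and preserves its key:

print_r($arr); // array(0,1,2,3), array(4,5,2,1) and array(0,0,0,0) remain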
Alix Axel
nice and clean, less code to break
Phill Pafford
What a nice one!!
Jay
Thank you, you're welcome. =)
Alix Axel