I often come across a scenario where I have two collections of objects (either arrays or instances of an IteratorAggregate class) and need to diff the two lists.

By diff, I mean:

  • Detect duplicate objects (logic for detecting duplicates would vary case-by-case)
  • Add new objects
  • Remove objects that aren't in the other list

Essentially, I'm looking for something like array_diff that works with objects. Until now, I've been rewriting the same logic for each type of collection. Since the conditions for duplicate objects differ from case to case, there's no single solution. But is there a common pattern or abstraction that people have found to be an elegant way to deal with this?

+2  A: 

spl_object_hash() will help you determine whether two variables refer to the same object instance.
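
A quick sketch of what "the same" means here: the hash identifies an object instance, not its contents. Two separately constructed objects with identical properties get different hashes, while two variables pointing at one instance share a hash. The variable names below are purely illustrative.

```php
<?php
$a = new stdClass; $a->prop = 1;
$b = new stdClass; $b->prop = 1; // same contents, different instance
$alias = $a;                     // second handle to the same instance

// Identity check: true only when both variables hold the same instance.
$sameInstance = spl_object_hash($a) === spl_object_hash($alias);
$equalButDistinct = spl_object_hash($a) === spl_object_hash($b);

var_dump($sameInstance);      // bool(true)
var_dump($equalButDistinct);  // bool(false)
var_dump($a == $b);           // bool(true): == compares property values
```

So the hash is useful for de-duplicating instances, but it will not treat two semantically equal objects as duplicates.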

John Conde
I didn't know about that, so thanks. Problem is, I'm trying to compare objects from an ORM, so while objects might be semantically equal, they all have unique ids in their properties, and would produce different hash values.
Bryan M.
+1  A: 

Since PHP 5.2 there is a native object collection, SplObjectStorage. From the PHP manual:

The SplObjectStorage class provides a map from objects to data or, by ignoring data, an object set. This dual purpose can be useful in many cases involving the need to uniquely identify objects.

Example

$obj1 = new StdClass; $obj1->prop = 1;
$obj2 = new StdClass; $obj2->prop = 2;
$obj3 = new StdClass; $obj3->prop = 3;
$obj4 = new StdClass; $obj4->prop = 4;
$obj5 = new StdClass; $obj5->prop = 5;

$collection1 = new SplObjectStorage;
$collection1->attach($obj1);
$collection1->attach($obj2);
$collection1->attach($obj3);

$collection2 = new SplObjectStorage;
$collection2->attach($obj3);
$collection2->attach($obj4);
$collection2->attach($obj5);   

SplObjectStorage implements Countable, Iterator, Traversable, Serializable and ArrayAccess (since 5.3), so you can iterate over it as easily as over any other Traversable. The same object cannot appear twice in an SplObjectStorage when it is used as an object set. You can easily compare two collections with the following function:

function collection_diff(SplObjectStorage $c1, SplObjectStorage $c2)
{
    $diff = new SplObjectStorage;
    foreach($c1 as $o) {
        if(!$c2->contains($o)) {
            $diff->attach($o);
        }
    }
    return $diff;
}

Of course, you can adjust this to use a custom comparison. Usage is simple:

$diff = collection_diff($collection1, $collection2);
var_dump( $diff ); // will contain $obj1 and $obj2
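
One way to make that adjustment, sketched under the assumption that semantic equality can be reduced to a comparable key: pass a callback that maps each object to a string or integer key, and diff on those keys instead of on instance identity. The function name collection_diff_by and the $keyOf callback are hypothetical names for this illustration, not part of SPL.

```php
<?php
/**
 * Returns the objects of $c1 whose key does not occur in $c2.
 * $keyOf is a caller-supplied callback (hypothetical) that maps an
 * object to a string or integer key defining "the same" for this case.
 */
function collection_diff_by($c1, $c2, $keyOf)
{
    // Index the keys of the second collection for constant-time lookups.
    $seen = array();
    foreach ($c2 as $o) {
        $seen[call_user_func($keyOf, $o)] = true;
    }

    // Keep every object of the first collection whose key was not seen.
    $diff = array();
    foreach ($c1 as $o) {
        if (!isset($seen[call_user_func($keyOf, $o)])) {
            $diff[] = $o;
        }
    }
    return $diff;
}

// $x2 and $y2 are distinct instances that are semantically equal.
$x1 = new stdClass; $x1->prop = 1;
$x2 = new stdClass; $x2->prop = 2;
$y2 = new stdClass; $y2->prop = 2;
$y3 = new stdClass; $y3->prop = 3;

$byProp = function ($o) { return $o->prop; };

// Objects only in the first list, and objects only in the second.
$removed = collection_diff_by(array($x1, $x2), array($y2, $y3), $byProp);
$added   = collection_diff_by(array($y2, $y3), array($x1, $x2), $byProp);
```

Because it accepts anything foreach can iterate, this should work for plain arrays as well as SplObjectStorage or IteratorAggregate instances. For ORM entities, the callback could return whatever fields define a duplicate in that case, joined into a single string.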

Gordon
Great answer, thanks for all the examples. In the end, I just went with object hashing. I'll definitely look into developing a more robust solution down the road.
Bryan M.