Given two instances of a class, is it a good and reliable practice to compare them by serializing them first and then comparing the byte arrays (or possibly hashes of the arrays)? These objects might have complex hierarchical properties, but serialization should go as deep as required.

By comparison I mean the process of making sure that all properties of primitive types have equal values, properties of complex types have equal properties of primitive types, and so on. As for collection properties, they should be equal to each other: equal elements in the same positions:

{'a','b','c'} != {'a','c','b'}



 {new Customer{Id=2, Name="abc"}, new Customer {Id=3, Name="def"}} 
    !=
 {new Customer{Id=3, Name="def"}, new Customer {Id=2, Name="abc"}}

but

 {new Customer{Id=2, Name="abc"}, new Customer {Id=3, Name="def"}}
    ==
 {new Customer{Id=2, Name="abc"}, new Customer {Id=3, Name="def"}}

And by serialization I mean the standard .NET BinaryFormatter.

Thanks.

+1  A: 

You would have to define what equal means here even more precisely.

If one of the properties is a collection, there could be differences in the order (as a result of particular Add/Remove sequences) that may or may not be significant to you. Think about a Dictionary where the same elements were added in a different order. A collision might result in a different binary stream.
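To illustrate the ordering concern, here is a minimal sketch that separates "same content" from "same enumeration order" for two dictionaries. The class and method names (`DictionaryOrderDemo`, `SameContent`, `SameEnumerationOrder`, `Build`) are made up for this example, and it does not call BinaryFormatter at all; it only shows that two dictionaries can agree on content while the order in which their entries come out is not something you can rely on.

```csharp
using System.Collections.Generic;
using System.Linq;

static class DictionaryOrderDemo
{
    // True when both dictionaries hold exactly the same key/value pairs,
    // regardless of the order in which the pairs were added.
    public static bool SameContent(Dictionary<string, int> x, Dictionary<string, int> y)
    {
        if (x.Count != y.Count) return false;
        foreach (var pair in x)
        {
            int other;
            if (!y.TryGetValue(pair.Key, out other) || other != pair.Value) return false;
        }
        return true;
    }

    // True only when enumeration yields the same pairs in the same positions.
    // The enumeration order of Dictionary<TKey, TValue> is unspecified, so this
    // may return false for two dictionaries that SameContent considers equal.
    public static bool SameEnumerationOrder(Dictionary<string, int> x, Dictionary<string, int> y)
    {
        return x.SequenceEqual(y);
    }

    // Hypothetical helper: adds the keys in the given order, using the key length as value.
    public static Dictionary<string, int> Build(params string[] keysInInsertionOrder)
    {
        var d = new Dictionary<string, int>();
        foreach (var k in keysInInsertionOrder) d.Add(k, k.Length);
        return d;
    }
}
```

For `Build("abc", "de")` and `Build("de", "abc")`, `SameContent` returns true. Any comparison that depends on entry order, which likely includes a byte-level comparison of the serialized form, may still treat the two as different.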

Henk Holterman
+2  A: 

It's reliable if:

  1. Every single class in the graph is marked [Serializable]. This isn't as straightforward as it might sound; if you're comparing totally arbitrary objects, then there's a pretty good chance that there's something non-serializable in there that you lack control over.

  2. You want to know if the two instances are exactly the same. Keep in mind that BinaryFormatter is basically diving into the internal state of these objects, so even if they appear to be the same by virtue of public properties, they might not be. If you know for a fact that the graph is created the exact same way in each instance, maybe you don't care about this; but if not, there may be a number of hidden differences between the graphs.

    This point is also a more serious wrinkle than one might first suspect. What if you decide to swap out the class with an interface? You may have two instances behind that interface that, as far as you know, are exactly the same; they accomplish the same task and provide the same data. But they might be totally different implementations. What's nice about IEquatable<T> is that it's independent of the concrete type.

So this will work, for a significant number of cases, but I probably wouldn't consider it a good practice, at least not without knowing all the details about the specific context that it's being used in. At the very least, I would not rely on this as a generic comparison method for any two instances; it should be used only in specific cases where you know about the implementation details of the classes involved.

Of course, some might say that writing any code that depends upon the implementation details of a class is always a bad practice. My take on this is more moderate, but it's something to consider: relying on the implementation details of a class can lead to difficult-to-maintain code later on.
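For contrast with the byte-level approach, here is a rough sketch of the IEquatable route applied to the Customer example from the question. The `Customer` shape is taken from the question; the `CustomerLists.SameSequence` helper is a hypothetical name, and the hash code is deliberately simplistic. It relies on `Enumerable.SequenceEqual`, which compares element by element in order and uses the type's `IEquatable<T>` implementation through the default equality comparer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

sealed class Customer : IEquatable<Customer>
{
    public int Id { get; set; }
    public string Name { get; set; }

    // Equality is defined by the publicly visible data only,
    // not by whatever internal state a serializer would capture.
    public bool Equals(Customer other)
    {
        return other != null && Id == other.Id && Name == other.Name;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as Customer);
    }

    // Simplistic hash for illustration; equal customers produce equal hashes.
    public override int GetHashCode()
    {
        return Id ^ (Name == null ? 0 : Name.GetHashCode());
    }
}

static class CustomerLists
{
    // Hypothetical helper: equal elements in the same positions.
    public static bool SameSequence(IEnumerable<Customer> x, IEnumerable<Customer> y)
    {
        return x.SequenceEqual(y);
    }
}
```

With this in place, two lists holding `{Id=2, Name="abc"}` followed by `{Id=3, Name="def"}` compare as equal, while the same two customers in swapped order do not. That matches the definition of equality given in the question without depending on how any serializer lays out the objects.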

Aaronaught
Regarding point 1), a non-serializable type in the graph will throw an exception.
Henk Holterman
+2  A: 

You are asking for a guarantee that the serialized representation will match. That's going to be awfully hard to come by; BinaryFormatter is a complicated class. Serialized structures that have alignment padding, in particular, could be a problem.

What's much simpler is to provide an example where it won't match. System.Decimal has different byte patterns for values like 0.01M and 0.010M. Its operator==() will say they are equal; its serialized data won't.
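The decimal case can be checked without involving a serializer at all. The sketch below uses `decimal.GetBits` as a stand-in for "the internal representation"; it makes no claim about how BinaryFormatter actually encodes a decimal, only that the two values carry different internal state (a different scale) while comparing as equal. The class and method names are invented for the example.

```csharp
using System.Linq;

static class DecimalScaleDemo
{
    // Value comparison: the scale is ignored, only the numeric value matters.
    public static bool ValuesEqual(decimal x, decimal y)
    {
        return x == y;
    }

    // Representation comparison: GetBits exposes the 96-bit integer
    // plus the flags word that holds the scale and sign.
    public static bool BitsEqual(decimal x, decimal y)
    {
        return decimal.GetBits(x).SequenceEqual(decimal.GetBits(y));
    }
}
```

`DecimalScaleDemo.ValuesEqual(0.01M, 0.010M)` is true, while `DecimalScaleDemo.BitsEqual(0.01M, 0.010M)` is false, because 0.01M is stored as 1 with scale 2 and 0.010M as 10 with scale 3.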

Hans Passant