After seeing this question, my first thought was that it'd be trivial to define generic equivalence and relational operators:

#include <cstring>

template<class T>
bool operator==(const T& a, const T& b) {
    return std::memcmp(&a, &b, sizeof(T)) == 0;
}

template<class T>
bool operator<(const T& a, const T& b) {
    return std::memcmp(&a, &b, sizeof(T)) < 0;
}

A using namespace std::rel_ops directive would then become even more useful, since the default implementations of operator== and operator< would make it fully generic. Obviously this performs a bitwise comparison rather than a memberwise one, as though the type contained only POD members. That is not entirely consistent with how C++ generates copy constructors, for instance, which do perform memberwise copying.
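For readers unfamiliar with std::rel_ops, here is a minimal sketch of what it derives from a hand-written operator== and operator<. The Version type, its members, and check_rel_ops are hypothetical names chosen only for illustration, and note that std::rel_ops is deprecated as of C++20.

```cpp
#include <cassert>
#include <utility>  // std::rel_ops

// Hypothetical type, used only for illustration.
struct Version {
    int hi;
    int lo;
};

// Memberwise equality and ordering, written by hand.
inline bool operator==(const Version& a, const Version& b) {
    return a.hi == b.hi && a.lo == b.lo;
}
inline bool operator<(const Version& a, const Version& b) {
    return a.hi < b.hi || (a.hi == b.hi && a.lo < b.lo);
}

// std::rel_ops supplies !=, >, <= and >= in terms of the two operators above.
using namespace std::rel_ops;

// Small self-check; call it from main() to exercise the derived operators.
inline void check_rel_ops() {
    const Version v1{1, 2};
    const Version v2{1, 3};
    assert(v1 != v2);  // implemented as !(v1 == v2)
    assert(v2 > v1);   // implemented as v1 < v2
    assert(v1 <= v2);  // implemented as !(v2 < v1)
    assert(v2 >= v1);  // implemented as !(v2 < v1)
}
```

With the generic memcmp-based operators from the question in scope, the same four derived operators would become available for every type, which is the appeal described above.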

But I wonder whether the above implementation is indeed safe. The structures would naturally have the same packing, being of the same type, but are the contents of the padding guaranteed to be identical (e.g., filled with zeros)? Are there any reasons why or situations in which this wouldn't work?

+2  A: 

That's highly dangerous because the compiler will use these definitions not only for plain old structs, but also for any classes, however complex, for which you forgot to define == and < properly.

One day, it will bite you.
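One way to limit the damage, sketched here under the assumption that C++17 is available, is to constrain the template so it only participates for types whose value is fully determined by their bytes. The names bitwise_equal, TwoInts, CharInt, and check_bitwise_equal are hypothetical.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Hypothetical helper: only enabled when T is trivially copyable and equal
// values always have identical bytes (so no padding and no floating point).
template <class T>
std::enable_if_t<std::has_unique_object_representations_v<T>, bool>
bitwise_equal(const T& a, const T& b) {
    return std::memcmp(&a, &b, sizeof(T)) == 0;
}

struct TwoInts { int x; int y; };   // no padding on common ABIs
struct CharInt { char c; int i; };  // padding after c on common ABIs

// Small self-check; call it from main() to exercise the helper.
inline void check_bitwise_equal() {
    const TwoInts p{1, 2};
    const TwoInts q{1, 2};
    const TwoInts r{1, 3};
    assert(bitwise_equal(p, q));
    assert(!bitwise_equal(p, r));
    // On common ABIs, bitwise_equal(CharInt{}, CharInt{}) and
    // bitwise_equal(1.0, 1.0) do not compile: the trait is false for them.
}
```

A class with a std::string member, for instance, is not trivially copyable, so a forgotten operator== becomes a compile error instead of a silent bitwise comparison.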

Alexander Gessler
You could say exactly the same thing about default copy constructors and assignment operators. How dangerous this is from a design standpoint isn't really my question. How dangerous it is from an undefined/unspecified/implementation-defined behaviour standpoint is.
Jon Purdy
A: 

A lot depends on your definition of equivalence, for example when any of the members you are comparing within your classes are floating-point numbers.

The above implementation may treat two doubles as unequal even though they came from the same mathematical calculation with the same inputs, because the two evaluations may not have produced exactly the same output, only two very similar numbers.

Typically such numbers should be compared numerically with an appropriate tolerance.
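A minimal sketch of such a tolerance-based comparison follows. The nearly_equal name and the default tolerances are assumptions for illustration; suitable tolerances depend on the calculation.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical helper: true when a and b differ by at most rel_tol times the
// larger magnitude, or by at most abs_tol (useful when both are near zero).
inline bool nearly_equal(double a, double b,
                         double rel_tol = 1e-9, double abs_tol = 0.0) {
    const double diff = std::fabs(a - b);
    const double scale = std::max(std::fabs(a), std::fabs(b));
    return diff <= std::max(rel_tol * scale, abs_tol);
}

// Small self-check; call it from main() to exercise the helper.
inline void check_nearly_equal() {
    const double sum = 0.1 + 0.2;     // typically not exactly 0.3 in binary floating point
    assert(nearly_equal(sum, 0.3));   // equal within the relative tolerance
    assert(!nearly_equal(1.0, 1.1));  // clearly different values
}
```

A memcmp-based operator== has no way to express this kind of equivalence, since it sees only the raw bytes.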

morechilli
+4  A: 

Even for POD types, this operator== can give the wrong answer, because alignment can introduce padding bytes that take part in the bitwise comparison. The following structure, for example, takes 8 bytes on my compiler.

struct Foo {
  char foo; // three padding bytes between foo and bar
  int bar;
};
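For comparison, a memberwise version never looks at the padding. This is only a sketch, assuming C++11 for std::tie; gap_after_foo and check_foo are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>  // offsetof, std::size_t
#include <tuple>    // std::tie

struct Foo {
    char foo;
    int bar;
};

// Number of bytes between the end of foo and the start of bar. Their contents
// are not part of the value, so a bitwise comparison may see differences here.
constexpr std::size_t gap_after_foo = offsetof(Foo, bar) - sizeof(char);

// Memberwise comparison: padding never takes part.
inline bool operator==(const Foo& a, const Foo& b) {
    return std::tie(a.foo, a.bar) == std::tie(b.foo, b.bar);
}
inline bool operator<(const Foo& a, const Foo& b) {
    return std::tie(a.foo, a.bar) < std::tie(b.foo, b.bar);
}

// Small self-check; call it from main() to exercise the operators.
inline void check_foo() {
    const Foo a{'x', 1};
    const Foo b{'x', 1};
    const Foo c{'x', 2};
    assert(a == b);
    assert(a < c);
    assert(!(c < a));
}
```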
tibur
+12  A: 

No. For example, if T is float, double, or long double, your operator== doesn't work right. Two NaNs should never compare as equal, even if they have the identical bit pattern (in fact, one common method of detecting a NaN is to compare the number to itself: if it's not equal to itself, it's a NaN). Conversely, +0.0 and -0.0 must compare equal even though their bit patterns differ in the sign bit.
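Both floating-point mismatches are easy to see in a short sketch, assuming IEEE 754 doubles. The bitwise_equal and check_double_cases names are hypothetical; bitwise_equal mirrors the memcmp-based operator== for a single double.

```cpp
#include <cassert>
#include <cstring>
#include <limits>

// Hypothetical helper mirroring the memcmp-based operator== for one double.
inline bool bitwise_equal(double a, double b) {
    return std::memcmp(&a, &b, sizeof(double)) == 0;
}

// Small self-check; call it from main(). Assumes IEEE 754 doubles.
inline void check_double_cases() {
    const double qnan = std::numeric_limits<double>::quiet_NaN();
    assert(bitwise_equal(qnan, qnan));  // same bits...
    assert(!(qnan == qnan));            // ...yet a NaN never equals itself

    const double pos_zero = 0.0;
    const double neg_zero = -0.0;
    assert(pos_zero == neg_zero);                // equal as numbers...
    assert(!bitwise_equal(pos_zero, neg_zero));  // ...yet the sign bit differs
}
```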

Your operator< has even less chance of working correctly. For example, consider a typical implementation of std::string that looks something like this:

template <class charT>
class string { 
    charT *data;
    size_t length;
    size_t buffer_size;
public:
    // ...
};

With this ordering of the members, your operator< will do its comparison based on the addresses of the buffers where the strings happen to have stored their data. If, for example, it happened to have been written with the length member first, your comparison would use the lengths of the strings as the primary keys. In any case, it won't do a comparison based on the actual string contents, because it will only ever look at the value of the data pointer, not whatever it points at, which is what you really want/need.
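The pointer problem can be sketched with a stripped-down stand-in for a string class. SimpleString, bitwise_equal, content_equal, and check_simple_string are hypothetical names, not part of any real library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical, simplified stand-in for a string class: a pointer plus a length.
struct SimpleString {
    const char* data;
    std::size_t length;
};

// What a generic memcmp-based operator== would do: compare the pointer values.
inline bool bitwise_equal(const SimpleString& a, const SimpleString& b) {
    return std::memcmp(&a, &b, sizeof(SimpleString)) == 0;
}

// What is usually wanted: compare the characters being pointed to.
inline bool content_equal(const SimpleString& a, const SimpleString& b) {
    return a.length == b.length && std::memcmp(a.data, b.data, a.length) == 0;
}

// Small self-check; call it from main() to exercise both comparisons.
inline void check_simple_string() {
    char first[] = "hello";
    char second[] = "hello";  // same text, stored at a different address
    const SimpleString a{first, 5};
    const SimpleString b{second, 5};
    assert(content_equal(a, b));   // the strings hold the same characters
    assert(!bitwise_equal(a, b));  // but the data pointers differ
}
```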

Edit: As far as padding goes, there's no requirement that the contents of padding be equal. It's also theoretically possible for padding to be some sort of trap representation that will cause a signal, throw an exception, or something on that order, if you even try to look at it at all. To avoid such trap representations, you need to look at the object as a buffer of unsigned chars. memcmp is specified to interpret the bytes it compares as unsigned char, so the read itself is safe, but the result still depends on whatever happens to be in the padding.

Also note that being the same type of object does not necessarily mean they use the same alignment of members. That's a common method of implementation, but it's also entirely possible for a compiler to do something like using different alignments based on how often it "thinks" a particular object will be used, and include a tag of some sort in the object (e.g., a value written into the first padding byte) that tells the alignment for this particular instance. Likewise, it could segregate objects by (for example) address, so an object located at an even address has 2-byte alignment, at an address that's a multiple of four has 4-byte alignment, and so on (this can't be used for POD types, but otherwise, all bets are off).

Neither of these is likely or common, but offhand I can't think of anything in the standard that prohibits them either.

Jerry Coffin
I'm perfectly aware that it probably doesn't *work* correctly for most situations. What I'm asking is whether it invokes undefined, unspecified, or implementation-defined behaviour to perform a bitwise comparison of structures under the (possibly incorrect) assumption that they are POD.
Jon Purdy
How sure are you that individual instances of a type can have unique alignment requirements? At the very least, that would affect the object's size, which is definitely not allowed. I can see where it might be possible, it just doesn't seem to jibe with my understanding of the alignment rules.
Dennis Zickefoose
@Dennis: well, I'm pretty sure there's nothing in the standard to specifically allow it, but offhand I can't think of anything that would prohibit it either. The only time it seems like it could be useful would be when it was used as a base class, which loosens rules about sizeof(base_subobject). Changing the alignment doesn't necessarily change the size either -- it just changes where you put the padding (at the end of a struct vs. between members) -- though it does remove most of the motivation to do so.
Jerry Coffin
Yeah, the more I think about it, the less convinced I am that there's anything forbidding it. It would just be so terribly awkward to implement, I can't imagine it actually being beneficial, except maybe as a base class.
Dennis Zickefoose
A: 

Any struct or class containing even a single pointer will instantly fail any sort of meaningful comparison. Those operators will ONLY work for a class that is Plain Old Data, or POD. Other answers correctly point out floating-point members and padding bytes as cases where even that won't hold true.

Short answer: If this were a smart idea, the language would already provide it, as it does default copy constructors and assignment operators.

DeadMG