I have two pointers to objects and I want to test whether they point to the exact same object in the most robust manner. I explicitly do not want to invoke any operator== overloads, and I want it to work no matter what base classes, virtual base classes and multiple inheritance are used.

My current code is this:

((void*)a) == ((void*)b)

And for my case this works. However, that doesn’t work for this case:

class B1 {};
class B2 {};
class C : public B1, public B2 {};

C c;
B1 *a = &c;
B2 *b = &c;

Subbing in reinterpret_cast, static_cast or dynamic_cast doesn't work either.


Particularly I'm hoping for something that ends up really simple and efficient. Ideally it wouldn't require any branch instructions to implement and would do something like, adjust the pointer to the start of the object and compare.
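To make the failing case reproducible regardless of whether the compiler applies the empty base class optimization, here is a minimal sketch. The `char` members and the helper names `raw_equal` and `failing_case_example` are assumptions added for illustration; they are not part of my real code.

```cpp
#include <cassert>

struct B1 { char x; };   // non-empty, so B1 and B2 cannot share an address
struct B2 { char y; };
struct C : B1, B2 {};

// The test from the question: compare the raw addresses held by the pointers.
inline bool raw_equal(const void* a, const void* b) { return a == b; }

// Two pointers into one object that the raw comparison reports as different.
inline void failing_case_example() {
    C c;
    B1* a = &c;   // address of the B1 subobject
    B2* b = &c;   // address of the B2 subobject, at a different offset
    assert(!raw_equal(a, b));
}
```

Both pointers refer to the same `C`, yet the comparison is false because each one holds the address of a different base subobject.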

+1  A: 

Except for smart pointers (which aren't really pointers, but class objects), overloading operator== isn't possible for pointers, so a cast shouldn't be necessary.

Of course, comparing pointers of different types might not work. Why do you think you need to do that?

sbi
I'm guessing they're trying to compare two interface pointers to see if they're pointing to the same object.
Mark Ransom
I'm looking for a boilerplate expression that I can use in any situation.
BCS
A: 

You could check whether the objects pointed to overlap in memory (stealing from Mark Ransom's answer). This invokes undefined behavior, but should do what you want on any reasonable compiler:

template <typename T1, typename T2>
bool is(const T1 *left, const T2 * right)
{
    const char *left_begin=reinterpret_cast<const char*>(left);
    const char *left_end=left_begin+sizeof(*left);

    const char *right_begin=reinterpret_cast<const char*>(right);
    const char *right_end=right_begin+sizeof(*right);

    return ( left_begin  <= right_begin && right_begin < left_end) ||
           ( right_begin <= left_begin  && left_begin < right_end);
}
Managu
I don't think that would work if the references are to independent base classes as they will (I think) be adjacent rather than overlapping
BCS
This returns false for me in Visual Studio 2005 for the example given. The values are: left_begin = 1245027; left_end = 1245028; right_begin = 1245028; right_end = 1245029
garethm
This might also have problems with multiple inheritance.
Managu
A: 

Your approach worked for me even in your cited case:

#include <cstdio>

class B1 {};
class B2 {};
class C : public B1, public B2 {};

int main() {
  C c;
  B1 *a = &c;
  B2 *b = &c;

  // Compare the raw addresses held by the two base class pointers.
  if ((void*)a == (void*)b) {
    std::printf("equal");
  }
  else {
    std::printf("not equal!");
  }
}

prints "equal" here..

Tarnschaf
You need virtual functions in B1 and B2 to see the difference.
Mark Ransom
what are you running on? I get that on GCC but it fails under DMC (from digitalmars)
BCS
g++ did empty base class optimization there, apparently. Just stick a `char` field in both `B1` and `B2`.
Pavel Minaev
+3  A: 

There's no general way to do it. Base class subobjects, in general, have no knowledge that they are that, so if you only have a pointer to a base class subobject, you do not have any means to obtain a pointer to the most derived object to which it belongs, if you do not know the type of the latter in advance.

Let's start with this:

 struct B1 { char b1; };
 struct B2 { char b2; };
 struct D : B1, B2 { char d; };

 // and some other type...
 struct Z : B2, B1 { };

Consider a typical implementation of in-memory layout of D. In the absence of vtable, the only thing we have is the raw data (and possibly padding):

       Offset    Field
       ------    -----
    /       0    b1     >- B1
 D-<        1    b2     >- B2
    \       2    d

You have two pointers:

B1* p1;
B2* p2;

Each pointer effectively points at a single char within an instance of D. However, if you do not know that in advance, how could you tell? There's also a possibility that pointers could rather point to subobjects within an instance of Z, and looking at pointer values themselves, there's clearly no way to tell; nor is there anything you (or the compiler) can deduce from data referenced by the pointer, since it's just a single byte of data in the struct.
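To make the layout argument concrete, here is a small sketch that measures where each base subobject sits inside a complete object. `base_offset` and `layout_example` are hypothetical helpers, not part of any library. The concrete offsets are implementation-specific, so the code only relies on the two offsets within one object being different.

```cpp
#include <cassert>
#include <cstddef>

struct B1 { char b1; };
struct B2 { char b2; };
struct D : B1, B2 { char d; };
struct Z : B2, B1 { };

// Byte offset of the Base subobject inside obj. The adjustment applied by
// static_cast is derived from the static types only; nothing stored in the
// object records it.
template <class Base, class Derived>
std::ptrdiff_t base_offset(const Derived& obj) {
    const char* whole = reinterpret_cast<const char*>(&obj);
    const char* part  = reinterpret_cast<const char*>(static_cast<const Base*>(&obj));
    return part - whole;
}

inline void layout_example() {
    D d;
    Z z;
    // Within one object the two non-empty bases occupy different bytes.
    assert(base_offset<B1>(d) != base_offset<B2>(d));
    assert(base_offset<B1>(z) != base_offset<B2>(z));
}
```

On a typical implementation the offsets would be 0 and 1 in `D` and swapped in `Z`, which is exactly why a lone `B1*` cannot tell you how far away the start of the enclosing object is.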

Pavel Minaev
I'd say that in essence, the problem here is that in C++ `(B1*)c` and `(B2*)c` aren't the same object, as far as those types know, but the questioner wants to define "same object" to mean that they are (because class C knows they are). I guess you could define a virtual function GetMostDerivedPtr in all your interfaces, and have every class implement it (perhaps using a CRTP helper) to return `(void*)this` (`(void*)static_cast<T*>(this)` with the helper).
Steve Jessop
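For illustration, the GetMostDerivedPtr idea from the comment above could look roughly like this. It is only a sketch: `MostDerived` is a made-up name for the CRTP helper, `same_object` is a made-up comparison function, the code assumes C++11, and it assumes you are able to add the virtual function to every interface.

```cpp
#include <cassert>

struct B1 {
    virtual const void* GetMostDerivedPtr() const = 0;
    virtual ~B1() {}
    char b1;
};
struct B2 {
    virtual const void* GetMostDerivedPtr() const = 0;
    virtual ~B2() {}
    char b2;
};

// Hypothetical CRTP helper: implements the function once for all interfaces.
template <class T, class... Interfaces>
struct MostDerived : Interfaces... {
    const void* GetMostDerivedPtr() const override {
        return static_cast<const T*>(this);   // address of the complete T
    }
};

struct D : MostDerived<D, B1, B2> {};

// Both pointers must be non-null, since a virtual call is made through each.
template <class L, class R>
bool same_object(const L* l, const R* r) {
    return l->GetMostDerivedPtr() == r->GetMostDerivedPtr();
}

inline void most_derived_example() {
    D d, other;
    B1* a = &d;
    B2* b = &d;
    B2* c = &other;
    assert(same_object(a, b));    // two views of d
    assert(!same_object(a, c));   // d versus other
}
```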
Well, they are base class subobjects of the same most derived object, and the meaning of that is well defined in ISO C++... it's just not something that can be done within the constraints given.
Pavel Minaev
The compiler does know in advance which subobject each pointer goes to, supposing that they are subobjects. If you cast both of those pointers to a D they will come out equal. The only problem is introducing some type to which both may be cast. This may be done with either a virtual base or template metaprogramming.
Potatoswatter
That's what I mean. Offset of base class subobject is compile-time info and is not persisted at runtime, so when all you have is a pointer (and he specifically mentions in the question that you can't have vtable etc - just a pointer to any random class), there's nowhere to pull that info out.
Pavel Minaev
He doesn't say you can't add a vtable, rather he says that he wants it to work with any preexisting inheritance structure that may or may not include one. His only specification is no branch instructions while comparing.
Potatoswatter
A: 

So, you're looking for a compile time solution. I don't believe this is possible as stated in C++. Here's a thought experiment:

File Bases.hpp:

class B1 {int val1;};
class B2 {int val2;};

File Derived.hpp:

#include <Bases.hpp>
class D : public B1, public B2 {};

File Composite.hpp:

#include <Bases.hpp>
class C
{
   B1 b1;
   B2 b2;
};

File RandomReturn.cpp:

#include <Composite.hpp>
#include <Derived.hpp>
#include <cstdlib>

static D derived;
static C composite;

void random_return(B1*& left, B2*& right)
{
    if (std::rand() % 2 == 0)
    {
        left=static_cast<B1*>(&derived);
        right=static_cast<B2*>(&derived);
    }
    else
    {
        left=&composite.b1;
        right=&composite.b2;
    }
}

Now, suppose you have:

#include <Bases.hpp>
#include <iostream>

extern void random_return(B1*& , B2*& );

// some conception of "is_same_object"    
template <...> bool is_same_object(...) ...

int main()
{
    B1 *left;
    B2 *right;

    random_return(left,right);
    std::cout<<is_same_object(left,right)<<std::endl;
}

How could we possibly implement is_same_object here, at compile time, without knowing anything about class C and class D?

On the other hand, if you're willing to change the hypotheses, it should be workable:

class base_of_everything {};
class B1 : public virtual base_of_everything {};
class B2 : public virtual base_of_everything {};

class D : public B1, public B2, public virtual base_of_everything {};

...
// check for same object
D d;
B1 *b1=static_cast<B1*>(&d);
B2 *b2=static_cast<B2*>(&d);

if (static_cast<base_of_everything*>(b1)==static_cast<base_of_everything*>(b2))
{
    ...
}
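
Spelled out as a self-contained sketch: the `char` members, the `same_object` helper and `virtual_base_example` are additions for illustration. With virtual inheritance the upcast may involve a null check or an indirection, so this is not necessarily branch-free.

```cpp
#include <cassert>

struct base_of_everything {};
struct B1 : virtual base_of_everything { char b1; };
struct B2 : virtual base_of_everything { char b2; };
struct D : B1, B2 { char d; };

// Callers pass B1* or B2*; the implicit conversion to the virtual base lands
// on the single shared base_of_everything subobject of the complete object.
inline bool same_object(const base_of_everything* l, const base_of_everything* r) {
    return l == r;
}

inline void virtual_base_example() {
    D d1, d2;
    B1* a = &d1;
    B2* b = &d1;
    B2* c = &d2;
    assert(same_object(a, b));    // both are views of d1
    assert(!same_object(a, c));   // d1 versus d2
}
```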
Managu
What's `virtual class`? ;)
Pavel Minaev
oh, did I get that wrong. I don't do virtual inheritance much.
Managu
+8  A: 

If your classes are genuinely exactly as given then it's impossible as there's not enough information available at runtime to reconstruct the required information.

If they're actually polymorphic classes, with virtual functions, it sounds like dynamic_cast<void *> is the answer. It returns a pointer to the most derived object. Your check would then be dynamic_cast<void *>(a)==dynamic_cast<void *>(b).

See paragraph 7 here:

http://www.csci.csusb.edu/dick/c++std/cd2/expr.html#expr.dynamic.cast

I suspect the usual dynamic_cast issues apply -- i.e., no guarantee it will be quick, and your classes will have to be polymorphic.

This is not a feature I have used myself, I'm afraid -- but I have seen it suggested often enough by people who have that I infer it is widely-supported and works as advertised.
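
A minimal sketch of that check, assuming the bases are made polymorphic by giving them virtual destructors. The `same_object` and `dynamic_cast_example` names and the `char` members are made up for illustration.

```cpp
#include <cassert>

struct B1 { virtual ~B1() {} char b1; };
struct B2 { virtual ~B2() {} char b2; };
struct C : B1, B2 {};

// dynamic_cast to (cv) void* yields the address of the most derived object.
// It only compiles for pointers to polymorphic types; a null pointer stays null.
template <class T1, class T2>
bool same_object(const T1* a, const T2* b) {
    return dynamic_cast<const void*>(a) == dynamic_cast<const void*>(b);
}

inline void dynamic_cast_example() {
    C c, other;
    B1* a = &c;
    B2* b = &c;
    B2* d = &other;
    assert(static_cast<const void*>(a) != static_cast<const void*>(b)); // raw addresses differ
    assert(same_object(a, b));     // same most derived object
    assert(!same_object(a, d));    // different objects
}
```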

brone
Yup, it would still require the class to be polymorphic (i.e. have vtable).
Pavel Minaev
But that's the price to pay, i would say. And indeed, if he really has multiple inheritance like that, it sounds like he's got polymorphic classes anyway. If not, a simple `virtual ~B1() {}` and same for `B2` will be all fine. +1 indeed - dammit i saw the red bar of danger while writing something like this answer xD
Johannes Schaub - litb
I'll accept the no solution in the non virtual case and this seems to be the best in the virtual case.
BCS
+3  A: 

There's an easy way and a hard way.

The easy way is to introduce an empty virtual base class. Every object inheriting from such a class gets a pointer to the common point in the "real" object, which is what you want. The pointer has a little overhead but there are no branches or anything.

class V {};
class B1 : public virtual V {}; // sizeof(B1) = sizeof(void*)
class B2 : public virtual V {}; // sizeof(B2) = sizeof(void*)
class D : public B1, public B2 {}; // sizeof(D) = 2*sizeof(void*)

bool same( V const *l, V const *r ) { return l == r; }

The hard way is to try to use templates. There are a few hacks here already… when hacking with templates remember that you are essentially reinventing part of the language, just with hopefully lower overhead by managing compile time information. Can we lower the overhead of the virtual base class and eliminate that pointer? It depends on how much generality you need. If your base classes can be arranged in several different ways within a derived object, then there is certainly information that you can't get at compile time.

But if your inheritance hierarchy is an upside-down tree (ie, you are building large objects by lots of multiple inheritance), or several such trees, you can just go ahead and cast the pointers to the most derived type like this:

class D; // forward declare most derived type
class T { public: typedef D base_t; }; // base class "knows" most derived type
class B1: public T { int a; };
class B2: public T { int b; };
class D: public B1, public B2 { int c; };

 // smart comparison function retrieves most-derived type
 // and performs simple addition to make base pointers comparable
 // (if that is not possible, it fails to compile)
template< class ta, class tb >
bool same( ta const *l, tb const *r ) {
        return static_cast< typename ta::base_t const * >( l )
         == static_cast< typename tb::base_t const * >( r );
}

Of course, you don't want to pass NULL pointers to this "optimized" version.

Potatoswatter
Nice writeup, but you didn't answer his question. Quote: " I explicitly do not want to invoke any operator == overloads and I want it to work no matter what base classes, virtual base classes and multiple inheritance is used.". I take it to mean that he wants to be able to take any two pointers to any two random types - which he doesn't know nor can modify - and have them compared for equality.
Pavel Minaev
I guess, the most I can do is show what's possible and explain what's not… I don't think he's yet really specified that he can't modify the classes. His words "are the same object" don't really apply to his question, as two base classes may be *in* the same object but will have entirely exclusive interfaces—and hence be totally different objects—unless they share a virtual base.
Potatoswatter
@Pavel: As a matter of fact, I interpreted this as: I don't want to rely on something out of my control, i.e. the way the compiler uses the address space. There are barely any guarantees on that in the standard, I guess.
xtofl
A: 

Use boost::addressof. I think this is the best way. boost::addressof exists to obtain an object's address regardless of any uses or misuses of operator overloading. Through some clever internal machinery, the function template addressof makes sure it gets at the actual object and its address. For example:

#include "boost/utility.hpp"

class some_class {};

int main() {
  some_class s;
  some_class* p=boost::addressof(s);
}
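
For reference, a sketch of what an addressof-style function buys you, written with std::addressof from <memory> (the C++11 standard counterpart of boost::addressof). `Sneaky`, `B1`, `B2`, `C`, `addressof_equal` and `addressof_example` are made-up names for illustration. It bypasses an overloaded operator&, but it still returns the address of whichever subobject the reference designates.

```cpp
#include <cassert>
#include <memory>

struct Sneaky {
    int payload;
    Sneaky* operator&() { return nullptr; }   // deliberately misbehaving overload
};

struct B1 { char b1; };
struct B2 { char b2; };
struct C : B1, B2 {};

// Compares the true addresses of the two referenced subobjects.
inline bool addressof_equal(const B1& a, const B2& b) {
    return static_cast<const void*>(std::addressof(a))
        == static_cast<const void*>(std::addressof(b));
}

inline void addressof_example() {
    Sneaky s{42};
    assert(&s == nullptr);                     // the overload lies
    assert(std::addressof(s) != nullptr);      // addressof ignores the overload
    assert(std::addressof(s)->payload == 42);

    C c;
    assert(!addressof_equal(c, c));            // same object, different base addresses
}
```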
Davit Siradeghyan
Potatoswatter
A: 

If you need to compare the identity of your objects, why wouldn't you give them one? After all, it's you who decides what makes the identity of the object. Let the compiler do it, and you're bound to the compiler's limitations.

Something in the way of...

class identifiable {
    public:
    long long const /*or whatever type*/ identity;
    static long long sf_lFreeId() { 
       static long long lFreeId = 0;
       return lFreeId++; // not thread-safe, yet
    }
    identifiable(): identity( sf_lFreeId() ) {}
    bool identical( const identifiable& other ) const { 
      return identity == other. identity;
    }
};

class A : public identifiable {
};

....

A a1, a2;
A& a3 = a1;

assert( !a1.identical(a2) );
assert( a1.identical( a3 ) );
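
A hedged variant of the same idea, assuming C++11. It makes three changes that are my assumptions rather than requirements: the counter is a std::atomic so that construction from several threads hands out distinct ids, a copy receives a fresh id, and the bases inherit `identifiable` virtually so that an object with several bases carries exactly one id. `B1`, `B2`, `C`, `next_id` and `identity_example` are illustrative names.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

class identifiable {
public:
    identifiable() : identity_(next_id()) {}
    // Assumption: a copy is a new object, so it gets its own id.
    identifiable(const identifiable&) : identity_(next_id()) {}
    // Assignment changes the value, not the identity.
    identifiable& operator=(const identifiable&) { return *this; }

    bool identical(const identifiable& other) const {
        return identity_ == other.identity_;
    }

private:
    static std::size_t next_id() {
        static std::atomic<std::size_t> counter{0};
        return counter.fetch_add(1, std::memory_order_relaxed);
    }
    const std::size_t identity_;
};

// Virtual inheritance keeps a single identifiable subobject per complete object.
struct B1 : virtual identifiable { char b1; };
struct B2 : virtual identifiable { char b2; };
struct C : B1, B2 {};

inline void identity_example() {
    C c, other;
    B1& a = c;
    B2& b = c;
    B2& d = other;
    assert(a.identical(b));      // two views of c share one id
    assert(!a.identical(d));     // c and other have different ids
    C copy(c);
    assert(!copy.identical(c));  // the copy is a distinct object
}
```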
xtofl
There's no reason to put a unique ID inside an object when a pointer is already a unique value that consumes no memory, and you need the pointer to retrieve the ID anyway. Also, you should use size_t for such IDs: it is the correct size on all platforms and "long long" is nonstandard.
Potatoswatter
@Potatoswatter: An object can reside on different computers (e.g. CORBA, JRI, COM+...), so its identity is neither limited to nor defined by the address space you see it in. The question already states that this view has limitations with multiple inheritance. It has others, too.
xtofl