When using multiple inheritance, C++ has to maintain several vtables, which leads to having "several views" of common base classes.

Here's a code snippet:

#include "stdafx.h"
#include <Windows.h>

void dumpPointer( void* pointer )
{
    __int64 thisPointer = reinterpret_cast<__int64>( pointer );
    char buffer[100];
    _i64toa( thisPointer, buffer, 10 );
    OutputDebugStringA( buffer );
    OutputDebugStringA( "\n" );
}

class ICommonBase {
public:
    virtual void Common() = 0;
};

class IDerived1 : public ICommonBase {
};

class IDerived2 : public ICommonBase {
};

class CClass : public IDerived1, public IDerived2 {
public:
    virtual void Common() {
        dumpPointer( this );
    }
    int stuff;
};

int _tmain(int argc, _TCHAR* argv[])
{
    CClass* object = new CClass();
    object->Common();
    ICommonBase* casted1 = static_cast<ICommonBase*>( static_cast<IDerived1*>( object ) );
    casted1->Common();
    dumpPointer( casted1 );

    ICommonBase* casted2 = static_cast<ICommonBase*>( static_cast<IDerived2*>( object ) );
    casted2->Common();
    dumpPointer( casted2 );

    return 0;
}

It produces the following output:

206968 //CClass::Common this
206968 //(ICommonBase)IDerived1::Common this
206968 //(ICommonBase)IDerived1* casted1
206968 //(ICommonBase)IDerived2::Common this
206972 //(ICommonBase)IDerived2* casted2

Here casted1 and casted2 have different values, which is reasonable since they point to different subobjects. At the point when the virtual function is called, the cast to the base class has already been done and the compiler doesn't know that it was originally a most-derived class. Still, this is the same each time. How does that happen?

+1  A: 

If you look at the object layout in memory, it will be something like this:

v-pointer for IDerived1
v-pointer for IDerived2
....
....

The order could also be the other way around; this is just to give an idea.

Your this will always point to the start of the object, i.e. where the v-pointer for IDerived1 is stored. However, when you cast the pointer to IDerived2, the resulting pointer will point to the v-pointer for IDerived2, which is offset by sizeof(pointer) from the this pointer.

Naveen
+3  A: 

When an object pointer is cast to a different type, the offsets of fields as well as entries in the vtable have to be in a consistent place. Code that takes an ICommonBase* as a parameter doesn't know that your object is really an IDerived2. Yet it still should be able to dereference ->foo or call the virtual method bar(). If these aren't at predictable addresses, that cannot work.

For the single inheritance case, this is easy to get right. If Derived inherits from Base, you can just say that offset 0 of Derived is also offset 0 of Base, and the members unique to Derived can go after the last member of Base. For multiple inheritance obviously that can't work because the first byte of Base1 can't also be the first byte of Base2. Each one needs its own space.

So if you had such a class that inherits from two (call it Foo), the compiler can know that for the type Foo, the Base1 part starts at offset X, and the Base2 part starts at offset Y. When casting to either type, the compiler can just add the appropriate offset to this.

When an actual virtual method of Foo is called, where the implementation is provided by Foo, it still needs the "real" pointer to the object, so that it can access all of its members, not just the particular instance of the base Base1 or Base2. Hence this still needs to point to the "real" object.

Note that the implementation details may differ from what is described here; this is just a high-level description of why the problem exists.

asveikau
+1  A: 

G++ 4.3 uses the same object model as the one you show (see Naveen's answer); that is, casted1 and casted2 have different values.

Furthermore, in G++ 4.3, even if you use brutal casting:

ICommonBase* casted1 = (ICommonBase*)(IDerived1*)object; 
ICommonBase* casted2 = (ICommonBase*)(IDerived2*)object;

the result is the same.

The compiler is very clever.

EffoStaff Effo
Not really "brutal" casting, they're `static_cast`s. Try `reinterpret_cast` if you want to see it fail.
MSalters
+3  A: 

When a virtual function is called on a class that uses multiple inheritance, the call will often go to a 'thunk' that adjusts the this pointer. In your example, the casted1 pointer's vtbl entry doesn't need a thunk because the IDerived1 sub-object of the CClass happens to coincide with the start of the CClass object (which is why the casted1 pointer value is the same as the CClass object pointer).

However, the casted2 pointer to the IDerived2 sub-object doesn't coincide with the start of the CClass object, so the vtbl function pointer actually points to a thunk instead of directly to the CClass::Common() function. The thunk adjusts the this pointer to point to the actual CClass object then jumps to the CClass::Common() function. So it will always get a pointer to the start of the CClass object, regardless of which type of sub-object pointer it might have been called from.

There's a very good explanation of this in Stanley Lippman's "Inside the C++ Object Model" book, section 4.2 "Virtual Member Functions/Virtual Functions Under MI".

Michael Burr
As an aside - this is one possible implementation of virtual functions with MI, but it's a common one, and the one apparently used by MSVC. However, even if this particular implementation isn't specified by the standard, it's still useful to know how this type of operation might work.
Michael Burr
@Michael, I take the description to be generic enough not to conflict with any possible implementation. Pointers into subobjects will not match, and before calling the virtual method the system must adapt the `this` pointer so that, when passed into the method, it refers to the beginning of the (sub)object where that method is implemented. How this is done, and what the 'thunk' would look like, are implementation-defined. (Also, I upvoted for the idea. I don't think that the vtable points to the thunk, but rather that it has info so that the pointer can be adjusted by the compiler at the place of call.)
David Rodríguez - dribeas
`this` is traditionally passed as the first (hidden) parameter to a method call bound to an object. So it's more like the method from the vtable is passed the 'thunk': there is only one vtable for all instances of the same class, so doing otherwise would be playing with fire in multithreaded environments.
Matthieu M.
@dribeas - having an adjustment in the vtbl entry is another way this can be implemented (and according to Lippman is how cfront did it), but it's not how MSVC does it (or IAR's compiler). If you think about it, a vtbl pointing to a thunk is logically equivalent to a vtbl with an adjustment offset in it. The 'thunk' variant has the advantage that there's no overhead for the common case of no adjustment necessary (the thunk is a no-op, so the vtbl can point to the actual function). Cfront used adjustment offsets in the vtbl because it's difficult to implement thunks in C.
Michael Burr
@Matthieu - there's nothing that says a class must have only one vtbl instance. But in the common case of single inheritance there's no reason to have multiple copies. There's no need to update vtbls - they can be statically initialized at program start (they can be in ROM), so there is no issue with multi-threading. In this scheme a plain-old `IDerived2` object will point to one vtbl, and an `IDerived2` object that's a super-class (or sub-object) of a `CClass` will point to a different vtbl.
Michael Burr