Hi all,

Consider:

class A
{
    public:
        virtual void update() = 0;
};

class B : public A
{
    public:
        void update() { /* stuff goes in here... */ }

    private:
        double a, b, c;
};

class C : public A { /* Same kind of thing as B, but with different update function/data members */ };

I'm now doing:

A * array = new A[1000];
array[0] = new B();
array[1] = new C();
//etc., etc.

If I call sizeof(B), the size returned is the size required by the 3 double members, plus some overhead required for the virtual function pointer table. Now, back to my code: it turns out that sizeof(myclass) is 32; that is, I am using 24 bytes for my data members and 8 bytes for the virtual function table (4 virtual functions). My question is: is there any way I can streamline this? My program will eventually use a heck of a lot of memory, and I don't like the sound of 25% of it being eaten by virtual function pointers.

+7  A: 

Typically, every instance of a class with at least one virtual function will have an extra pointer stored with its explicit data members.

There's no way round this, but remember that (again typically) each virtual function table is shared between all instances of the class, so there is no great overhead to having multiple virtual functions or extra levels of inheritance once you've paid the 'vptr tax'.

For larger classes the overhead becomes much smaller as a percentage.

If you want functionality that does something like what virtual functions do, you are going to have to pay for it in some way. Actually using native virtual functions may well be the cheapest option.

Charles Bailey
+22  A: 

The v-table is per class and not per object. Each object contains just a pointer to its v-table, so the overhead per instance is sizeof(pointer) (usually 4 bytes on a 32-bit platform and 8 bytes on a 64-bit one). The number of virtual functions has no effect on the sizeof the class object. Considering this, I don't think you should worry too much about it.
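To make the point about sizeof concrete, here is a minimal sketch. The type names (Plain, OneVirtual, FourVirtuals) are made up for illustration, and the results are what common implementations produce, not something the standard guarantees.

```cpp
#include <cstddef>

// Hypothetical types for illustration only.
struct Plain        { double a, b, c; };   // data only, no vptr

struct OneVirtual {
    virtual void f() {}
    double a, b, c;
};

struct FourVirtuals {
    virtual void f() {}
    virtual void g() {}
    virtual void h() {}
    virtual void i() {}
    double a, b, c;
};

// Per-instance cost of becoming polymorphic (vptr plus any padding).
constexpr std::size_t vptr_cost = sizeof(OneVirtual) - sizeof(Plain);

// True when extra virtual functions do not enlarge the object,
// which is the case on typical vtable-based implementations.
constexpr bool same_size = sizeof(OneVirtual) == sizeof(FourVirtuals);
```

Printing vptr_cost on your own compiler is a quick way to see what the 'tax' really is for your target.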

Ponting
@280Z28: thanks for the edit..
Ponting
Just to add: there's still one v-table per class, containing at least one pointer per virtual function.
Tobias Langner
+4  A: 

You have two options.

1) Don't worry about it.

2) Don't use virtual functions. However, not using virtual functions can just move the size into your code, as your code gets more complex.

FigBug
+2  A: 

The space cost of a vtable is one pointer (modulo alignment). The table itself is not placed into each instance of the class.

Nikolai N Fetissov
By putting the vtable pointer at the head of the class there are no alignment issues (which is why most compilers place it there).
Martin York
Alignment problems are still there if your class size was less than vtable pointer to begin with. Say `struct C { char c; }` will probably be of size 1, and any array of such will have them packed tightly. If you add a virtual member, on a 32-bit platform it will likely pad the resulting 5 chars of storage to 8, so now you suddenly have 3 chars wasted per object.
Pavel Minaev
+1  A: 

If you know all of the derived types and their respective update functions in advance, you could store the derived type in A, and implement manual dispatch for the update method.

However, as others are pointing out, you are really not paying that much for the vtable, and the tradeoff is code complexity (and depending on alignment, you might not be saving any memory at all!). Also, if any of your data members have a destructor, then you also have to worry about manually dispatching the destructor.

If you still want to go this route, it might look like this:

class A;
void dispatch_update(A &);

class A
{
public:
    A(char derived_type)
      : m_derived_type(derived_type)
    {}
    void update()
    {
        dispatch_update(*this);
    }
    friend void dispatch_update(A &);
private:
    char m_derived_type;
};

class B : public A
{
public:
    B()
      : A('B')
    {}
    void update() { /* stuff goes in here... */ }

private:
    double a, b, c;
};

void dispatch_update(A &a)
{
    switch (a.m_derived_type)
    {
    case 'B':
        static_cast<B &> (a).update();
        break;
    // ...
    }
}
Stjepan Rajko
A: 

How many instances of A-derived classes do you expect? How many distinct A-derived classes do you expect?

Note that even with a million instances, we are talking about a total of 32MB. Up to 10 million, don't sweat it.

Generally you need an extra pointer per instance (if you are running on a 32-bit platform, the last 4 bytes are due to alignment). Each class consumes additional (number of virtual functions * sizeof(virtual function pointer) + fixed size) bytes for its VMT.

Note that, considering alignment for the doubles, even a single byte as type identifier will bring up the array element size to 32. So Stjepan Rajko's solution is helpful in some cases, but not in yours.
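A small sketch of that alignment effect, using made-up names (Payload, TagBase, TaggedB) and assuming the one-byte type tag lives in a base class as in the manual-dispatch answer:

```cpp
#include <cstddef>

// Hypothetical types for illustration only.
struct Payload { double a, b, c; };        // the 24 bytes of real data

struct TagBase { char type; };             // one-byte type identifier
struct TaggedB : TagBase { double a, b, c; };

// Bytes the tag really costs per object once padding is included.
// With 8-byte-aligned doubles this is 8, giving sizeof(TaggedB) == 32.
constexpr std::size_t tag_cost = sizeof(TaggedB) - sizeof(Payload);
```

So on a platform where double is 8-byte aligned, the single char ends up costing as much as a 64-bit vptr.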

Also, don't forget the overhead of a general heap for so many small objects. You may have another 8 bytes per object. With a custom heap manager - such as an object/size specific pool allocator - you can save more here and employ a standard solution.

peterchen
+1  A: 

You're adding a single pointer to a vtable to each object; if you add several new virtual functions, the size of each object will not increase. Note that even if you're on a 32-bit platform where pointers are 4 bytes, you're seeing the size of the object increase by 8, probably due to the overall alignment requirements of the structure (i.e., you're getting 4 bytes of padding).

So even if you made the class non-virtual, adding a single char member would likely add a full 8 bytes to the size of each object.

I think that the only ways you'll be able to reduce the size of your objects would be to:

  • make them non-virtual (do you really need polymorphic behavior?)
  • use floats instead of double for one or more data members if you don't need the precision
  • if you're likely to see many objects with the same values for the data members, you might be able to save on memory space in exchange for some complexity in managing the objects by using the Flyweight design pattern
Michael Burr
A: 

Not an answer to the question directly, but also consider that the declaration order of your data members can increase or decrease your real memory consumption per class object. This is because most compilers can't (read: don't) optimize the order in which class members are laid out in memory to decrease internal fragmentation due to alignment woes.
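A minimal sketch of the declaration-order effect, with hypothetical struct names. The exact sizes depend on the platform's alignment rules, so treat the numbers in the comments as typical rather than guaranteed.

```cpp
// Hypothetical types for illustration only.

// Small members separated by a large one: padding after each char.
struct Scattered {
    char   flag1;
    double value;
    char   flag2;
};   // typically 24 bytes with 8-byte-aligned doubles

// Same members, largest first: the two chars share one padded tail.
struct Grouped {
    double value;
    char   flag1;
    char   flag2;
};   // typically 16 bytes with 8-byte-aligned doubles

constexpr bool grouped_is_smaller = sizeof(Grouped) < sizeof(Scattered);
```

Ordering members from largest to smallest alignment is a common rule of thumb for avoiding this kind of waste.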

Chris
I'd say that they may not do it. C++ allows pointer comparisons between addresses of data members of the same object (with some restrictions), requiring the results of such comparisons to reflect the order of declaration. I'd say that any kind of "compiler magic" necessary to support this requirement with rearranged data members would prove to be too expensive.
AndreyT
I'd counter that any two pointers are comparable arithmetically, but C++'s typing sometimes requires that you cast. Either way, the "compiler magic" that would go into such a system would be calculated to minimize the internal fragmentation of the class objects as a whole, not done on a per-instance basis. This is all assuming that I haven't misunderstood your comment.
Chris
+1  A: 

If you're going to have millions of these things, and memory is a serious concern for you, then you probably ought not to make them objects. Just declare them as a struct or an array of 3 doubles (or whatever), and put the functions to manipulate the data somewhere else.

If you really need the polymorphic behavior, you probably can't win, since the type information you'd have to store in your struct will end up taking up a similar amount of space...

Is it likely that you'll have large groups of objects all of the same type? In that case, you could put the type information one level "up" from the individual "A" classes...

Something like:

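#include <array>
#include <vector>

class A_collection
{
    public:
        virtual void update() = 0;
};

class B_collection : public A_collection
{
    public:
        void update() { /* stuff goes in here... */ }

    private:
        // std::array instead of double[3]: a raw array is not a usable vector element type
        std::vector<std::array<double, 3> > points;
};

class C_collection : public A_collection { /* Same kind of thing as B_collection, but with different update function/data members */ };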
Mark Bessey
+1  A: 

Given all the answers that are already here, I think I must be crazy, but this seems right to me so I'm posting it anyways. When I first saw your code example, I thought you were slicing the instances of B and C, but then I looked a little closer. I'm now reasonably sure your example won't compile at all, but I don't have a compiler on this box to test.

A * array = new A[1000];
array[0] = new B();
array[1] = new C();

To me, this looks like the first line allocates an array of 1000 A. The subsequent two lines operate on the first and second elements of that array, respectively, which are instances of A, not pointers to A. Thus you cannot assign a pointer to A to those elements (and new B() returns such a pointer). The types are not the same, thus it should fail at compile time (unless A has an assignment operator that takes an A*, in which case it will do whatever you told it to do).
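The type mismatch can be checked without even compiling the offending lines. This is only a sketch: it uses std::is_assignable from <type_traits>, and the constant names are made up.

```cpp
#include <type_traits>

struct A { virtual void update() = 0; virtual ~A() {} };
struct B : A { void update() override {} };

// array[0] in the question has type A (an object), so the question is
// whether an A can be assigned from a B*. It cannot.
constexpr bool element_accepts_pointer = std::is_assignable<A&, B*>::value;

// With an array of A* each slot is a pointer, and B* converts to A*.
constexpr bool slot_accepts_pointer = std::is_assignable<A*&, B*>::value;

// Separately, A is abstract here, so an array of A objects cannot be created at all.
constexpr bool a_is_abstract = std::is_abstract<A>::value;
```

Here element_accepts_pointer comes out false and slot_accepts_pointer true, which matches the reasoning above.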

So, am I entirely off base? I look forward to finding out what I missed.

rmeador
+3  A: 

Moving away from the non issue of the vtable pointer in your object:

Your code has other problems:

A * array = new A[1000];
array[0] = new B();
array[1] = new C();

The problem you are having is the slicing problem.
You cannot put an object of class B into a space reserved for an object of class A.
You will just slice the B (or C) part of the object clean off, leaving you with just the A part.

What you want is an array of A pointers, so that it holds each item by pointer.

A** array = new A*[1000];
array[0]  = new B();
array[1]  = new C();

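Now you have another problem: destruction. OK, this could go on for ages.
Short answer: use boost::ptr_vector<>.

boost::ptr_vector<A>  array(1000);   // reserves room for 1000 elements; the container starts empty
array.push_back(new B());            // the container takes ownership and deletes the objects for you
array.push_back(new C());

Never allocate an array like that unless you have to (it's too Java-like to be useful).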
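If Boost is not available, a standard-library alternative (not what this answer proposes, just a comparable sketch) is a std::vector of std::unique_ptr. The int return value of update() and the update_all() helper are assumptions added only so the example has something observable.

```cpp
#include <cassert>
#include <memory>
#include <vector>

class A {
public:
    virtual ~A() {}              // virtual, so deleting through A* is well defined
    virtual int update() = 0;    // returns int only to make the demo observable
};

class B : public A { public: int update() override { return 1; } };
class C : public A { public: int update() override { return 2; } };

// Hypothetical helper: fills a container and calls update() on each element.
int update_all()
{
    std::vector<std::unique_ptr<A>> items;
    items.reserve(1000);
    items.push_back(std::unique_ptr<A>(new B()));
    items.push_back(std::unique_ptr<A>(new C()));

    int total = 0;
    for (const auto& item : items) {
        assert(item != nullptr);
        total += item->update();   // virtual dispatch picks B::update or C::update
    }
    return total;                  // every element is deleted when items goes out of scope
}
```

Each element still carries its vptr, plus one pointer in the vector and whatever the heap adds per allocation, so this fixes ownership rather than memory use.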

Martin York
Strictly speaking, the original version won't even compile. It attempts to assign *pointer* values to array elements, when array elements are not pointers. It is too early to claim that it has "slicing problem", until the code becomes compilable.
AndreyT
@AndreyT: True. Jumped the gun.
Martin York
A: 

As others already said, in a typical implementation, once a class becomes polymorphic each instance grows by the size of an ordinary data pointer. It doesn't matter how many virtual functions you have in your class. On a 64-bit platform the size would increase by 8 bytes. If you observed 8-byte growth on a 32-bit platform, it could have been caused by padding added to the 4-byte pointer for alignment (if your class has an 8-byte alignment requirement).

Additionally, it is probably worth noting that virtual inheritance can inject extra data pointers into class instances (virtual base pointers). I'm only familiar with a few implementations and in at least one the number of virtual base pointers was the same as the number of virtual bases in the class, meaning that virtual inheritance can potentially add multiple internal data pointers to each instance.
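A quick way to see this on a given compiler is to compare ordinary and virtual inheritance from the same base. The names below are hypothetical, and how much space is added depends on the implementation.

```cpp
// Hypothetical types for illustration only.
struct Base { int x; };

struct Ordinary : Base         { int y; };   // Base is laid out inline, no extra pointer
struct Virtual  : virtual Base { int y; };   // needs bookkeeping to locate the virtual base

// True when the virtual base makes each instance bigger.
constexpr bool virtual_base_costs_space = sizeof(Virtual) > sizeof(Ordinary);
```

Note that Virtual has no virtual functions at all, yet on common implementations it still carries a hidden pointer.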

AndreyT
A: 

If you really want to save the memory of the virtual table pointer in each object, you can implement the code C-style.

E.g.

struct Point2D { int x, y; };

struct Point3D { int x, y, z; };

void Draw2D(void *pThis)
{
    Point2D *p = (Point2D *) pThis;
    // do something with p
}

void Draw3D(void *pThis)
{
    Point3D *p = (Point3D *) pThis;
    // do something with p
}

int main()
{
    // array of two pointers to functions taking void*
    typedef void (*pDrawFunct[2])(void *);

    pDrawFunct p;
    Point2D pt2D;
    Point3D pt3D;

    p[0] = &Draw2D;
    p[1] = &Draw3D;

    p[0](&pt2D); // calls Draw2D
    p[1](&pt3D); // calls Draw3D
    return 0;
}

Ashish