Firstly, I know that writing a class to disk is bad, but you should see some of our other code. D:

My question is: can I write a polymorphic class to disk and then read it in later and not get undefined behaviour? I am going to guess not because of vtables (I think these are generated at runtime and unique to the object?)

I.e.

class A {
public:
    virtual ~A() {}
    virtual void foo() = 0;
};

class B : public A {
public:
    virtual ~B() {}
    virtual void foo() {}
};

A * a = new B;

fwrite( a, 1, sizeof( B ), fp );

delete a;

a = new B;

fread( a, 1, sizeof( B ), fp );

a->foo();

delete a;

Thank you!

A: 

The problem is not the vtable. It is stored per class type, not per instance, so you won't write it to file. Basically your code should work (haven't tried it).

However, you should keep in mind that reading pointers/handles from file does not work.

No, it would not work, as the location of the vtable may change.
Artyom
A: 

I suggest you take a look at Boost Serialization.

we use the term "serialization" to mean the reversible deconstruction of an arbitrary set of C++ data structures to a sequence of bytes. Such a system can be used to reconstitute an equivalent structure in another program context. Depending on the context, this might be used to implement object persistence, remote parameter passing or other facility.

karlphillip
I'll take it from the other comments that it isn't possible. It would be very nice to use Boost, but I doubt it's going to happen.
A: 

You might be able to get away with it if such objects are always read back during the same execution of the program that wrote them (though I really don't recommend it). But if the data in the file must persist between different executions of the program, then using the raw bytes of the in-memory objects will almost certainly lead to significant problems.

Each vtable itself is generated at compile time and stored somewhere in the resulting executable. What each object instance contains is just a pointer to the appropriate vtable, and that pointer does not change for the lifetime of any given object. (Multiple inheritance can be a little more complicated, but for this discussion those details aren't relevant. The pointers are still constant.)
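One rough way to see that per-object pointer is to compare the size of a class with and without a virtual function. This is only a sketch: the names `Plain`, `WithVirtual` and `extra_bytes` are made up for illustration, and the result is an implementation detail, not something the standard promises.

```cpp
#include <cassert>
#include <cstddef>

// Same data member in both types; only one has a virtual function.
struct Plain {
    int value;
};

struct WithVirtual {
    int value;
    virtual ~WithVirtual() {}
};

// Bytes that WithVirtual occupies beyond Plain. On common compilers this
// is the hidden vtable pointer plus any alignment padding.
constexpr std::size_t extra_bytes() {
    return sizeof(WithVirtual) - sizeof(Plain);
}
```

Those extra bytes are part of what `fwrite( a, 1, sizeof( B ), fp )` copies to disk, right alongside the real data members.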

So if an object has a vtable pointer and you write the raw bytes of that object to disk, then the vtable pointer is written to disk as well. If you then read back those bytes during the same execution of the program and push them into an appropriate object, it may work since the vtable will still be in the same location and thus the vtable pointer will still be correct.

(However, note that everything I just explained is an implementation detail. While many compilers implement virtual functions in that manner, I don't think any of the exact details are guaranteed by the C++ standard. So there could be additional potential problems.)
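One thing the standard does pin down is that copying an object as raw bytes is only defined for trivially copyable types, and a class with virtual functions is never trivially copyable. A minimal sketch of checking this at compile time, using the classes from the question (with `public:` added); `PlainRecord` and `safe_for_raw_io` are hypothetical names introduced here:

```cpp
#include <cassert>
#include <type_traits>

class A {
public:
    virtual ~A() {}
    virtual void foo() = 0;
};

class B : public A {
public:
    void foo() override {}
};

// Plain data with no virtual functions, for contrast.
struct PlainRecord {
    int id;
    double value;
};

// Hypothetical helper: true only for types the standard allows to be
// copied byte for byte (memcpy, and by extension fwrite/fread).
template <typename T>
constexpr bool safe_for_raw_io() {
    return std::is_trivially_copyable<T>::value;
}

static_assert(std::is_polymorphic<B>::value, "B has virtual functions");
static_assert(!safe_for_raw_io<B>(), "B must not be written as raw bytes");
static_assert(safe_for_raw_io<PlainRecord>(), "PlainRecord can be");
```

So even when the fread version appears to work, it is relying on compiler behaviour and not on anything the language guarantees.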

Now, if this might be possible, why not store such objects for longer durations? Because you have no guarantee that any particular virtual table will be in the same memory location.

Some operating systems may change the memory layout for each execution of the same program (address space layout randomization, for example). I don't know whether or not this actually affects virtual table locations, but that's certainly a serious risk.

Furthermore, if you ever compile a new version of the program, the location of each virtual table is completely up to the whims of the compiler. Changes to seemingly unrelated parts of the code may cause the compiler to place the relevant virtual tables in different locations. Obviously, that happening would completely break this scheme. And you have no way to prevent it from happening.

(And beyond the vtables, what if new data members need to be added to those objects in subsequent versions of the program? You might have to deal with reading past versions of raw objects' bytes into new versions that have new members or a different layout of members. That can get complicated and ugly as well as error prone.)


Now, even if you only intend to store the objects temporarily during each execution of the program, I still don't think it's a good idea. You are highly restricted as to what kinds of variables these objects can contain. No smart objects (std::string, std::vector, etc). No pointers to memory allocated per object. Any strings must therefore be stored in raw character arrays. Other dynamic allocation would have to be turned into fixed members or member arrays. That means you lose a lot of C++'s benefits everywhere these objects are used.
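To make those restrictions concrete, here is a sketch of the kind of record that can tolerate raw byte I/O within a single run. `RawRecord` and `round_trip` are hypothetical names, and the temporary file comes from `std::tmpfile`. Note that the `static_assert` catches a `std::string` or virtual function being added later, but not a raw pointer member, since pointers are themselves trivially copyable.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>
#include <type_traits>

// Hypothetical record limited to what raw byte I/O can tolerate:
// no virtual functions, no pointers, no std::string or std::vector.
struct RawRecord {
    int id;
    char name[32];     // fixed-size buffer instead of std::string
    double scores[4];  // fixed-size array instead of std::vector
};

// Stops compiling if someone later adds a std::string or a virtual function.
static_assert(std::is_trivially_copyable<RawRecord>::value,
              "RawRecord must stay trivially copyable for raw I/O");

// Writes the record to a temporary file and reads it back in the same run.
// Returns false if any I/O step fails.
bool round_trip(const RawRecord& in, RawRecord& out) {
    std::FILE* fp = std::tmpfile();
    if (!fp) return false;
    bool ok = std::fwrite(&in, sizeof in, 1, fp) == 1;
    std::rewind(fp);
    ok = ok && std::fread(&out, sizeof out, 1, fp) == 1;
    std::fclose(fp);
    return ok;
}
```

Even this only holds up for the same build of the same program; padding, member order and type sizes can all differ between compilers and versions.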

Furthermore, these objects and the scheme of writing this directly to disk would need to be accompanied by comments and documentation warning of all the dangers I've described. Otherwise, some future programmer might unknowingly decide to add the wrong kind of data member. Or even worse, they might decide to try storing such objects longer than the execution of the program, opening them up to serious crashes and failures that might not happen until much later in the future (and probably at the worst possible time).


In the end, I strongly suggest using a scheme that stores the data in a format specifically intended for the file. As someone else already mentioned, Boost Serialization is a good option. If not that, there may be other usable serialization libraries. Or else, depending on your needs, you may be able to roll your own mechanism without too much trouble.
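As a rough idea of what rolling your own might look like, the sketch below writes a type tag followed by each object's fields, and on load constructs the matching class normally so the compiler sets up the vtable pointer itself. Everything here (`Shape`, `Circle`, `Square`, `ShapeTag`, `write_value`, `read_value`, `load`) is a hypothetical example, not code from the question. It writes values in the machine's native byte order, so the file is not portable across architectures.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <memory>

// Hypothetical tags identifying the concrete type stored in the file.
enum class ShapeTag : std::uint8_t { Circle = 1, Square = 2 };

// Helpers that write/read one fixed-width value in native byte order.
template <typename T>
bool write_value(std::FILE* fp, const T& v) {
    return std::fwrite(&v, sizeof v, 1, fp) == 1;
}
template <typename T>
bool read_value(std::FILE* fp, T& v) {
    return std::fread(&v, sizeof v, 1, fp) == 1;
}

class Shape {
public:
    virtual ~Shape() {}
    virtual double area() const = 0;
    // Writes a type tag followed by this object's fields.
    virtual bool save(std::FILE* fp) const = 0;
};

class Circle : public Shape {
public:
    explicit Circle(double r) : radius(r) {}
    double area() const override { return 3.141592653589793 * radius * radius; }
    bool save(std::FILE* fp) const override {
        return write_value(fp, ShapeTag::Circle) && write_value(fp, radius);
    }
    double radius;
};

class Square : public Shape {
public:
    explicit Square(double s) : side(s) {}
    double area() const override { return side * side; }
    bool save(std::FILE* fp) const override {
        return write_value(fp, ShapeTag::Square) && write_value(fp, side);
    }
    double side;
};

// Reads a tag and one field, then constructs the matching type through its
// constructor. Returns null on a read error or an unknown tag.
std::unique_ptr<Shape> load(std::FILE* fp) {
    ShapeTag tag;
    double value;
    if (!read_value(fp, tag) || !read_value(fp, value)) return nullptr;
    switch (tag) {
        case ShapeTag::Circle: return std::unique_ptr<Shape>(new Circle(value));
        case ShapeTag::Square: return std::unique_ptr<Shape>(new Square(value));
    }
    return nullptr;  // unknown tag
}
```

Because no vtable pointer ever reaches the file, this keeps working across separate runs and recompiles. A real format would likely also want a version number in a header so that newer builds can still read older files.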

TheUndeadFish