I am looking hard at the basic principles of storing the state of an executing program to disk, and bringing it back in again. In the current design that we have, each object (which is a C-level thingy with function pointer lists, kind of low-level home-made object-orientation -- and there are very good reasons for doing it this way) will be called to export its explicit state to a writable and restorable format. The key property to make this work is that all state related to an object is indeed encapsulated in the object data structures.
There are other solutions where you work with active objects, where there is a user-level thread attached to some objects. And thus, the program counter, register contents, and stack contents suddenly become part of the program state. As far as I can see, there is no good way to serialize such things to disk at an arbitrary point in time. The threads have to go park themselves in some special state where nothing is represented by the program counter et al, and thus basically "save" their execution state machine state to the explicit object state.
I have looked at a range of serialization libraries, and as far as I can tell this is a universal property.
The core question is this: is that really a universal limitation? Are there save/restore solutions out there that can include thread state, in terms of where in its code a thread is executing?
Note that saving an entire system state in a virtual machine does not count: that is not really serializing the state, just freezing a machine and moving it. It is an obvious solution, but a bit heavyweight most of the time.
Some questions made it clear that I had not explained well enough how we do things. We are working on a simulator system, with very strict rules for how code running inside it is allowed to be written. In particular, we make a complete divide between object construction and object state. The interface function pointers are recreated every time you set up the system, and are not part of the state. The state consists only of specifically appointed "attributes", each of which has a defined get/set function that converts between the internal runtime representation and the storage representation. Pointers between objects are all converted to names. So in our design, an object might come out like this in storage:
    Object foo {
        value1: 0xff00ff00;
        value2: 0x00ffeedd;
        next_guy_in_chain: bar;
    }
    Object bar {
        next_guy_in_chain: null;
    }
Linked lists are never really present in the simulation structure; each object represents a unit of hardware of some kind.
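To make the attribute idea concrete, here is a minimal sketch in C of how such get/set pairs could look. Everything in it is hypothetical and only for illustration: the `sim_object` struct, the `attribute` table, the `registry` used to turn names back into pointers, and the helper names are my assumptions here, not the actual simulator API. For brevity every object gets the same three attributes, and the output is written on one line using the same attribute syntax as the example above.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical runtime representation of one simulated hardware unit. */
typedef struct sim_object {
    const char *name;
    uint32_t value1;
    uint32_t value2;
    struct sim_object *next_guy_in_chain; /* runtime pointer, stored as a name */
} sim_object;

/* Resolves a stored object name to a runtime pointer (NULL if unknown). */
typedef sim_object *(*lookup_fn)(const char *name);

/* One attribute: get writes the storage form, set parses it back. */
typedef struct {
    const char *name;
    void (*get)(const sim_object *o, char *buf, size_t len);
    void (*set)(sim_object *o, const char *text, lookup_fn lookup);
} attribute;

static void get_value1(const sim_object *o, char *buf, size_t len) {
    snprintf(buf, len, "0x%08lx", (unsigned long)o->value1);
}
static void set_value1(sim_object *o, const char *text, lookup_fn lookup) {
    (void)lookup;
    o->value1 = (uint32_t)strtoul(text, NULL, 16); /* base 16 accepts "0x" */
}
static void get_value2(const sim_object *o, char *buf, size_t len) {
    snprintf(buf, len, "0x%08lx", (unsigned long)o->value2);
}
static void set_value2(sim_object *o, const char *text, lookup_fn lookup) {
    (void)lookup;
    o->value2 = (uint32_t)strtoul(text, NULL, 16);
}
static void get_next(const sim_object *o, char *buf, size_t len) {
    snprintf(buf, len, "%s",
             o->next_guy_in_chain ? o->next_guy_in_chain->name : "null");
}
static void set_next(sim_object *o, const char *text, lookup_fn lookup) {
    o->next_guy_in_chain = strcmp(text, "null") == 0 ? NULL : lookup(text);
}

static const attribute attributes[] = {
    {"value1", get_value1, set_value1},
    {"value2", get_value2, set_value2},
    {"next_guy_in_chain", get_next, set_next},
};
#define N_ATTRS (sizeof attributes / sizeof attributes[0])

/* Hypothetical name registry, filled in when the objects are constructed. */
static sim_object *registry[8];
static size_t registry_count;

static sim_object *lookup_by_name(const char *name) {
    for (size_t i = 0; i < registry_count; i++)
        if (strcmp(registry[i]->name, name) == 0)
            return registry[i];
    return NULL;
}

/* Writes "Object <name> { attr: value; ... }" into out. */
static size_t save_object(const sim_object *o, char *out, size_t len) {
    assert(o != NULL && out != NULL);
    size_t used = (size_t)snprintf(out, len, "Object %s {", o->name);
    for (size_t i = 0; i < N_ATTRS && used < len; i++) {
        char val[64];
        attributes[i].get(o, val, sizeof val);
        used += (size_t)snprintf(out + used, len - used, " %s: %s;",
                                 attributes[i].name, val);
    }
    if (used < len)
        used += (size_t)snprintf(out + used, len - used, " }");
    return used;
}

/* Restores one attribute by name; returns 0 on success, -1 if unknown. */
static int restore_attribute(sim_object *o, const char *attr,
                             const char *text, lookup_fn lookup) {
    for (size_t i = 0; i < N_ATTRS; i++) {
        if (strcmp(attributes[i].name, attr) == 0) {
            attributes[i].set(o, text, lookup);
            return 0;
        }
    }
    return -1;
}
```

The point of the sketch is that nothing in the stored form depends on addresses or function pointers: the objects are constructed first (which fills the registry), and only then are attributes set, so `next_guy_in_chain: bar` can be resolved by name.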
The problem is that some people want to do this, but also have threads as a way to code behavior. "Behavior" here is really mutation of the state of the simulation units. Basically, the design we have says that all such changes have to be made in complete, atomic operations that are called, do their work, and return. All state is stored in the objects. You have a reactive model, which could also be called "run to completion" or "event driven".
The other way of thinking about this is to have objects with active threads working on them, which sit in an eternal loop in the same way as classic Unix threads and never terminate. This is the case that I am trying to find out whether it can reasonably be stored to disk, but it does not seem feasible without interposing a VM underneath.
Update, October 2009: A paper related to this was published at the FDL conference in 2009, see http://www.engbloms.se/jakob%5Fpublications.html, the paper about checkpointing and SystemC.