views:

303

answers:

6

Why isn't every type of object implicitly serializable?

In my limited understanding, are objects not simply stored on the heap and pointers to them on the stack?

Shouldn't you be able to traverse them programmatically, store them in a universal format, and reconstruct them from there?

+15  A: 

Some objects encapsulate resources like file pointers or network sockets that can't be deserialized to the state they were in when you serialized the object that contained them.

Example: you shouldn't deserialize an object that serves as an authenticated database connection, because to do so, you'd need the serialized form to contain a plaintext password. That would be bad practice, because someone might get hold of the saved serialized form. When you deserialize, you also have no idea whether the database server is still running, whether it can be reached, whether the authentication credentials are still valid, etc.
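To illustrate, here is a minimal Python sketch using the standard `pickle` module. The `try_pickle` helper and the sample dictionary are made up for this example; an open file stands in for any object that wraps an OS resource. Plain data pickles fine, while the open file is refused.

```python
import os
import pickle

def try_pickle(obj):
    """Return the pickled bytes, or a short message if pickling is refused."""
    try:
        return pickle.dumps(obj)
    except (TypeError, pickle.PicklingError) as exc:
        return f"refused: {exc}"

# Plain data that the program owns outright serializes without trouble.
plain_result = try_pickle({"host": "db.example", "port": 5432})

# An open file wraps an OS-level handle, so pickle declines to serialize it.
with open(os.devnull) as handle:
    handle_result = try_pickle(handle)

print(type(plain_result).__name__)
print(handle_result)
```

The refusal is deliberate: the number behind the handle would mean nothing in another process or after a restart.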

Bill Karwin
+3  A: 

No, because sometimes you don't have all the information in the place where you reconstruct them. Remember that you may not be reconstructing the object in the same context you had it in; it may be on a different machine or even in a different language.

Noon Silk
+4  A: 

Even if you only consider objects that don't include OS state, the problem is harder than it looks at first glance. The graph may have cycles. Entities may be referenced from multiple top-level entities.

I tried to outline a universal serialization library in C in a previous answer, and found that there are some hard cases.
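As a rough illustration of the cycle and shared-reference cases, here is a small Python sketch. The `Node` class and the `dump_graph` / `load_graph` functions are hypothetical names invented for this example, not part of any library. The idea is to give each object an index the first time it is seen and to store links as indices, so cycles terminate and a shared object is written only once.

```python
import json

class Node:
    def __init__(self, name):
        self.name = name
        self.links = []  # references to other Node objects

def dump_graph(root):
    """Flatten every node reachable from root into a JSON list of records."""
    index = {}    # id(node) -> position of that node's record
    nodes = []    # nodes in the order they were first visited
    stack = [root]
    while stack:
        node = stack.pop()
        if id(node) in index:
            continue  # already recorded: this is what stops cycles
        index[id(node)] = len(nodes)
        nodes.append(node)
        stack.extend(node.links)
    # Links are stored as record positions rather than nested objects.
    records = [
        {"name": n.name, "links": [index[id(t)] for t in n.links]}
        for n in nodes
    ]
    return json.dumps(records)

def load_graph(text):
    """Rebuild the graph; the root is always record 0."""
    records = json.loads(text)
    nodes = [Node(r["name"]) for r in records]
    for node, record in zip(nodes, records):
        node.links = [nodes[i] for i in record["links"]]
    return nodes[0]

a, b, shared = Node("a"), Node("b"), Node("shared")
a.links = [b, shared]
b.links = [a, shared]  # a <-> b is a cycle; shared is referenced twice

copy = load_graph(dump_graph(a))
```

Even this toy version makes choices: it keeps only `name` and `links`, and it always rebuilds plain `Node` objects. A different application might need different choices.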

dmckee
Right, there can be cycles and so on, but these are solved problems. Contrast this with the meaninglessness of serializing a window handle or a file handle.
Steven Sudit
@Steven Sudit: I would characterize them as "known to be solvable" rather than "solved". There are decisions that have to be made: trade-offs between how much you serialize now, and how identical the deserialized structure is later. Different problems may call for different choices. So the problems have solutions, but no single solution will do for all cases. That's what makes the problem hard.
dmckee
The problem is hard, but others have already solved it for us. In .NET, for example, there's a perfectly workable serialization system that handles cycles without a hitch. There are comparable solutions on other platforms as well. However, what none of them can ever do is serialize things like handles. This is a deeper issue.
Steven Sudit
@Steven Sudit: You may have missed my point. There *are* trade-offs to be made in constructing a general serializer. It's fine that the .NET people or the ROOT folks have built them. But what if my situation calls for a different set of trade-offs? The problem is always solvable, but no single solution is always applicable.
dmckee
@dmckee: I think we may be talking past each other, rather than disagreeing. I don't deny that there are issues and complexities to general serialization. My point was that these are, both in principle and in practice, solvable, or at least resolvable. On the other hand, there are types of resources for which serialization is simply not meaningful, and this will never change. We will never be able to use Hibernate or XmlSerializer or whatever on a mutex handle, because its value is meaningless once the mutex goes away.
Steven Sudit
A: 

Technically, any object in a memory space can be serialized and persisted to a durable medium like a hard drive. After all, most OSes page active memory to and from disk, and many also have a hibernate-style feature. The problem is one of scope. For example, when you create a string object in your memory space, it's yours to serialize and deserialize as you see fit. When you open a file, the OS gives you a file handle, but the OS still owns the file system containing the actual file object you have a handle to. A file system driver, on the other hand, has to maintain a persistent database of file handles and other file-related metadata.

Joe Caffeine
+1  A: 

How much sense would it make to serialize an object that contains a network connection and is responsible for streaming data back from a web server?

And what about deserializing it? How would that work? Should it reopen the connection and re-download the file?

Lasse V. Karlsen
A: 

You are right in your assumptions, in a way.

It must be possible to partition the set of all objects in the program into three groups:

1) You have complete information, which allows complete deconstruction and reconstruction of the object. Arrays of numbers or strings and plain structs are good examples.

2) You have construction information. You can reconstruct the object by calling external code. A file is a good example, but it requires that your program has a file abstraction that remembers the construction and state parameters. We can, for example, save the path to the file and the position in the file. However, reconstruction might fail (for example, if the file was deleted or changed).

3) You have no construction information; the object was simply received from somewhere.

Here, to be able to serialize the objects completely, we have to go from 3) to 2) to 1). Objects in 3) can be attributes of an object of type 2), and can be retrieved by successfully reconstructing a type 2) object.

A type 2) object, however, must be reconstructed from serialized construction information alone, and that information has to be of type 1): numbers and strings, true data.
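A type 2) object could be sketched in Python roughly like this. `TrackedFile` is a hypothetical wrapper written for this example; it relies on the standard `pickle` hooks `__getstate__` and `__setstate__`. Only the path and the position (type 1 data) are saved, and the file is reopened on load, which fails if the file is gone.

```python
import os
import pickle
import tempfile

class TrackedFile:
    """Hypothetical file wrapper that remembers how it was constructed."""

    def __init__(self, path, position=0):
        self.path = path
        self._open(position)

    def _open(self, position):
        # Raises OSError if the file no longer exists or cannot be opened.
        self._handle = open(self.path, "rb")
        self._handle.seek(position)

    def read(self, size=-1):
        return self._handle.read(size)

    def close(self):
        self._handle.close()

    def __getstate__(self):
        # Save construction information only: a string and an integer.
        return {"path": self.path, "position": self._handle.tell()}

    def __setstate__(self, state):
        # Reconstruct by calling external code (the OS) with the saved data.
        self.path = state["path"]
        self._open(state["position"])

# Create a small file to work with.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello world")
path = tmp.name

original = TrackedFile(path)
original.read(6)                 # consume b"hello "
saved = pickle.dumps(original)   # stores the path and position 6
original.close()

restored = pickle.loads(saved)
rest = restored.read()           # continues from the saved position
restored.close()

# Once the file is deleted, the same saved bytes can no longer be restored.
os.remove(path)
try:
    pickle.loads(saved)
    reconstruction_failed = False
except OSError:
    reconstruction_failed = True
```

The caller still has to decide what to do when reconstruction fails; the sketch just lets the error propagate.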

This whole scheme seems costly, since if we want to reconstruct the whole program state, we have to work with abstractions that encapsulate objects of type 2). We also have to know what to do when an object cannot be reconstructed. Finally, we must be sure that we don't mix these types: objects of type 3) or 2) must not end up where we expect to collect only objects of type 1).

kaizer.se