We use BinaryFormatter in a C# game to save user game progress, game levels, etc. We are running into problems with backwards compatibility.

The aims:

  • Level designer creates a campaign (levels & rules), we change the code, and the campaign should still work fine. This can happen every day during development before release.
  • User saves a game, we release a game patch, and the user should still be able to load the game.
  • The invisible data-conversion process should work no matter how far apart the two versions are. For example, a user can skip our first 5 minor updates and get the 6th directly. Their saved games should still load fine.

The solution needs to be completely invisible to users and level designers, and minimally burden coders who want to change something (e.g. rename a field because they thought of a better name).

Some object graphs we serialize are rooted in one class, some in others. Forward compatibility is not needed.

Potentially breaking changes (and what happens when we serialize the old version and deserialize into the new):

  • add field (gets default-initialized)
  • change field type (failure)
  • rename field (equivalent to removing it and adding a new one)
  • change property to field and back (equivalent to a rename)
  • change auto-implemented property to use a backing field (equivalent to a rename)
  • add superclass (equivalent to adding its fields to the current class)
  • interpret a field differently (e.g. was in degrees, now in radians)
  • for types implementing ISerializable we may change our implementation of the ISerializable methods (e.g. start using compression within the ISerializable implementation for some really large type)
  • rename a class, rename an enum value
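For the "add field" case in the list above, the version-tolerant serialization attributes in System.Runtime.Serialization can supply a meaningful default instead of a zeroed value. The following is a minimal sketch; the Turret class, its fields, and the default of 10 are hypothetical and only illustrate the pattern.

```csharp
using System;
using System.Runtime.Serialization;

[Serializable]
public class Turret   // hypothetical game class, for illustration only
{
    public string Name = "turret";

    // Added in a later version of the class. [OptionalField] marks the field
    // as allowed to be absent from streams written by older versions.
    [OptionalField(VersionAdded = 2)]
    public float Range;

    // Invoked by the formatter before the fields are populated. Constructors
    // and field initializers do not run during deserialization, so defaults
    // for newly added fields are assigned here.
    [OnDeserializing]
    internal void SetDefaults(StreamingContext context)
    {
        Range = 10f;
    }
}
```

If the old stream contains no Range value, the field keeps the default assigned in SetDefaults; if a newer stream does contain it, the deserialized value overwrites the default.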

I have read about:

My current solution:

  • We make as many changes as possible non-breaking, by using stuff like the OnDeserializing callback.
  • We schedule breaking changes for once every 2 weeks, so there's less compatibility code to keep around.
  • Every time before we make a breaking change, we copy all the [Serializable] classes we use into a namespace/folder called OldClassVersions.VersionX (where X is the next ordinal number after the last one). We do this even if we aren't going to make a release soon.
  • When writing to file, what we serialize is an instance of this class: class SaveFileData { int version; object data; }
  • When reading from file, we deserialize the SaveFileData and pass it to an iterative "update" routine that does something like this:

for(int i = loadedData.version; i < CurrentVersion; i++)
{
    // Update() takes an instance of OldVersions.VersionX.TheClass
    // and returns an instance of OldVersions.VersionXPlus1.TheClass
    loadedData.data = Update(loadedData.data, i);
}
  • For convenience, the Update() function can use a CopyOverlappingPart() helper in its implementation, which uses reflection to copy as much data as possible from the old version to the new version. This way, Update() only has to handle the parts that actually changed.
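A reflection-based CopyOverlappingPart() along the lines described above could look roughly like this. It is a sketch, not the actual helper from our code base: the PlayerV1/PlayerV2 classes and the degrees-to-radians change are made up for the example, and the matching rule (same field name, assignable type) is an assumption.

```csharp
using System;
using System.Reflection;

// Hypothetical old and new versions of the same class.
public class PlayerV1 { public string Name; public float AngleDegrees; }
public class PlayerV2 { public string Name; public float AngleRadians; }

public static class Upgrader
{
    // Copies every instance field that exists on both types under the same
    // name and whose old type is assignable to the new type. Other fields
    // are left at their defaults for Update() to fill in.
    public static TTarget CopyOverlappingPart<TTarget>(object source)
        where TTarget : class, new()
    {
        const BindingFlags flags =
            BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic;

        var target = new TTarget();
        foreach (FieldInfo oldField in source.GetType().GetFields(flags))
        {
            var newField = typeof(TTarget).GetField(oldField.Name, flags);
            if (newField != null &&
                newField.FieldType.IsAssignableFrom(oldField.FieldType))
            {
                newField.SetValue(target, oldField.GetValue(source));
            }
        }
        return target;
    }

    // One upgrade step: copy what overlaps, then handle only what changed.
    public static PlayerV2 Update(PlayerV1 old)
    {
        PlayerV2 updated = CopyOverlappingPart<PlayerV2>(old);
        updated.AngleRadians = old.AngleDegrees * (float)Math.PI / 180f;
        return updated;
    }
}
```

Two limitations of this sketch: the copy is shallow, so nested objects of old-version types are carried over by reference rather than converted, and private fields declared on base classes are not visited by GetFields on the derived type.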

Some problems with that:

  • the deserializer deserializes to class Foo rather than to class OldClassVersions.Version5.Foo - because class Foo is what was serialized.
  • almost impossible to test or debug
  • requires keeping old copies of a lot of classes around, which is error-prone, fragile and annoying
  • I don't know what to do when we want to rename a class

This should be a really common problem. How do people usually solve it?

+1  A: 

We had the same problem in our application when storing user profile data (grid column arrangement, filter settings ...).

In our case the problem was the AssemblyVersion.

To solve this I created a SerializationBinder that reads the current version of our assemblies (every assembly gets a new version number on each deployment) via Assembly.GetExecutingAssembly().GetName().Version.

In the overridden BindToType method, the type info is built using the new assembly version.
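A binder of this kind might look roughly like the sketch below. It is not our production code: the class name CurrentVersionBinder is made up, and instead of rebuilding the assembly-qualified name with the new version number it resolves the type directly in the running assembly, which has the same effect for types that live there.

```csharp
using System;
using System.Reflection;
using System.Runtime.Serialization;

// Hypothetical binder that ignores the assembly version stored in the stream.
public sealed class CurrentVersionBinder : SerializationBinder
{
    public override Type BindToType(string assemblyName, string typeName)
    {
        // Keep only the simple name (e.g. "MyGame") and drop the
        // Version/Culture/PublicKeyToken parts written by the old build.
        string simpleName = new AssemblyName(assemblyName).Name;
        Assembly current = Assembly.GetExecutingAssembly();

        if (simpleName == current.GetName().Name)
        {
            // Resolve the type in the assembly that is running now,
            // whatever version it has.
            return current.GetType(typeName, throwOnError: false);
        }

        // Returning null is meant to leave other assemblies to the
        // formatter's default lookup.
        return null;
    }
}
```

It would be attached with formatter.Binder = new CurrentVersionBinder(); before calling Deserialize. Generic type names embed the assembly-qualified names of their type arguments, so they would need extra handling that this sketch omits.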

The deserialization itself is implemented 'by hand', meaning:

  • deserialize via the normal BinaryFormatter
  • get all fields that have to be deserialized (annotated with our own attribute)
  • fill the object with data from the deserialized object

This has worked with all our data for the last three or four releases.

K.Hoffmann
+1  A: 

Tough one. I would dump binary and use XML serialization (easier to manage, and tolerant of changes that are not too extreme, like adding or removing fields). In more extreme cases it is easier to write a transform (XSLT perhaps) from one version to another and keep the classes clean. If opacity and a small disk footprint are requirements, you can try compressing the data before writing it to disk.
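To illustrate the tolerance to added and removed fields, here is a small sketch using XmlSerializer. The SaveGame class and the element names are invented for the example; the point is that an old file with an element the class no longer has, and without an element the class has since gained, can still be loaded.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical current version of the save-file class.
public class SaveGame
{
    public string PlayerName = "unknown";
    public int Level = 1;
    public int Gold = 100;   // added after the old file was written
}

public static class XmlSaveLoader
{
    public static SaveGame Load(string xml)
    {
        var serializer = new XmlSerializer(typeof(SaveGame));
        using (var reader = new StringReader(xml))
        {
            // XmlSerializer constructs the object through its parameterless
            // constructor, so field initializers supply defaults for
            // elements that are missing from the file.
            return (SaveGame)serializer.Deserialize(reader);
        }
    }

    // An old file: it still has the removed <Lives> element and lacks <Gold>.
    public const string OldFile =
        "<SaveGame><PlayerName>Ann</PlayerName><Level>3</Level><Lives>2</Lives></SaveGame>";
}
```

Calling XmlSaveLoader.Load(XmlSaveLoader.OldFile) should skip the unknown Lives element and leave Gold at its initializer value. Renames and type changes are not covered by this tolerance, which is where a transform between versions comes in.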

AZ
BinarySerialization *is* version-tolerant for small changes, including adding/removing fields. What exactly do you mean by "easier to manage"? XSLT sounds like a great solution, actually. And no, file size and performance aren't an issue.
Stefan Monov
By "easier to manage" I'm referring to the human-readable nature of the XML format and to the decoupling of the data representation from the class structure (you can control what the serialized XML looks like through attributes, and the structure does not need to mirror the actual class).
AZ