I am writing a desktop application that can open, edit, and save documents.

Those documents are described by several objects of different types that store references to each other. Of course there is a Document class that serves as the root of this data structure.

The question is how to save this document model into a file.

What I need:

  • Support for recursive structures.
  • It must be able to open files even if they were produced from slightly different classes. My users don't want to recreate every document after every release just because I added a field somewhere.
  • It must deal with classes that are not known at compile time (for plug-in support).

What I have tried so far:

  • XmlSerializer -> Fails the first and last criteria.
  • BinaryFormatter -> Fails the second criterion.

  • DataContractSerializer: Similar to XmlSerializer but with support for cyclic (recursive) references. Also it was designed with (forward/backward) compatibility in mind: Data Contract Versioning. [edit]

  • NetDataContractSerializer: While the DataContractSerializer still needs to know all types in advance (i.e. it doesn't work very well with inheritance), NetDataContractSerializer stores type information in the output. Other than that the two seem to be equivalent. [edit]

  • protobuf-net: I haven't had time to experiment with it yet, but it seems similar in function to DataContractSerializer while using a binary format. [edit]

Handling of unknown types [edit]

There seem to be two philosophies about what to do when the static and dynamic types differ (for example, a field of type object that holds, let's say, a Person object). Either way, the dynamic type must somehow be stored in the file.

  • Use different XML tags for different dynamic types. But since the XML tag used for a particular class might not be equal to the class name, it's only possible to go this route if the deserializer knows all possible types in advance (so that it can scan them for attributes).

  • Store the CLR type (class name, assembly name & version) during serialization. Use this info during deserialization to instantiate the right class. The types need not be known prior to deserialization.

The second one is simpler to use, but the resulting file will be CLR-dependent (and more sensitive to code modifications such as renaming a class or assembly). That's probably why XmlSerializer and DataContractSerializer choose the first way. NetDataContractSerializer is not recommended because it uses the second approach (so does BinaryFormatter, by the way).
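To make the first approach concrete, here is a minimal sketch, assuming the host application can collect the plug-in types at run time (for example by scanning loaded plug-in assemblies) and hand them to DataContractSerializer as known types. Slot, PluginShape and KnownTypesDemo are made-up names for illustration only.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.Serialization;

// Hypothetical document class with a member whose static type is object.
[DataContract]
public class Slot
{
    [DataMember] public object Content { get; set; }
}

// Hypothetical class that would live in a plug-in assembly.
[DataContract]
public class PluginShape
{
    [DataMember] public double Radius { get; set; }
}

public static class KnownTypesDemo
{
    // pluginTypes is assumed to be gathered at run time by the host.
    public static Slot RoundTrip(Slot slot, IEnumerable<Type> pluginTypes)
    {
        // The known-types list tells the serializer which dynamic types
        // may appear where the declared type is object or a base class.
        var serializer = new DataContractSerializer(typeof(Slot), pluginTypes);

        using (var stream = new MemoryStream())
        {
            serializer.WriteObject(stream, slot);
            stream.Position = 0;
            return (Slot)serializer.ReadObject(stream);
        }
    }
}
```

The file then contains the data contract name of PluginShape rather than its CLR type name, so the plug-in must be loaded and registered before a document that uses it can be opened.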

Any ideas?

+1  A: 

I would think the XmlSerializer is your best bet. You won't be able to support everything on your requirements list without a bit of work in your Document classes - but the XmlSerializer architecture gives you extensibility points which should allow you to tap into its mechanism deep enough to do just about anything.

Using the IXmlSerializable interface - by implementing that on your classes you want to store - you should be able to do just about anything, really.

The interface exposes basically two methods, ReadXml and WriteXml (plus GetSchema, which can simply return null):

public void WriteXml (XmlWriter writer)
{
    // do what you need to do to write out your XML for this object
}

public void ReadXml (XmlReader reader)
{
    // do what you need to do to read your object from XML
}

Using these two methods, you should be able to capture the necessary state information from just about any object you might want to store, and turn it into XML that can be persisted to disk - and deserialized back into an object when the time comes!

marc_s
OK, but how could I get recursion to work using this interface? Let's say I have some objects which reference each other: a->b->c->d->a
Lawnmower
+1  A: 

XmlSerializer can work for your first criterion; however, you must provide the recursion yourself for objects like the TreeView control.

BinaryFormatter can work for all 3 criteria. If a class changes, you may have to create a conversion tool to convert old-format documents to the new format. Alternatively, recognize an older format, deserialize it with the old class, and then save in the new format, keeping the old class around for a little while.

This will help cover version tolerance which is what I think you're after: MSDN - Version Tolerant Serialization

dboarman
Conversion tools sound like a clean approach. But during development, when changes happen within minutes, it seems like a lot of overhead. Is it possible to have some default behaviour? Like "Ignore fields in the file that don't fit the object" and "Add a default value if a field is missing in the file".
Lawnmower
I know you have attributes like [XmlIgnore]. I'm not sure what the binary counterpart is. You will have to look at how to decorate your class and members here: http://blog.kowalczyk.info/article/Serialization-in-C.html
dboarman
I think the equivalent would be [NonSerialized]. But I don't see how this would help to load a file written by an earlier version of a class.
Lawnmower
+3  A: 

The one you haven't tried is DataContractSerializer. There is a constructor that takes a parameter bool preserveObjectReferences that should handle the first criterion.
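A minimal sketch of how that could look for a cyclic graph. It sets the same option through DataContractSerializerSettings.PreserveObjectReferences (available since .NET 4.5) instead of the long constructor overload; Node and CycleDemo are made-up names for illustration.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;

// Hypothetical node type; Next may point back to an earlier node.
[DataContract]
public class Node
{
    [DataMember] public string Name { get; set; }
    [DataMember] public Node Next { get; set; }
}

public static class CycleDemo
{
    public static Node RoundTrip(Node root)
    {
        // With PreserveObjectReferences enabled, each object is written once
        // and later occurrences are emitted as references to it.
        var settings = new DataContractSerializerSettings
        {
            PreserveObjectReferences = true
        };
        var serializer = new DataContractSerializer(typeof(Node), settings);

        using (var stream = new MemoryStream())
        {
            serializer.WriteObject(stream, root);
            stream.Position = 0;
            return (Node)serializer.ReadObject(stream);
        }
    }
}
```

After a round trip of a->b->a, the copy of b should point back to the same copy of a rather than to a duplicate. Marking the class with [DataContract(IsReference = true)] is another way to get reference tracking per type.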

DW
NetDataContractSerializer fits all three requirements. But the last requirement is probably a bad idea, since it somewhat contradicts the second one (I've added a 'Handling of unknown types' section about this). The DataContractSerializer seems to be the best choice.
Lawnmower
+3  A: 

The WCF data contract serializer is probably closest to your needs, although not perfect.

There is only limited support for backwards compatibility (i.e. whether old versions of the program can read documents generated with a newer version). New fields are supported (via IExtensibleDataObject), but new classes or new enum values are not.
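As a rough sketch of the new-field case: PersonV1 and PersonV2 below are made-up stand-ins for the same class in two releases of the program, mapped to one contract name so that each can read the other's output. The namespace URI is a placeholder.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;

// Hypothetical "old release" of the class.
[DataContract(Name = "Person", Namespace = "urn:example:docs")]
public class PersonV1 : IExtensibleDataObject
{
    [DataMember] public string Name { get; set; }

    // Holds members found in the file that this version does not know.
    public ExtensionDataObject ExtensionData { get; set; }
}

// Hypothetical "new release" that added an optional field.
[DataContract(Name = "Person", Namespace = "urn:example:docs")]
public class PersonV2 : IExtensibleDataObject
{
    [DataMember] public string Name { get; set; }

    // Added later; not required, and ordered after the original members.
    [DataMember(IsRequired = false, Order = 1)] public string Email { get; set; }

    public ExtensionDataObject ExtensionData { get; set; }
}

public static class VersionDemo
{
    public static byte[] Write<T>(T value)
    {
        var serializer = new DataContractSerializer(typeof(T));
        using (var stream = new MemoryStream())
        {
            serializer.WriteObject(stream, value);
            return stream.ToArray();
        }
    }

    public static T Read<T>(byte[] data)
    {
        var serializer = new DataContractSerializer(typeof(T));
        using (var stream = new MemoryStream(data))
        {
            return (T)serializer.ReadObject(stream);
        }
    }
}
```

Reading a V1 file as PersonV2 leaves Email at its default (null). Reading a V2 file as PersonV1 and saving it again should carry the unknown Email member along in ExtensionData, so the newer program can still find it afterwards.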

oefe