I have an application which saves documents (think Word documents) in an XML-based format. Currently, C# classes generated from xsd files are used for reading and writing the document format, and all was well until recently, when I had to make a change to the format of the document. My concern is with backwards compatibility: future versions of my application need to be able to read documents saved by all previous versions, and ideally I also want older versions of my app to gracefully handle reading documents saved by future versions.
For example, suppose I change the schema of my document to add an (optional) extra element somewhere. Older versions of my application will simply ignore the extra element and there will be no problems:
<doc>
<!-- Existing document -->
<myElement>Hello World!</myElement>
</doc>
However, if a breaking change is made (for example, an attribute is changed into an element or into a collection of elements), then past versions of my app should either ignore the element if it is optional, or otherwise inform the user that they are attempting to read a document saved by a newer version of my app. This is also currently causing me headaches, as all future versions of my app need entirely separate code for reading the two different document formats.
An example of such a change would be the following xml:
<doc>
<!-- Existing document -->
<someElement contents="12" />
</doc>
Changing to:
<doc>
<!-- Existing document -->
<someElement>
<contents>12</contents>
<contents>13</contents>
</someElement>
</doc>
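To illustrate the kind of dual-format handling I would like to avoid duplicating everywhere, here is a rough sketch of a reader that accepts both shapes of someElement. It uses LINQ to XML rather than my xsd-generated classes, and the names ContentsReader and ReadContents are made up for this example:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class ContentsReader
{
    // Returns the values stored under <someElement>, whether the document
    // uses the old attribute form or the new child-element form.
    public static List<int> ReadContents(XElement doc)
    {
        XElement someElement = doc.Element("someElement");
        if (someElement == null)
            return new List<int>();

        // Old format: <someElement contents="12" />
        XAttribute attr = someElement.Attribute("contents");
        if (attr != null)
            return new List<int> { (int)attr };

        // New format: one <contents> child element per value
        return someElement.Elements("contents").Select(e => (int)e).ToList();
    }
}
```

This works for a single element, but repeating this branching for every breaking change is exactly what I am worried about.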
In order to prevent support headaches in the future I wanted to come up with a decent strategy for handling changes I might make in the future, so that versions of my app that I release now are going to be able to cope with these changes in the future:
- Should the "version number" of the document be stored in the document itself, and if so, what versioning strategy should be used? Should the document version match the .exe assembly version, or should a more complex strategy be used (for example, major revision changes indicate breaking changes, whereas minor revision increments indicate non-breaking changes such as extra optional elements)?
- What method should I use to read the document itself and how do I avoid replicating massive amounts of code for different versions of documents?
- Although XPath is obviously the most flexible option, it is a lot more work to implement than simply generating classes with xsd.
- On the other hand if DOM parsing is used then a new copy of the document xsd would be needed in source control for each breaking change, causing problems if fixes ever need to be applied to older schemas (old versions of the app are still supported).
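To make the major/minor idea from the first bullet concrete, this is roughly what I have in mind. It is only a sketch: the formatVersion attribute on the root element, the type names, and the supported version 2.0 are all assumptions for illustration, and documents without the attribute are treated as version 1.0.

```csharp
using System;
using System.Xml.Linq;

public enum Compatibility { Supported, NewerMinor, NewerMajor }

public static class FormatVersionCheck
{
    // Highest document format version this build understands (made-up value).
    public static readonly Version SupportedVersion = new Version(2, 0);

    public static Compatibility Check(XElement root)
    {
        // "formatVersion" is a hypothetical attribute; documents saved before
        // versioning was introduced would lack it, so default to 1.0.
        string raw = (string)root.Attribute("formatVersion") ?? "1.0";
        Version found = Version.Parse(raw);

        // A higher major number signals a breaking change we cannot read.
        if (found.Major > SupportedVersion.Major)
            return Compatibility.NewerMajor;

        // Same major, higher minor: only non-breaking additions are expected.
        if (found.Major == SupportedVersion.Major && found.Minor > SupportedVersion.Minor)
            return Compatibility.NewerMinor;

        return Compatibility.Supported;
    }
}
```

With something like this, NewerMinor would mean "load the document and ignore anything unrecognised", while NewerMajor would mean "tell the user the document was saved by a newer version of the app".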
Also, I've worked all of this out very loosely on the assumption that all changes I make can be split into the two categories of "breaking changes" and "non-breaking changes", but I'm not entirely convinced that this is a safe assumption to make.
Note that I use the term "document" very loosely: the contents don't resemble a document at all!
Thanks for any advice you can offer me.