views:

60

answers:

1

I'm creating a kind of wiki site and have a data structure which consists of a parent entity which has a one to many relationship to a child entity. I'm currently storing this in two tables in the database, with a foreign key relationship from the child entity table to the parent entity table.

I need to version this data structure in my database for the wiki functionality. This would be straight forward if it were just one table, but since it's two it's a bit more tricky. I was thinking of removing the child entity table and storing them as XML data in a field of the parent entity table. I don't need to search by these child entities. Is this a good or bad idea? Are there any better ways to store my data?

+1  A: 

I was thinking of removing the child entity table and storing them as XML data in a field of the parent entity table

Every time you change a parent you will create a copy of the children: if the parent data changes a lot, then you will eat up your storage quite quickly. Also, storing all the child changes within a single parent record will tend to bloat the parent (as having lots of children tends to), which might have implications for the performance of data retrieval.

There are many different ways of storing versioned data, but they fall into two categories:

  • current version in one store, historical versions in another store
  • all versions in a single store

Choosing between these approaches depends on what you want to do with the historical versions. Are they there just for rollback? Or will users be regularly browsing changes?

A further complication which doesn't always apply but which I think might be pertinent here (because of your suggested solution) is the relationship between parent and child changes: when you show a historical version of the parent do you need to show the appropriate versions of the child? And vice versa, if you show an old version of a child, do you need to show the appropriate versions of its parent (and siblings)?

The key thing you need to establish is: what are most users going to do most of the time? Prioritise that. If the most common use case is look at just the current versions of all records then you should go for separate tables for storing the versions.

If you need to record the child versions which applied to a given parent version then you could introduce an additional framework table (parent_id, parent_version, child_id, child_version). Maintaining this table is not too onerous when you only have tables for one parent and one child. Where it gets gnarly is when you have multiple child you need to track.

APC