I found that revision control (RCS) for models is an interesting problem in the context of data persistence. There are several solutions that use the Django ORM to achieve this, such as django-reversion and AuditTrail, each of which proposes its own way to do it.

Here is the model (in Django-model-like format) for which I would like to keep revisions:

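from django.db import models

class Page(models.Model):
    title = models.CharField(max_length=255)    # max_length is an assumed value
    content = models.TextField()
    tags = models.ManyToManyField("Tag")        # Tag and Author are defined elsewhere
    authors = models.ManyToManyField("Author")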
  • Each revision should be annotated with a date, a revision number, a comment and the user who made the modification.

How would you do it in your preferred database (MongoDB, Neo4j, CouchDB, GAE Datastore)?

Please post only one example of RCS models per post.

I'm not asking for complete code (maybe an explanation is enough?), just enough to see how this problem can be tackled in each type of database.

A: 

In CouchDB this is rather straightforward. Every document in the database has an _id and a _rev, so you don't need a separate revision number. What I would probably do is give every item a systemRev field. Its value is the _id of another record in the database that holds the date, comment and user for that revision.

Examples:

item being tracked:

{
     _id: "1231223klkj123",
     _rev: "4-1231223klkj123",
     systemRev: "192hjk8fhkj123",
     foo: "bar",
     fooarray: ["bar1", "bar2", "bar3"]
}

And then create a separate revision record:

{
    _id: "192hjk8fhkj123",
    _rev: "2-192hjk8fhkj123",
    user: "John", 
    comment: "What I did yesterday",
    date: "1/1/2010",
    tags: ["C# edits", "bug fixes"]
}
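
The link between the two records could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: a plain dict stands in for the CouchDB database, and `save_with_revision` is a hypothetical helper, not part of any CouchDB client. The `_rev` field is left out because CouchDB assigns it itself.

```python
import uuid
from datetime import datetime, timezone

# Hypothetical in-memory stand-in for a CouchDB database: maps _id -> document.
db = {}

def save_with_revision(db, item_id, fields, user, comment, tags=()):
    """Store a revision metadata record, then point the tracked item at it."""
    revision_id = uuid.uuid4().hex
    db[revision_id] = {
        "_id": revision_id,
        "user": user,
        "comment": comment,
        "date": datetime.now(timezone.utc).isoformat(),
        "tags": list(tags),
    }
    # Copy the item's fields and add the link to the metadata record.
    item = dict(fields, _id=item_id, systemRev=revision_id)
    db[item_id] = item
    return item

item = save_with_revision(
    db, "page-1", {"foo": "bar"},
    user="John", comment="What I did yesterday", tags=["bug fixes"],
)
meta = db[item["systemRev"]]  # follow the link from the item to its metadata
```

Following `systemRev` from the item gives the user, comment and date of the latest change without storing them on the item itself.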

To me it seems pretty elegant....

Timothy Baldridge
+1  A: 

First of all, if you are using CouchDB, do not use the _rev field.

Why? Old revisions are lost when a database is compacted.

Compaction rewrites the database file, removing outdated document revisions and deleted documents.

CouchDB wiki - Compaction page

There are a couple of possible solutions:

  1. Keep current and old revisions in the same database. Add an extra revision field to determine the difference between current and old revisions.
  2. Store old revisions in a separate database. When a new revision is added to the "current" database, the old revision document can be deleted and inserted into the "revisions" database.
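
As a rough sketch of option 1, assuming a plain dict in place of the CouchDB database, old versions can be copied to documents of their own before the current one is overwritten. The field names `revision`, `is_current` and `page_id`, the `:rev:` id scheme, and the helpers `save_page` and `history` are all assumptions made for illustration.

```python
# Hypothetical in-memory stand-in for a single CouchDB database: _id -> document.
db = {}

def save_page(db, page_id, fields, user, comment):
    """Write a new current version, keeping the previous one as a separate document."""
    current = db.get(page_id)
    revision = 1
    if current is not None:
        revision = current["revision"] + 1
        # Archive the old version under its own _id, so it is an ordinary
        # document rather than an internal _rev that compaction may discard.
        archived_id = f"{page_id}:rev:{current['revision']}"
        db[archived_id] = dict(current, _id=archived_id, is_current=False)
    db[page_id] = dict(
        fields, _id=page_id, page_id=page_id, revision=revision,
        is_current=True, user=user, comment=comment,
    )
    return db[page_id]

def history(db, page_id):
    """Return every stored version of a page, oldest first."""
    docs = [d for d in db.values() if d["page_id"] == page_id]
    return sorted(docs, key=lambda d: d["revision"])

save_page(db, "page-1", {"title": "Draft"}, user="John", comment="first version")
save_page(db, "page-1", {"title": "Final"}, user="John", comment="renamed")
versions = history(db, "page-1")
```

Here the extra `is_current` field is what separates the live document from its archived copies when both sit in the same database.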

Which one is best? It depends on how your data is going to be accessed. If you can query the old revisions independently of the current revisions, then storing the documents in two different databases will give you some performance benefits.
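
A minimal sketch of the two-database variant, again with plain dicts standing in for the "current" and "revisions" databases. The helper name `update_page`, the `revision` and `page_id` fields, and the id scheme for archived documents are assumptions for illustration only.

```python
# Hypothetical in-memory stand-ins for two CouchDB databases: _id -> document.
current_db = {}
revisions_db = {}

def update_page(current_db, revisions_db, page_id, fields):
    """Move the old version into the revisions store, then write the new one."""
    old = current_db.get(page_id)
    revision = 1
    if old is not None:
        revision = old["revision"] + 1
        archived_id = f"{page_id}:{old['revision']}"
        # The archived copy keeps a page_id so its history can be queried later.
        revisions_db[archived_id] = dict(old, _id=archived_id, page_id=page_id)
    current_db[page_id] = dict(fields, _id=page_id, revision=revision)
    return current_db[page_id]

update_page(current_db, revisions_db, "page-1", {"title": "Draft"})
update_page(current_db, revisions_db, "page-1", {"title": "Final"})
```

With this layout, lookups of the current page only ever touch `current_db`, which stays small no matter how long the history in `revisions_db` grows.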

andyuk