I'm involved in a project that, among other things, involves storing edits and changes to a large hierarchical document (HTML-formatted text). We want to include versioning of textual changes and of structural changes.

Currently we're maintaining the tree of document sections in a relational database, but as we start working on how to manage versioning of structural changes, it's clear that we're in danger of having to write a lot of the functionality that a version control system provides.

We don't want to reinvent the wheel. Is it possible to use an existing version control system as the data store, at least for the document itself? Presumably we could do so by writing out new versions to the filesystem and keeping that directory under version control (programmatically doing commits and so forth), but it would be better if we could interact with the repository directly from code.

The VCS that we are most familiar with is Subversion, but I'm not thrilled with how Subversion represents changes to the directory structure -- it would be nice if we could see that a particular revision included moving a section from Chapter 2 to Chapter 6, rather than just seeing a new version of the tree. This sounds more like the way a system like Mercurial handles changes to the structure.

Any advice? Do VCSs have public APIs and so forth? The project is in Java (with Spring), if it matters.

+2  A: 

You can certainly program SCMs via APIs. Check out SVNKit for Java and Subversion, or JGit for Java and Git. Mercurial doesn't appear to offer such an API.

Whatever you do, wrap up your implementation in a suitable API, so you can swap one SCM for another, or maybe bin the concept of an SCM at some stage in the future. It may well be a pragmatic solution to your problem, however, and worthy of more investigation.
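
As a rough illustration of that wrapping advice, here is a minimal sketch. The `DocumentStore` interface, its method names, and the `InMemoryDocumentStore` class are all hypothetical and not taken from SVNKit, JGit, or any other library; an SCM-backed implementation would sit behind the same interface.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical storage abstraction; names are illustrative, not from any library. */
interface DocumentStore {
    /** Saves a snapshot of all sections (path -> HTML) and returns a revision id. */
    String commit(Map<String, String> sections, String message);

    /** Returns the content of a section at a revision, or null if it is absent. */
    String read(String revisionId, String sectionPath);

    /** Returns all revision ids, oldest first. */
    List<String> history();
}

/** In-memory stand-in for tests; a real implementation might delegate to an SCM library. */
class InMemoryDocumentStore implements DocumentStore {
    private static final class Revision {
        final Map<String, String> sections;
        final String message;

        Revision(Map<String, String> sections, String message) {
            this.sections = new LinkedHashMap<>(sections); // defensive copy of the snapshot
            this.message = message;
        }
    }

    private final List<Revision> revisions = new ArrayList<>();

    @Override
    public String commit(Map<String, String> sections, String message) {
        revisions.add(new Revision(sections, message));
        return "r" + revisions.size(); // ids are r1, r2, ... in commit order
    }

    @Override
    public String read(String revisionId, String sectionPath) {
        int index = Integer.parseInt(revisionId.substring(1)) - 1;
        return revisions.get(index).sections.get(sectionPath);
    }

    @Override
    public List<String> history() {
        List<String> ids = new ArrayList<>();
        for (int i = 1; i <= revisions.size(); i++) {
            ids.add("r" + i);
        }
        return ids;
    }
}
```

The rest of the application would depend only on `DocumentStore`, so swapping Subversion for Git (or dropping the SCM entirely) should only mean writing another implementation.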

Brian Agnew
Git has very nice built-in content-reordering support (since it is content-addressable). If the pieces of the hierarchical document are stored as "files in folders", then a Git back-end would detect moves as moves instead of as delete + insert. SVN is very bad at this.
tucuxi
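
To make the content-addressing point concrete, here is a simplified sketch of how identical content at a new path can be recognized as a move. It is not Git's implementation: the `MoveDetector` class and its methods are hypothetical, the hash is a plain SHA-1 of the text rather than Git's object hash, and only exact matches are handled, whereas Git also applies similarity heuristics to content that changed slightly.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper: detects exact-content moves between two tree snapshots. */
class MoveDetector {
    /** Hex-encoded SHA-1 of the content. */
    static String hash(String content) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            byte[] digest = md.digest(content.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Maps oldPath -> newPath for sections whose content is identical but whose path changed. */
    static Map<String, String> detectMoves(Map<String, String> before, Map<String, String> after) {
        // Index paths that exist only in the new snapshot by their content hash.
        Map<String, String> addedByHash = new HashMap<>();
        for (Map.Entry<String, String> e : after.entrySet()) {
            if (!before.containsKey(e.getKey())) {
                addedByHash.put(hash(e.getValue()), e.getKey());
            }
        }
        Map<String, String> moves = new TreeMap<>();
        for (Map.Entry<String, String> e : before.entrySet()) {
            if (after.containsKey(e.getKey())) {
                continue; // path still exists, so this is not a move
            }
            // Consume the match so one new path is paired with at most one old path.
            String newPath = addedByHash.remove(hash(e.getValue()));
            if (newPath != null) {
                moves.put(e.getKey(), newPath);
            }
        }
        return moves;
    }
}
```

With sections stored as files in folders, a section that moves from a Chapter 2 folder to a Chapter 6 folder without being edited would show up here as a single old-path to new-path pair rather than an unrelated delete and add.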
I prefer Git over SVN, but I'd be careful with JGit: it's lacking a lot of Git's functionality at this stage. SVNKit would be a more mature API.
mlaverd
A: 

Try http://svnkit.com/ for Subversion.

Nissan Fan
Note the licensing restrictions: http://svnkit.com/licensing.html
Jason S
A: 

SVNKit is a pure Java SVN library. It can be used by the Eclipse SVN integration, so it should be fairly stable.

Rickard von Essen
Note the licensing restrictions: http://svnkit.com/licensing.html
Jason S
+6  A: 

Maybe you could use a JCR (JSR-170) compliant repository like Jackrabbit instead. To me, what you're describing is exactly what JCR is for. Have a look at this article.

Pascal Thivent
This looks very interesting and I'm going to look at it in depth. I'm not sure, though, how well it handles versioning of the structure. Based on the small amount of discussion I was able to find, versioning looks to be primarily at the level of a single node.
JacobM