If I have a PHP application which allows users to make changes to documents, what is the best way to implement revision tracking for each document? I want the storage of each revision to be deltified (i.e. only save the changes that were made) like svn and other SCMs do with code. I know on a very simple level how it works, but when I start to think about implementing it, I get a little confused.

First and foremost, I am wondering if there is a library out there that can help me with this, so I don't have to completely roll my own.

And I am wondering: should I keep the full text of only the original document, and then only save the changes, or should I keep the full text of the latest document, and each time it is modified, save the differences as one of the older revisions?

If the former, then when I want to grab a page to be shown on the site, do I have to start at the beginning, and then recursively update the data based on the revisions, until I reach the current version? Won't this be painfully slow once there are many revisions?

How can I do diff/patch type operations in PHP to make the deltifying and reconstructing of the pages easier?

Would it be worth locking a page while a user is editing it? Or should I let pages get into 'states of conflict' and provide conflict resolution, so that two users can modify the same page simultaneously as long as they're changing different parts? I'm going crazy thinking about how hard this will be. Ahh!

+2  A: 

This previous SO question might help.

Amber
Thanks, this will help with the diff, but what about the semantics of storing revisions?
Carson Myers
I'd highly recommend storing the latest version in whole, and then storing the history as diffs. Otherwise you'll run into a huge performance bottleneck serving the current version (which will likely be most of your requests).
Amber
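To make the "latest in full, history as diffs" layout concrete, here is a minimal sketch of reverse-delta storage. It is written in Python only to keep it short; the scheme itself is not language-specific and would look much the same in PHP. The names `make_delta`, `apply_delta`, and `Document` are made up for illustration, the delta format is a simple line-based one built on `difflib.SequenceMatcher`, and nothing here reflects how svn or any particular wiki actually stores its data.

```python
from difflib import SequenceMatcher


def make_delta(source, target):
    """Build a line-based delta that turns `source` lines into `target` lines."""
    delta = []
    matcher = SequenceMatcher(a=source, b=target, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Unchanged run: remember only the range to copy from the source.
            delta.append(("copy", i1, i2))
        else:
            # replace / insert / delete: store just the target's lines
            # (an empty list for a pure deletion).
            delta.append(("insert", target[j1:j2]))
    return delta


def apply_delta(source, delta):
    """Rebuild the target lines from `source` plus a delta from make_delta."""
    out = []
    for op in delta:
        if op[0] == "copy":
            out.extend(source[op[1]:op[2]])
        else:
            out.extend(op[1])
    return out


class Document:
    """Keeps the latest revision in full and older revisions as reverse deltas."""

    def __init__(self, text):
        self.latest = text.splitlines()
        # reverse_deltas[k] turns revision k+1 back into revision k.
        self.reverse_deltas = []

    def save(self, new_text):
        new_lines = new_text.splitlines()
        # Store how to get from the NEW text back to the previous one.
        self.reverse_deltas.append(make_delta(new_lines, self.latest))
        self.latest = new_lines

    def revision(self, n):
        """Revision 0 is the original; len(reverse_deltas) is the latest."""
        lines = self.latest
        # Walk backwards from the latest revision, newest delta first.
        for delta in reversed(self.reverse_deltas[n:]):
            lines = apply_delta(lines, delta)
        return "\n".join(lines)


doc = Document("a\nb\nc")
doc.save("a\nB\nc\nd")
doc.save("x\na\nB\nd")
print(doc.revision(0))  # walks back through two deltas to the original
```

Serving the current page never touches a delta; only requests for old revisions pay the reconstruction cost, and that cost grows with how far back the revision is.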
Also, as far as conflict resolution goes, I'd go with the same system MediaWiki uses. Keep track of when the user started editing a page and when the document was last edited. If the last edit time is later than the time the user started editing, then when the user goes to save, show them a conflict page that has both the new version and their edited version, and require them to either cancel the save or edit the new version to add in their changes. It's both fairly simple to implement and fairly easy for the user to work with.
Amber
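The conflict check described in the comment above can be sketched roughly as follows. This is an illustration of the idea, not MediaWiki's actual code: `Page`, `EditConflict`, and the explicit `now` / `started_at` timestamps are all hypothetical names, and the sketch is in Python for brevity. In a real application the timestamps would come from the server clock and the page would live in a database row.

```python
class EditConflict(Exception):
    """Raised when the page changed after the user started editing."""

    def __init__(self, current_text, submitted_text):
        super().__init__("page was modified after editing started")
        self.current_text = current_text      # what is stored now
        self.submitted_text = submitted_text  # what the user tried to save


class Page:
    def __init__(self, text, now):
        self.text = text
        self.last_edited = now

    def save(self, new_text, started_at, now):
        # Someone else saved after this user opened the edit form:
        # refuse the save and hand both versions back for manual merging.
        if self.last_edited > started_at:
            raise EditConflict(self.text, new_text)
        self.text = new_text
        self.last_edited = now


page = Page("version 1", now=100)
alice_started = 110   # Alice opens the edit form
bob_started = 120     # Bob opens it shortly after
page.save("Bob's edit", started_at=bob_started, now=130)
try:
    page.save("Alice's edit", started_at=alice_started, now=140)
except EditConflict as conflict:
    # Show both texts to Alice so she can merge her changes by hand.
    print(conflict.current_text, "|", conflict.submitted_text)
```

One design note: comparing timestamps can miss a conflict if two saves land within the clock's resolution. Comparing a revision counter that increments on every save is a common alternative with the same overall flow.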
A: 

Why don't you use a Subversion server? You can run the client from the console using exec() or similar. It is really not worth implementing something like that from scratch unless you are writing revision-control software.

soulmerge
It'll be like what Wikipedia and other wiki sites do, and I believe there's a way to do it in WordPress as well. It's for a CMS. I want the whole system to exist on one server and to have as few dependencies as possible, since it'll be distributed.
Carson Myers