Subversion's "keyword" feature is great for automatically tagging text files with the revision number. I'd really like to do something similar for Word and/or OpenOffice documents.

I tried this with Word documents, by inserting a "fixed-width" keyword substitution into the "comment" document properties field. But it seemed to still corrupt the document somehow (plus I don't know what "fixed-width" might mean in the case of multi-byte characters). I also didn't like this idea because it would not be good for inserting the number in the printable part of the document itself.

What I'm imagining now is a macro that automatically runs on document open, and updates a custom document property. The document could contain doc property reference fields that get updated with the value stored in the doc property.

Has anyone done this, or done anything else to achieve this goal? For either Word or OpenOffice?

+2  A: 

First, on "Embedded Version Numbers - Good or Evil?": I find them evil.

You should not use a technical, internal revision number to represent the version of a document.

"This is the 2.2 revision of my word document" is not the same as "this is revision 1567 of my word document".

  • The former is an "applicative" revision number, from the point of view of the final client.
  • The latter is a "technical" revision number, from the point of view of a tool.

Plus, if the macro modifies the document with the current revision number, the document still needs to be committed, meaning the stored version would have a revision number different from the one written by your macro.
If it is not committed, there is always the chance that the document tagged like that is not exactly the one initially retrieved from Subversion.


That being said... on the more general issue of updating Office document properties:

The thread "update word 2003 fields automatically" suggests the use of an Office API. Microsoft.Office.Interop does not allow property modification, but the VBA API does allow you to access any CustomProperty you want to set for a given SmartTag.

This article "To add a smart tag with a custom recognizer to a Word document" gives you an example of a SmartTag custom behavior.

Smart tags are strings of text that have type information attached to them; when a text string that matches the criteria appears in a document, it is recognized and the user is able to perform actions appropriate for that type of string.

So one could imagine a SmartTag able to recognize the string "revision for this document", with the custom behavior being "I will query SVN for the right revision number and display it".

VonC
Thanks for the answer. As for embedded revision numbers being evil, am I correct in understanding that they're evil not inherently, but because of the SVN implementation difficulties (e.g. updating and merging)? You've convinced me that doc-open is not the right time to update the value. The document might be separated from version control (e.g. e-mailed to someone else), in which case the value can't be updated properly. So it should really be updated at check-out from SVN.
Craig McQueen
See my comment on your answer.
VonC
+1  A: 

You could try Tobi's little script: http://insights.oetiker.ch/windows/SvnProperties4MSOffice/

Stefan
A: 

What if you save your Word document as .xml (the so-called Flat OPC format)?

Then it is just a text document, and svn keywords should just work.

plutext
It's certainly an option worth considering. But I really don't want my revision control to dictate my word processor choice. Revision control shouldn't be so obtrusive, surely. I use Subversion and MS Word at work (I've got no choice in either; I enjoy Subversion and... well I manage to get stuff done with Word).
Craig McQueen
A: 

VonC's answer has convinced me that doc-open isn't the right time to update the document property. For example, if the file is e-mailed to someone else, or copied to a CD, or "exported", then it can't update its SVN revision when it's opened. It would be difficult to ensure the file could never contain an erroneous, out-of-date revision number. So to do this "properly", the file's revision number should be updated at SVN checkout/update.

I believe for SVN keywords in text files, the client does the "fiddling" with the file, updating it on checkout and returning to "canonical" form before commit. So for Word, it would be great to use client-side hooks to do the same. TortoiseSVN has client-side hooks, but I don't think other SVN clients do. At my work, we almost always use TortoiseSVN so that could work nicely.
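To make the "fiddling" concrete, here is a minimal Python sketch of the expand/contract round trip for a `Revision` keyword in a text file. It only illustrates the idea described above and is not Subversion's actual implementation; the function names `expand` and `contract` are made up for this example.

```python
import re

# Matches both the canonical anchor "$Revision$" and an expanded
# anchor such as "$Revision: 1567 $".
KEYWORD = re.compile(r"\$Revision(?::[^$\n]*)?\$")


def expand(text: str, revision: int) -> str:
    """Working-copy form: write the revision into every keyword anchor."""
    return KEYWORD.sub(f"$Revision: {revision} $", text)


def contract(text: str) -> str:
    """Canonical form: strip the value again before the text is committed."""
    return KEYWORD.sub("$Revision$", text)
```

Because `contract(expand(text, n))` gives back the original text, the expanded value never shows up as a change in the repository. That round trip is what a hook-based approach for Word files would have to reproduce.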

So what I would like to do is write two TortoiseSVN client-side hooks:

1) Post-update hook to insert a "SvnRevision" document property containing the SVN revision of the file's last commit.

2) Pre-commit hook to remove the "SvnRevision" document property. This makes the file stored in the repository "clean" in the event that a non-TortoiseSVN client checks it out. (It might also prevent merge conflicts?)
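As a rough idea of what the two hooks would have to do to the file, here is a Python sketch that sets or removes a custom property in a .docx file using only the standard library. It rests on several assumptions that go beyond what is stated above: the document is in the zip-based .docx format (not the older binary .doc), it already contains at least one custom property so that the `docProps/custom.xml` part exists, and the property is stored as a plain string. The function names are hypothetical, and the TortoiseSVN hook wiring and the lookup of the revision number are left out entirely.

```python
import shutil
import tempfile
import xml.etree.ElementTree as ET
import zipfile

CP_NS = "http://schemas.openxmlformats.org/officeDocument/2006/custom-properties"
VT_NS = "http://schemas.openxmlformats.org/officeDocument/2006/docPropsVTypes"
FMTID = "{D5CDD505-2E9C-101B-9397-08002B2CF9AE}"  # format ID used for custom properties
CUSTOM_PART = "docProps/custom.xml"
PROP_TAG = f"{{{CP_NS}}}property"

# Keep the usual prefixes when the XML is written back out.
ET.register_namespace("", CP_NS)
ET.register_namespace("vt", VT_NS)


def _edit_custom_xml(xml_bytes, name, value):
    """Return custom.xml with property `name` set to `value` (None removes it)."""
    root = ET.fromstring(xml_bytes)
    for prop in root.findall(PROP_TAG):
        if prop.get("name") == name:
            root.remove(prop)
    if value is not None:
        # Property ids must be unique; numbering starts at 2.
        used = [int(p.get("pid")) for p in root.findall(PROP_TAG)]
        prop = ET.SubElement(root, PROP_TAG, fmtid=FMTID,
                             pid=str(max(used, default=1) + 1), name=name)
        ET.SubElement(prop, f"{{{VT_NS}}}lpwstr").text = str(value)
    return ET.tostring(root, encoding="UTF-8", xml_declaration=True)


def set_docx_property(path, name, value):
    """Rewrite the .docx at `path`; a value of None removes the property."""
    with tempfile.NamedTemporaryFile(delete=False, suffix=".docx") as tmp:
        tmp_path = tmp.name
    with zipfile.ZipFile(path) as src, \
            zipfile.ZipFile(tmp_path, "w", zipfile.ZIP_DEFLATED) as dst:
        if CUSTOM_PART not in src.namelist():
            raise KeyError("document has no custom properties part")
        for item in src.infolist():
            data = src.read(item.filename)
            if item.filename == CUSTOM_PART:
                data = _edit_custom_xml(data, name, value)
            dst.writestr(item, data)
    shutil.move(tmp_path, path)


def get_docx_property(path, name):
    """Return the stored value of custom property `name`, or None."""
    with zipfile.ZipFile(path) as archive:
        root = ET.fromstring(archive.read(CUSTOM_PART))
    for prop in root.findall(PROP_TAG):
        if prop.get("name") == name:
            return prop[0].text
    return None
```

Under these assumptions, the post-update hook would call something like `set_docx_property("report.docx", "SvnRevision", 1567)` and the pre-commit hook `set_docx_property("report.docx", "SvnRevision", None)`; the file name and revision here are placeholders.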


Update: Arrr I just realised another problem: if I do the above, then SVN will think that the file has changed. Hmm this seems difficult. For the feature to work properly, it really needs fairly tight integration into the client.

Craig McQueen
I believe in the "evilness" of meta-information here because that information (the SVN revision number) should not be relevant to your document. It is not about SVN per se, or about merging or updating. It is about the difference between information that is part of the data (this is "2.0" of my document) and technical meta-information (this is "rev. 1597" of my document ???). The latter *should not*, IMO, be displayed.
VonC
On the other hand, it's likely that SVN will end up storing multiple revisions with the same "document version number" (unless the committer updates it on every commit, which is almost certainly not going to happen). So if you have several revisions in SVN all saying "2.0", how do you know which is the "official" 2.0?
Craig McQueen
+1  A: 

We are actually using a somewhat similar "system" (created by a former colleague) that solves some of these issues.

The desires...

  • We want to be able to see that two printouts are actually of the same version.
  • We want to be able to locate the source of a print out (including revision).
  • We want to eliminate problems originating from authors forgetting to update the revision number.

The solution

We are not using the svn revision number, but instead use a "human" revision number for our documents. To make sure that we do not have multiple versions with identical revision numbers floating around, the number is automatically updated whenever the document is modified, i.e. there will be several "revisions" that are never released... guess it's not perfect...

The semi technical details...

  • The "human" revision number is stored in a custom property of the document.
  • We have set all Word documents to require a lock (svn:needs-lock)
  • On Lock a plus sign is added to the end of the revision string to indicate a "dirty" version.
  • On Commit the plus sign is removed and the number is incremented
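The lock and commit steps above boil down to two small string transitions. The Python sketch below shows them under the assumption that the "human" revision number is a plain integer stored as text; the real format used in our system may differ, and the function names are made up for illustration.

```python
def on_lock(revision: str) -> str:
    """Mark the working copy as dirty by appending a plus sign (at most once)."""
    return revision if revision.endswith("+") else revision + "+"


def on_commit(revision: str) -> str:
    """Drop the dirty marker and increment the number."""
    return str(int(revision.rstrip("+")) + 1)
```

For example, a document at revision "7" reads "7+" while it is locked and being edited, and becomes "8" once the change is committed.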

Result

a system where we have a revision number in the printed document that:

  • does not change unless the document changes
  • indicates dirty versions, i.e. print outs from modified working copy
  • differs if the documents are different (i.e. no problem with authors forgetting to update the revision)
Lars Kristensen
A: 

Unfortunately, you can't avoid using MS-Word where you work. Currently, I'm trying to integrate svn and LaTeX [using the svn-multi package] to produce our quality documentation. I'm working with users to instill the idea that a commit is a "really big deal" and only happens after the Change Controls are all signed off.

Then, the process is commit, update, compile with pdflatex, dump .pdf docs to our document server. That way the .pdf docs and the repository have the same rev number.

svn-multi is nice in that the individual files retain their rev number when the overall document rev number may be higher [think revs of chapters in a book vs. the whole book].

Yuri gagarian
A: 

Stefan's answer is the one I was looking for, i.e. how to get the SVN revision into the doc. I use SVN to manage the many versions of documents that we generate. Excellent work Tobi!

Matt