views:

502

answers:

3

Hi,

Is it possible to use Mercurial version control to track Word or PDF files? Are there any limitations or problems?

Thanks!

+4  A: 

Yes, but of course you won't be able to diff them in any meaningful way, and the files will be treated as binary during merges.

Mercurial is perfectly capable of tracking binary files:

Mercurial generally makes no assumptions about file contents. Thus, most things in Mercurial work fine with any type of file.

Mercurial stores a binary diff regardless of the file type. The problem with PDF/Word files is that a small change to them usually causes a huge difference in their binary representation on disk. .docx documents are stored as zipped XML; because of the compression, even a tiny change to the content inside the archive can make the zip archive look completely different.
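To illustrate the compression point, here is a rough Python sketch. It is my own illustration, not anything Mercurial does internally: it uses `zlib` (DEFLATE, the compression normally used inside zip archives) as a stand-in for the .docx container, and the sample text and helper names (`make_document`, `differing_fraction`) are made up for the example. It overwrites a single byte of a document and compares how much of the raw data versus the compressed data ends up different.

```python
import random
import zlib


def differing_fraction(a: bytes, b: bytes) -> float:
    """Fraction of byte positions where a and b differ.

    Positions that exist in only one of the two inputs count as differing.
    """
    longest = max(len(a), len(b))
    if longest == 0:
        return 0.0
    same = sum(x == y for x, y in zip(a, b))
    return (longest - same) / longest


def make_document(seed: int = 0, words: int = 3000) -> bytes:
    """Build a reproducible, compressible stand-in for a document body."""
    rng = random.Random(seed)
    vocabulary = ["mercurial", "tracks", "binary", "files", "word",
                  "document", "revision", "history"]
    text = " ".join(rng.choice(vocabulary) for _ in range(words))
    return text.encode("ascii")


original = make_document()

# Overwrite exactly one byte in the middle of the document.
# "Z" never occurs in the lowercase sample text, so this is a real change.
edited_buffer = bytearray(original)
edited_buffer[len(edited_buffer) // 2] = ord("Z")
edited = bytes(edited_buffer)

compressed_original = zlib.compress(original)
compressed_edited = zlib.compress(edited)

raw_change = differing_fraction(original, edited)
compressed_change = differing_fraction(compressed_original, compressed_edited)

print(f"raw bytes changed:        {raw_change:.4%}")
print(f"compressed bytes changed: {compressed_change:.4%}")
```

The exact numbers depend on the input and the zlib version, but the changed share of the compressed data is always at least as large as in the raw data here, and typically far larger, which is why a binary delta between two saved versions of a zipped document tends to be big.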

As long as your repository doesn't grow too large, you probably won't experience any issues using Mercurial.

Johannes Rudolph
Except that if the format is such that small changes to the document cause large binary changes, the binary diff will be about the same size as the file for each version. I'd use Word's XML format rather than the binary ones.
jk
@jk: You're right. In fact, I assumed he used the Office 2007 format; the only problem is that it is zipped internally, AFAIK. But people blame the VCS for not handling binaries correctly all too often.
Johannes Rudolph
And what if I save the Word file as XML with "Save As"...?
andrew007
Then you will get efficient storage in hg, and you could conceivably use a normal text merge to merge docs. There do appear to be special Word merge tools available, which may (or may not) be better, though.
jk
+1  A: 

If you are willing to use Subversion instead of Hg, you could use the OooSvn extension:

which handles the problem described above nicely, i.e., it manages real diffs of ASCII files instead of binary ones. (Of course, we wouldn't have this problem with LaTeX ;-) OK, I'll stop.)

There might also be a way either to use OooSvn in conjunction with the Svn -> Hg gateway, or to adapt its code to use Hg instead: if it simply launches svn commands, that should even be quite straightforward.

Hope it'll help.

Cheers,
Christophe.

Christophe Muller
A: 

I might be a little late here, but I've noticed that you can use git very effectively to version-control docx files. I've tried it on my Windows 7 machine. The diff opens a comparison in MS Word itself for easy viewing. I used TortoiseGit.

Ashish