I've noticed that Mercurial compresses the files in a repository efficiently

(repo/.hg/store/data).

Does anybody know what kind of compression is used for the repository files?

Thanks.

+1  A: 

Initial versions of files are compressed using deflate (same algorithm as zip), but for updated files, Mercurial stores only a (binary) diff against a previous version.

It also tries to do the right thing: When a deflated JPEG turns out bigger than the original, it will not store it "compressed", for example.
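
To make the "only keep it compressed if that actually helps" idea concrete, here is a minimal Python sketch using the standard `zlib` module. The function names and the one-byte `Z`/`U` markers are made up for illustration and are not Mercurial's real on-disk format.

```python
import zlib

# Hypothetical one-byte markers, chosen for this sketch only.
COMPRESSED = b"Z"
UNCOMPRESSED = b"U"

def encode_chunk(data: bytes) -> bytes:
    """Deflate data, but fall back to the raw bytes if deflate does not shrink it."""
    compressed = zlib.compress(data)
    if len(compressed) < len(data):
        return COMPRESSED + compressed
    # Already-compressed input (JPEG, zip, ...) usually ends up here.
    return UNCOMPRESSED + data

def decode_chunk(chunk: bytes) -> bytes:
    """Invert encode_chunk by looking at the leading marker byte."""
    marker, payload = chunk[:1], chunk[1:]
    if marker == COMPRESSED:
        return zlib.decompress(payload)
    if marker == UNCOMPRESSED:
        return payload
    raise ValueError("unknown chunk marker")
```

Repetitive text goes down the compressed path, while data that deflate cannot shrink (random bytes, or an image that is already compressed) is kept as-is, at the cost of one marker byte.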

Thilo
It actually does a little more than that. If you *only* store deltas, the time to regenerate a stored file from the revlogs grows with the number of changesets (i.e. this is an O(N) algorithm). To bound that process, Mercurial periodically stores the entire file *again*, and relies on the zlib compression to squash that back down to a reasonable size.
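
A rough sketch of that idea in Python, in case it helps. Everything here is a toy: `ToyRevlog`, `make_delta`, `apply_delta` and the fixed `max_chain` limit are assumptions made up for illustration, and the rule Mercurial actually uses to decide when to store a full copy is different.

```python
import zlib
from difflib import SequenceMatcher

def make_delta(old: bytes, new: bytes):
    """Describe new as a list of (start, end, replacement) edits against old."""
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    return [(i1, i2, new[j1:j2])
            for tag, i1, i2, j1, j2 in matcher.get_opcodes()
            if tag != "equal"]

def apply_delta(old: bytes, ops) -> bytes:
    """Rebuild the new text by applying the edits to old, left to right."""
    out, pos = [], 0
    for start, end, replacement in ops:
        out.append(old[pos:start])
        out.append(replacement)
        pos = end
    out.append(old[pos:])
    return b"".join(out)

class ToyRevlog:
    def __init__(self, max_chain: int = 3):
        self.max_chain = max_chain   # assumed limit, not Mercurial's rule
        self.entries = []            # ("full", zlib bytes) or ("delta", ops)
        self._chain_len = 0

    def add(self, text: bytes) -> int:
        if not self.entries or self._chain_len >= self.max_chain:
            # Store the whole file again so reads never walk a long chain.
            self.entries.append(("full", zlib.compress(text)))
            self._chain_len = 0
        else:
            previous = self.read(len(self.entries) - 1)
            self.entries.append(("delta", make_delta(previous, text)))
            self._chain_len += 1
        return len(self.entries) - 1

    def read(self, rev: int) -> bytes:
        base = rev
        while self.entries[base][0] != "full":
            base -= 1                # walk back to the nearest full copy
        text = zlib.decompress(self.entries[base][1])
        for r in range(base + 1, rev + 1):
            text = apply_delta(text, self.entries[r][1])
        return text
```

With this layout a read applies at most `max_chain` deltas no matter how many revisions exist, which is the bound described above.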
quark
+4  A: 

There are two levels of compression in Mercurial repositories: delta storage, and zlib compression.

In addition, various other parts also employ compression. For example, bundles can be compressed with either gzip or bzip2, as can archive tarballs - but I don't think you were asking about these.

Martin v. Löwis
+2  A: 

You might find Matt's paper on the revlog format interesting: http://mercurial.selenic.com/wiki/Presentations?action=AttachFile&do=view&target=ols-mercurial-paper.pdf

Steve Losh