Is there a way in Mercurial to remove old changesets from a repository? I have a repository that is 60 GB, which makes it pretty painful to clone. I would like to trim off everything before a certain date and put the huge repository away to collect dust.

+4  A: 

You can do it, but in doing so you invalidate all the clones out there, so it's generally not wise to do unless you're working entirely alone.

Every changeset in Mercurial is uniquely identified by a hash, which is computed from (among other things) the source code changes, the metadata, and the hashes of its one or two parents. Those parents need to exist in the repo all the way back to the start of the project. (Lifting that restriction would amount to shallow clones, which aren't available yet.)
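To see why dropping old history forces new hashes on everything that follows, here is a toy sketch of a parent-chained hash. It is only a simplified model, not Mercurial's actual node format: the node_id function, the commit strings, and the all-zero "no parent" id are made up for illustration, and it assumes sha1sum is available on your system.

```shell
# Toy model: id = sha1(parent id + changeset content).
# This is NOT Mercurial's real computation, just the same idea.
node_id() {
  # $1 = parent id, $2 = changeset content (changes plus metadata)
  printf '%s%s' "$1" "$2" | sha1sum | cut -d' ' -f1
}

null=0000000000000000000000000000000000000000

# Original history: first -> second -> third
a=$(node_id "$null" "first commit")
b=$(node_id "$a" "second commit")
c=$(node_id "$b" "third commit")

# Trimmed history: "second commit" becomes the root, content unchanged
b_rerooted=$(node_id "$null" "second commit")
c_rerooted=$(node_id "$b_rerooted" "third commit")

echo "second: $b -> $b_rerooted"
echo "third:  $c -> $c_rerooted"
```

Even though the content of the second and third commits is identical in both histories, their ids differ, which is why existing clones stop matching the trimmed repo.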

If you're okay with changing the hashes of the newer changesets (which, again, breaks all the clones out there in the wild), you can do so with the following commands:

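# Export changesets 400 through tip (for example), one patch file each.
# %r is the zero-padded revision number, so the files sort in the right order.
hg export -o 'changeset-%r.patch' 400:tip
cd /elsewhere
hg init newrepo
cd newrepo
# The shell expands the glob in sorted order, so the patches apply in sequence.
hg import /path/to/the/patches/*.patch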

You'll probably have to do a little work to handle merge changesets, but that's the general idea.
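One detail worth checking before the import: the shell expands *.patch in lexical order, so the patch filenames need zero-padded revision numbers to be applied in sequence. A small sketch of the difference, using made-up filenames in a scratch directory (nothing here touches a real repo):

```shell
tmp=$(mktemp -d)

# Unpadded revision numbers: 10 and 11 sort before 9
touch "$tmp/changeset-9.patch" "$tmp/changeset-10.patch" "$tmp/changeset-11.patch"
unpadded=$(cd "$tmp" && echo changeset-*.patch)
rm "$tmp"/changeset-*.patch

# Zero-padded revision numbers: lexical order matches numeric order
touch "$tmp/changeset-009.patch" "$tmp/changeset-010.patch" "$tmp/changeset-011.patch"
padded=$(cd "$tmp" && echo changeset-*.patch)

echo "unpadded: $unpadded"
echo "padded:   $padded"
rm -r "$tmp"
```

With the unpadded names, revision 9 would be imported after 10 and 11, and the patches would likely fail to apply.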

One could also do it using hg convert with hg as both the source and the destination type, together with a splicemap, but that's probably more involved still.

The larger question is: how do you type up 60 GB of source code? Or were you adding generated files against all advice? :)

Ry4an
Much more detailed response than my (deleted) answer. +1
VonC
I am importing from another source control system a project that includes generated binaries. We're trying to get the dlls out of source control right now. Thanks for the help!
Jake Pearson
The answer contains a typo. The filename should contain lowercase %r (zero-padded numbers) otherwise the files won't get processed in the right order when you import them.
Gili