A couple of months ago I added and committed a release tarball to a git code repository. A couple of commits later, I removed the file and committed the removal. This one file was nearly 10x the size of the whole repository, so the presence of that file in .git slows cloning down significantly. At this point there have been hundreds of commits since the pair of commits that added and removed the file.

Is there a way to remove the two commits which cancel out (the add and the remove) and also remove the copy of the file in .git, without hosing the repository?

Thanks.

+1  A: 

You can do a git rebase and write the commits out of the history. This will change the IDs of all the commits after that point, though. There's no way around that; it's by design.

So say the commit where you added the tarball was master~100. If you run git rebase -i master~101 on master, you'll be presented with the list of commits from that point on to peruse and edit. You can simply take out the commits that added and removed the file, or alternatively move the cancelling commit to sit right after the first one and mark it as "squash": that makes git combine the two, and it should figure out that the result is empty (but it will cope if the two don't precisely cancel, so this approach is a bit safer).
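As a rough sketch of the "take out both commits" variant, the script below builds a throwaway repository and drops the two commits without opening an editor. Everything in it is an assumption made for illustration: the file name release.tar.gz, the commit messages, and the use of GIT_SEQUENCE_EDITOR with sed to edit the todo list (in a real repository you would normally just edit the list by hand).

```shell
#!/bin/sh
set -e

# Throwaway demo repository; all names below are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the branch is called master
git config user.email demo@example.com
git config user.name Demo

# History: base, add tarball, work 1, remove tarball, work 2
echo a > a.txt; git add a.txt; git commit -q -m "base"
echo "pretend this is large" > release.tar.gz
git add release.tar.gz; git commit -q -m "add tarball"
echo b > b.txt; git add b.txt; git commit -q -m "work 1"
git rm -q release.tar.gz; git commit -q -m "remove tarball"
echo c > c.txt; git add c.txt; git commit -q -m "work 2"

# Interactive rebase starting from the parent of the "add tarball" commit.
# Instead of a human editing the todo list, sed marks the two commits
# as "drop"; every later commit is replayed and gets a new ID.
GIT_SEQUENCE_EDITOR="sed -i.bak -e '/add tarball/s/^pick/drop/' -e '/remove tarball/s/^pick/drop/'" \
  git rebase -i HEAD~4

git log --oneline
```

Note that the old commits and the tarball blob are still reachable through the reflog afterwards, so .git does not shrink until those entries expire and git gc prunes the objects.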

araqnid
One additional remark: this will make things hard for peer developers if you push the rebased repository elsewhere. But as araqnid said, there's no way around it.
Bram Schoenmakers
To put it another way: this rewrites the history. Rewriting Git history, just like real-life history, requires a global conspiracy: everybody who has ever been in contact with that history needs to conspire together to get rid of it. Everybody who pulled from that repository, everybody who pulled from somebody who pulled from that repository, and so on. They all must rewrite *their* history, merges will no longer work, etc.
Jörg W Mittag
A: 

Take a look at git filter-branch. This is the best tool for rewriting history. The man page has a few good examples.
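A minimal sketch of that approach, run against a throwaway repository. The file name release.tar.gz and the commit messages are hypothetical, and the cleanup steps at the end follow the checklist in the filter-branch man page for shrinking a repository; treat this as one possible way to do it, not the only one.

```shell
#!/bin/sh
set -e

# Throwaway demo repository; all names below are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.email demo@example.com
git config user.name Demo

echo a > a.txt; git add a.txt; git commit -q -m "base"
echo "pretend this is large" > release.tar.gz
git add release.tar.gz; git commit -q -m "add tarball"
blob=$(git rev-parse HEAD:release.tar.gz)   # remember the blob ID to check later
echo b > b.txt; git add b.txt; git commit -q -m "work 1"
git rm -q release.tar.gz; git commit -q -m "remove tarball"
echo c > c.txt; git add c.txt; git commit -q -m "work 2"

# Rewrite every commit on every ref, removing the file from each commit's index.
# --prune-empty discards commits that end up with no changes (the add and the remove).
# The environment variable only silences the warning that newer git versions print.
FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch \
  --index-filter 'git rm -q --cached --ignore-unmatch release.tar.gz' \
  --prune-empty -- --all

# filter-branch keeps backups under refs/original/; remove them, expire the
# reflog, and garbage-collect so the blob actually leaves .git.
git for-each-ref --format='%(refname)' refs/original/ | xargs -n 1 git update-ref -d
git reflog expire --expire=now --all
git gc -q --prune=now

git log --oneline
```

Until the backup refs and reflog entries are gone, the old commits stay reachable and the repository stays the same size.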

If others use the repository, make sure they realize that the rewritten master branch will not include their latest commits. They should first create a branch such as temp to keep their existing work, and then do a git fetch.

Now they can use gitk --all to see exactly where the histories diverge, and rebase to get back in sync with the altered SHA-1s of the commits that came after the one that used to contain the giant file.
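Here is one way the collaborator's side could look, simulated end to end in a scratch directory. The repository names upstream and collaborator, the file names, and the choice of git rebase --onto are assumptions for illustration. The --onto form replays only the local commits; a plain rebase onto the new master would also replay the old add and remove commits.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)
cd "$work"

# Upstream repository whose history contains the unwanted tarball.
git init -q upstream
cd upstream
git symbolic-ref HEAD refs/heads/master
git config user.email demo@example.com
git config user.name Demo
echo a > a.txt; git add a.txt; git commit -q -m "base"
echo big > release.tar.gz; git add release.tar.gz; git commit -q -m "add tarball"
git rm -q release.tar.gz; git commit -q -m "remove tarball"
echo b > b.txt; git add b.txt; git commit -q -m "work"
cd ..

# A collaborator clones it and commits local work on master.
git clone -q upstream collaborator
cd collaborator
git config user.email peer@example.com
git config user.name Peer
echo mine > mine.txt; git add mine.txt; git commit -q -m "local work"
old_upstream=$(git rev-parse origin/master)   # upstream tip before the rewrite
cd ..

# Upstream rewrites its history to drop the tarball.
cd upstream
FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch \
  --index-filter 'git rm -q --cached --ignore-unmatch release.tar.gz' \
  --prune-empty -- --all >/dev/null
cd ..

# Collaborator: keep the old state on temp, fetch, inspect, then rebase.
cd collaborator
git branch temp
git fetch -q origin
# gitk --all    # optional: look at where the two histories diverge
git rebase -q --onto origin/master "$old_upstream" master
git log --oneline
```

In a real clone the pre-rewrite upstream tip is usually still available as origin/master@{1} right after the fetch. Once everything looks right, deleting temp lets the old history be garbage-collected eventually.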

HTH,

Adam

adymitruk