man git-gc doesn't have an obvious answer in it, and I haven't had any luck with Google either (although I might have just been using the wrong search terms).

I understand that you should occasionally run git gc on a local repository to prune dangling objects and compress history, among other things -- but is a shared bare repository susceptible to these same issues?

If it matters, our workflow is multiple developers pulling from and pushing to a bare repository on a shared network drive. The "central" repository was created with git init --bare --shared.

+1  A: 

I don't know the logic of gc 100%, but to reason this out:

git gc removes extra history junk, compresses history, etc. It does nothing with your local copies of files.

The only difference between a bare and a normal repo is whether you have local copies of files (a working tree).

So, I think it stands to reason that YES, you should run git gc on a bare repo.

I have never personally run it, but my repo is pretty small and is still fast.

bwawok
+1  A: 

Some operations run git gc --auto automatically, so there should never be a need to run git gc by hand; git should take care of this by itself.

Contrary to what bwawok said, there actually is (or might be) a difference between your local repo and the bare one: which operations you perform on it. For example, dangling objects can be created by rebasing, but you may never rebase in the bare repo, so there may never be any to remove. Thus you may not need to run git gc that often. But then again, as I said, git should take care of this automatically.
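As a rough illustration of what "automatically" means here: git gc --auto only does real work once certain thresholds are crossed. The sketch below reads the documented settings gc.auto (default 6700 loose objects), gc.autoPackLimit (default 50 packs) and receive.autogc (default true). The temporary repository and the show_setting helper are made up for the example, and the value 1000 is only an illustration, not a recommendation.

```shell
#!/bin/sh
set -e

# Hypothetical throwaway bare repo so the sketch runs anywhere; in practice
# point REPO at the shared repository instead.
REPO="$(mktemp -d)/central.git"
git init --quiet --bare "$REPO"

# Helper invented for this example: print a setting, or git's documented
# default when the setting is not configured.
show_setting() {
    value=$(git --git-dir="$REPO" config --get "$1") || value="$2 (default)"
    printf '%s = %s\n' "$1" "$value"
}

show_setting gc.auto 6700          # loose-object threshold for gc --auto
show_setting gc.autoPackLimit 50   # pack-count threshold for gc --auto
show_setting receive.autogc true   # whether pushes trigger gc --auto

# Illustrative only: make gc --auto kick in sooner for this repository.
git --git-dir="$REPO" config gc.auto 1000
show_setting gc.auto 6700

# Running it by hand is harmless; it does nothing below the thresholds.
git --git-dir="$REPO" gc --auto
```

If the thresholds are never reached, gc --auto never repacks anything, which is one reason a small repository can go a long time without any housekeeping at all.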

svick
+2  A: 

From the git-gc man page:

Users are encouraged to run this task on a regular basis within *each repository* to maintain good disk space utilization and good operating performance.

Emphasis mine. Bare repositories are repositories too!

Further explanation: one of the housekeeping tasks that git-gc performs is packing and repacking of loose objects. Even if you never have any dangling objects in your bare repository, you will -- over time -- accumulate lots of loose objects. These loose objects should periodically get packed, for efficiency. Similarly, if a large number of packs accumulate, they should periodically get repacked into larger (fewer) packs.
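To see this for yourself, something like the following sketch might help. It uses git count-objects -v, whose "count" line is the number of loose objects and whose "packs" line is the number of pack files. Everything else is an assumption made for the example: the temporary directories, the dev clone, the branch name main, and the loose_count/pack_count helpers. In practice you would point REPO at your shared bare repository and skip the setup.

```shell
#!/bin/sh
set -e

# Hypothetical throwaway setup so the sketch is self-contained.
WORK=$(mktemp -d)
REPO="$WORK/central.git"
git init --quiet --bare "$REPO"

# Simulate a developer pushing a few small commits to the bare repo.
git init --quiet "$WORK/dev"
cd "$WORK/dev"
git config user.email "dev@example.com"
git config user.name "Dev"
for i in 1 2 3; do
    echo "change $i" > file.txt
    git add file.txt
    git commit --quiet -m "commit $i"
done
git push --quiet "$REPO" HEAD:refs/heads/main

# Helpers invented for this example: pull single fields out of
# `git count-objects -v`.
loose_count() { git --git-dir="$1" count-objects -v | awk '/^count:/ {print $2}'; }
pack_count()  { git --git-dir="$1" count-objects -v | awk '/^packs:/ {print $2}'; }

LOOSE_BEFORE=$(loose_count "$REPO")
echo "loose objects before gc: $LOOSE_BEFORE"

# Pack the loose objects (and repack any existing packs).
git --git-dir="$REPO" gc --quiet

LOOSE_AFTER=$(loose_count "$REPO")
PACKS_AFTER=$(pack_count "$REPO")
echo "loose objects after gc: $LOOSE_AFTER, packs: $PACKS_AFTER"
```

Small pushes like these tend to arrive as loose objects on the receiving side, which is why the count grows in a bare repository even though nobody ever commits in it directly.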

Dan Moulding
+1 Thanks for clarifying one of the reasons that gc might be necessary on a bare repo.
Mark Rushakoff
It's definitely true that `gc` needs to be run on all repos, bare or not. It's also true that enough commands run it automatically that you essentially never have to. In the case of a bare repo, it's `receive-pack` that invokes `gc --auto`. (Sometimes you may want to manually run `git gc --aggressive`, which will "more aggressively optimize the repository at the expense of taking much more time", but you may not find that to be important.)
Jefromi
@Jefromi: I agree. The problem is that it doesn't seem to be very well documented which commands run `git gc --auto`. I checked the `git-receive-pack` man page before writing my answer, and there's no mention of it there. So for the average user, I think it's difficult to know whether `git gc` needs to be run manually. The fact that the `git gc` man page still recommends that users *do* run it manually seems only to add more confusion! Perhaps this is something that should be mentioned on the mailing list.
Dan Moulding
@Dan: Yeah, git's documentation unfortunately can be a bit spotty sometimes. Maybe if I get ambitious I'll submit a patch. From a quick survey of the source: `merge`, `receive-pack`, `am`, `rebase --interactive`, and `svn` call `gc --auto` directly. That's not a complete list, though, since other commands may call those.
Jefromi