What is the best way to create a local backup of a git repository hosted on github?

I have the following needs:

  • The local backup should be a bare repo
  • The backup should include all branches
  • It should be easy to (incrementally) update the backup

Basically, I want a perfect mirror, with the possibility to update easily.

As such, the command 'git clone --mirror git://github.com/...' comes to mind, but as far as I can tell, that doesn't allow for an easy update (I'd have to delete and recreate my local backup). Also, the mirror option for git clone seems quite recent; I don't have it on some of the systems I'm working on (which run slightly older versions of git).

What is your recommended solution for this kind of problem?

+3  A: 

I am not sure it covers all your requirements, but you could check out git bundle:

git bundle

This command provides support for git fetch and git pull to operate by packaging objects and references in an archive at the originating machine, then importing those into another repository using git fetch and git pull after moving the archive by some means

What I like about that solution is that it produces a single file, with exactly what I want in it.

git bundle will only package references that are shown by git-show-ref: this includes heads, tags, and remote heads.

machineA$ git bundle create file.bundle master


Note: Kent Fredric mentions in the comments a subtlety from git rev-list:

--all

Pretend as if all the refs in $GIT_DIR/refs/ are listed on the command line as <commit>.

He adds:

your current bundle will only bundle parents of the commit, you'd probably need to specify --all to get a complete bundle of everything (branches that are descendant of master).

To see the difference:

$ git bundle create /tmp/foo master
$ git bundle create /tmp/foo-all --all
$ git bundle list-heads /tmp/foo
$ git bundle list-heads /tmp/foo-all
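The difference between the two bundles can be tried out end to end with a throwaway repository. This is only a sketch: the scratch directory, the branch name `topic`, and the file names are made up for illustration, and the bundle is restored with `git clone --bare` so the result is a bare repo as the question asks.

```shell
#!/bin/sh
# Sketch: back up a repository into a single bundle file and restore it.
# All paths and names below are hypothetical; a scratch repo stands in
# for the real project.
set -e
work=$(mktemp -d)

# Build a small source repository with two branches.
git init -q "$work/src"
cd "$work/src"
git symbolic-ref HEAD refs/heads/master
git config user.name "Backup Demo"
git config user.email "demo@example.invalid"
echo one > file.txt
git add file.txt
git commit -q -m "first commit"
git branch topic
echo two >> file.txt
git commit -q -a -m "second commit"

# Bundle only master, then bundle every ref.
git bundle create "$work/master.bundle" master
git bundle create "$work/all.bundle" --all

# Compare which heads each bundle carries.
git bundle list-heads "$work/master.bundle"
git bundle list-heads "$work/all.bundle"

# Check the full bundle, then restore it as a bare repository.
git bundle verify "$work/all.bundle"
git clone -q --bare "$work/all.bundle" "$work/restored.git"
git --git-dir="$work/restored.git" for-each-ref --format='%(refname)'
```

The bundle file can be copied anywhere; fetching from it works like fetching from any other remote.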
VonC
nb: your current bundle will only bundle parents of the commit, you'd probably need to specify --all to get a complete bundle of *everything* ( branches that are descendant of master ). git bundle create /tmp/foo master ; git bundle create /tmp/foo-all --all ; git bundle list-heads /tmp/foo ; git bundle list-heads /tmp/foo-all . Small, but significant.
Kent Fredric
+3  A: 

but as far as I can tell, that doesn't allow for an easy update (I'd have to delete and recreate my local backup).

Not sure what you mean by that; updating it should be as simple as:

git fetch

git clone, as it is, is supposed to fetch all refs/commits that are visible on the remote.

git clone --mirror is also not very different from git clone --bare [source]

the only relevant difference is that it is shorthand for git remote add --mirror

( See git help remote for the different behaviour )

If you're really worried, you can do this:

git clone --no-hardlinks --mirror $original $dest

This only makes a difference if source and destination are on the same filesystem anyway.

And if you're really paranoid, you can tar.(gz|bz2) the whole directory and back that up.
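For systems where git clone --mirror is not available, roughly the same setup can be built by hand from init, config, and fetch. The following is a sketch under assumptions: a local scratch repository stands in for the GitHub URL, the names `upstream`, `backup.git`, and `feature` are invented for the demo, and it has not been checked against any particular old git release.

```shell
#!/bin/sh
# Sketch: a bare backup that mirrors every ref and updates incrementally,
# without relying on "git clone --mirror". Paths and names are hypothetical.
set -e
work=$(mktemp -d)

# Stand-in for the repository hosted on GitHub.
git init -q "$work/upstream"
cd "$work/upstream"
git symbolic-ref HEAD refs/heads/master
git config user.name "Backup Demo"
git config user.email "demo@example.invalid"
echo one > file.txt
git add file.txt
git commit -q -m "first commit"

# Create the bare backup and map every remote ref onto the same local name.
git init -q --bare "$work/backup.git"
cd "$work/backup.git"
git config remote.origin.url "$work/upstream"
git config remote.origin.fetch '+refs/*:refs/*'
git fetch -q origin

# Later, new commits and a new branch appear upstream ...
cd "$work/upstream"
echo two >> file.txt
git commit -q -a -m "second commit"
git branch feature

# ... and the backup catches up with a plain fetch.
cd "$work/backup.git"
git fetch -q origin
git for-each-ref --format='%(refname)'
```

Adding --prune to the fetch would also delete refs that were removed upstream, which may or may not be what you want from a backup.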

Kent Fredric
A: 

What you are asking is quite difficult to do within the constraints of git. The problem is that neither cloning nor fetching will give you all the branches by default. See this question:

For an example of cloning a repo with multiple branches, here is a transcript:

% git clone -o tufts linux.cs.tufts.edu:/r/ghc/git/experimental.git
Initialized empty Git repository in /usr/local/nr/git/ghc/experimental/.git/
% cd experimental/
% git fetch
% git branch -a
* head
  tufts/HEAD
  tufts/experimental
  tufts/head
  tufts/norman
% git branch --track experimental tufts/experimental
Branch experimental set up to track remote branch refs/remotes/tufts/experimental.
% git branch --track norman tufts/norman
   ...

You can see that cloning each branch programmatically is going to be a little tricky.
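As a rough illustration of what such automation could look like, here is a sketch that creates a local tracking branch for every remote-tracking branch in a regular (non-bare) clone. The scratch repository and the remote name `origin` are assumptions made for the demo; the branch names are borrowed from the transcript above.

```shell
#!/bin/sh
# Sketch: set up a local tracking branch for each branch of the remote.
# The upstream repository here is a throwaway stand-in.
set -e
work=$(mktemp -d)

# Upstream with several branches.
git init -q "$work/upstream"
cd "$work/upstream"
git symbolic-ref HEAD refs/heads/master
git config user.name "Backup Demo"
git config user.email "demo@example.invalid"
echo one > file.txt
git add file.txt
git commit -q -m "first commit"
git branch experimental
git branch norman

git clone -q "$work/upstream" "$work/clone"
cd "$work/clone"

# Walk the remote-tracking refs and create any local branch that is missing.
git for-each-ref --format='%(refname)' refs/remotes/origin |
while read -r ref; do
    name=${ref#refs/remotes/origin/}
    if [ "$name" = HEAD ]; then
        continue    # origin/HEAD is a pointer, not a branch
    fi
    if git show-ref --verify --quiet "refs/heads/$name"; then
        continue    # already have a local branch of that name
    fi
    git branch -q --track "$name" "origin/$name"
done

git branch
```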

If github provides access to rsync or Unison these are better tools for the job. Otherwise, you'll have to write some scary scripts...

Norman Ramsey
Note that it's easy to automate the tracking. Suppose you want to track all branches from origin. In bash: `git branch -r | grep "^ *origin[^ ]*$" | while read remote_branch; do branch=${remote_branch#*/}; git branch --track $branch $remote_branch; done`
Jefromi
@Jefromi: that's essentially what git remote add --mirror does :)
Kent Fredric
A: 

I wrote a ruby script with the help of some others:

http://github.com/walterjwhite/project.configuration/blob/master/scripts/github.com.backup.ruby

That script allows me to download all of my repositories. I use it to periodically make a backup of the projects I am working on.

I hope this helps; feel free to tweak it. I think it has a bug: occasionally GitHub will time out, and the script doesn't handle that.
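One way a timeout like that could be handled is to retry the failing command a few times. The following is not taken from the Ruby script; it is a standalone shell sketch, and the `retry` and `flaky` functions are hypothetical helpers written for this example.

```shell
#!/bin/sh
# Sketch: retry a command a limited number of times before giving up.
# Usage: retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry() {
    max=$1
    delay=$2
    shift 2
    attempt=1
    while :; do
        if "$@"; then
            return 0
        fi
        if [ "$attempt" -ge "$max" ]; then
            echo "giving up after $attempt attempts: $*" >&2
            return 1
        fi
        echo "attempt $attempt failed, retrying in ${delay}s" >&2
        sleep "$delay"
        attempt=$((attempt + 1))
    done
}

# Demo: a stand-in command that fails twice and then succeeds.
counter=$(mktemp)
echo 0 > "$counter"
flaky() {
    n=$(($(cat "$counter") + 1))
    echo "$n" > "$counter"
    [ "$n" -ge 3 ]
}

retry 5 0 flaky && echo "succeeded after $(cat "$counter") attempts"
```

In a backup job the wrapped command would be the fetch itself, for example `retry 5 30 git fetch origin` run inside the backup repository.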

Walter

Also, I'd like to submit that back to github as a feature enhancement. It'd be nice to have a download button where you can download all of your repositories easily. I need some more support to make that happen.