
To back up a git repo, is there any reason why I couldn't just run a cron job like this?

/usr/bin/tar -Pzcf git_backup.tar.gz repo.git && /usr/bin/scp git_backup.tar.gz me@other-server:/home/backup

If something happened to all other copies, I could use the most recent backup: just tar -xzf into its original place, then clone, push, pull, etc. Seems like it should be okay, but I'm not 100% sure. Note: I've seen other answers involving git clone or using --mirror, but this seems more straightforward. Those are still options if the answers indicate they'd be better.
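To spell out the recovery path I'm describing (same file names as the cron line above; an untested sketch):

```shell
# Pull the latest tarball back and unpack it where the repo lived.
scp me@other-server:/home/backup/git_backup.tar.gz .
tar -xzf git_backup.tar.gz        # recreates repo.git in place
git clone repo.git sanity-check   # confirm it still clones cleanly
```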

---------------- EDIT -----------------

Here's the script I ended up creating:

#!/usr/bin/php -q
<?php

/**
 * Backup git on this box and copy it around
 *
 * cron:
 * 1 2 * * * /usr/bin/php /home/sysadmin/files/shared/git_backup.php REPO 2> /dev/null
 *
 * @package scripts
 * @author Hans Anderson <handerson@>
 */

list ( $dir, ) = explode ( '/files/', __FILE__ );
require_once ( "{$dir}/files/bootstrap.php" );
$email      = $cfg['GIT_BACKUP']['email_errors_to'];
$copy_hosts = explode(',', $cfg['GIT_BACKUP']['hosts']);

if ( !isset ($argv[1]) ) exit;

$repo = $argv[1];
$date = date ( 'Y-m-d-H' );
$user = trim ( `whoami` );

$repf = "/newdisk/git/{$repo}.git";
$bndl = "/newdisk/backup/{$repo}/git/git-{$repo}-{$date}.bndl";

chdir($repf);

$exec = "/usr/bin/git bundle create $bndl --all";
exec ( $exec, $output, $return ); // $output collects stdout lines; $return is the exit code

if ( $return != 0 ) // non-zero exit means the bundle failed
{
    mail ( $email, "{$user} GIT Backup Failure [{$repo}]!", __FILE__ . "\nFailure to dump.\nCmd: $exec\nError: " . implode ( "\n", $output ) );
}

foreach ( $copy_hosts as $host )
{
    $output = array(); // exec() appends to its output array, so reset it for each host
    $exec = "/usr/bin/scp -q {$bndl} sysadmin@{$host}:/home/sysadmin/data/backup/$repo/git";
    exec ( $exec, $output, $return );

    if ( $return != 0 )
    {
        mail ( $email, "{$user} GIT Backup Failure [{$repo}]!", __FILE__ . "\nFailure to copy to {$host}.\nCmd: $exec\nError: " . implode ( "\n", $output ) . "\n\nReturn: {$return}" );
    }
}
+1  A: 

Only reason I can think of is that you would also be backing up all your binaries, object files, and so on. You probably only want the repository, especially if you are backing up remotely.

If it were me, I'd do the git-clone; that way the backup is smaller, and moving it will be faster.

Git is designed around distributed repos, so you don't have the SVN problem where a corrupted central repo is a headache to restore (if it's possible at all). Just git-clone backups all over the place :-)

@user424448: Do you mean a 'git remote add alias.bak "push --mirror ..."' type thing or clone it on the backup server and pull? Thanks.
Hans
@Hans: Probably slightly easier with a `push --mirror`, but you could also fetch with the refspec set to refs/* instead of refs/heads/* - either way as long as you get everything.
Jefromi
A: 

Yes, this will work just fine. The only possible problem is if the cron job runs while changes are being made to the repo (e.g., via git push or commit). (Native git commands use lock files to make sure things are always in a sane state.)

In fact, a more efficient approach would be to use rsync, so you only send the new objects over the wire -- no need for the cost and space of making the tarball.

Either way, this approach has some benefits over using clone or mirror because config and meta files will also be backed up (e.g., .git/config, .git/info/exclude, .git/hooks/* and the reflog -- which is really useful).

Pat Notz
Good point ... I'm going to want to back up the hooks for sure. I figured those would be part of the repository, but I guess it's better this way.
Hans
+2  A: 
  • One major rule of backup: never back up something while it might still be changing.
  • One minor rule: produce as few files as possible; transferring the backup anywhere else is then simpler (fewer files to copy).

The one command which can respect those two rules: git bundle (see also this SO answer)
With the added bonus of:

  • incremental backup (meaning the process is quicker than a full tar).
  • only one file as a result.

The resulting single file (from a bundle) doesn't even need to be uncompressed to be reused: it is a Git repo in its own right.
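A minimal sketch of the bundle round trip (the file names here are illustrative):

```shell
# Run inside the repository: capture every ref in one file...
git bundle create /tmp/repo-backup.bndl --all
# ...and later restore by cloning straight from that file,
# with no extraction step needed.
git clone /tmp/repo-backup.bndl restored-repo
```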

VonC