views: 606

answers: 2

Hi,

Some of our projects are still on CVS. We currently use tar to back up the repository nightly.

Here's the question: what's the best practice for backing up a CVS repository?

Context: We're consolidating several servers across the country onto one central server. The combined repository size is 14 GB (yes, this is high, most likely due to lots of binary files, many branches, and the age of the repositories).

A 'straight tar' of the CVS repository yields a ~5 GB .tar.gz file. Restoring individual files from a 5 GB tarball is unwieldy, and we fill up tapes quickly.

How well does a full-and-incremental backup approach work, i.e. a weekly full backup plus nightly incrementals? What open source tools solve this problem well (e.g. Amanda, Bacula)?
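For what it's worth, here's roughly the schedule I'm picturing, using GNU tar's --listed-incremental snapshot file (the paths below are made up):

    # NOTE: /backup and /var/cvsroot are placeholder paths
    # weekly full backup: start a fresh snapshot file
    rm -f /backup/cvs.snar
    tar --listed-incremental=/backup/cvs.snar -czf /backup/cvs-full.tar.gz /var/cvsroot

    # nightly incremental: only files changed since the previous run
    tar --listed-incremental=/backup/cvs.snar -czf /backup/cvs-incr-$(date +%a).tar.gz /var/cvsroot

(The snapshot file is what tar uses to decide what has changed since the last run.)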

thanks,

bill

+3  A: 

You can use rsync to create a backup copy of your repo on another machine if you don't need a history of backups. rsync works incrementally, so bandwidth is consumed only for changed files.
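A minimal nightly cron job could be as simple as this (hostname and paths are just placeholders):

    # backuphost and both paths are placeholders
    # mirror the repo to the backup host; only changed files cross the wire
    rsync -az --delete /var/cvsroot/ backuphost:/backup/cvsroot/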

I don't think you need a full history of backups, as the VCS provides its own history management and you need backups ONLY as a failure-protection measure.

Moreover, if you worry about the consistency of the backed-up repository, you MAY want to use filesystem snapshots, e.g. LVM can produce them on Linux. As far as I know, ZFS from Solaris also has a snapshot feature.
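Roughly like this with LVM (the volume group, LV and mount point names are assumptions):

    # vg0/cvs and the mount points are placeholders
    # create a short-lived, read-only view of the volume holding the repo
    lvcreate --snapshot --size 1G --name cvs-snap /dev/vg0/cvs
    mount -o ro /dev/vg0/cvs-snap /mnt/cvs-snap

    # back up from the frozen snapshot instead of the live filesystem
    rsync -az --delete /mnt/cvs-snap/ backuphost:/backup/cvsroot/

    umount /mnt/cvs-snap
    lvremove -f /dev/vg0/cvs-snap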

The only case where you don't need snapshots is when you run the backup procedure late at night, no one touches your repo, and your VCS daemon is stopped during the backup :-)

darkk
+1 for rsync, which I use in combination with ZFS (on the rsync target) for backing up all my files. I'd say, versioned FS or not, try to ensure that nothing is accessing the repository while the sync runs.
Hanno Fietz
+1  A: 

As Darkk mentioned, rsync makes for good backups since only changed things are copied. Dirvish is a nice backup system based on rsync. Backups run quickly, restores are extremely simple since all you have to do is copy files back, and multiple versions of the backups are stored efficiently.
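Under the hood it's essentially rsync with --link-dest: unchanged files are hard-linked against the previous night's image, so every image looks like a full copy but only changed files take up new space. A hand-rolled sketch of the same idea (Dirvish generates the equivalent for you; the paths are invented):

    # /var/cvsroot and /backup/cvsroot are placeholder paths
    yesterday=$(date -d yesterday +%F)
    today=$(date +%F)

    # unchanged files become hard links into yesterday's image
    rsync -az --delete \
        --link-dest=/backup/cvsroot/$yesterday \
        /var/cvsroot/ /backup/cvsroot/$today/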

Zoredache