views: 1691
answers: 4

I have a very large (~6GB) SVN repository, for which I've written a batch file script to do an incremental backup each day. The script checks when the last backup was run and dumps only the revisions since then.

The files are named backup-{lower_revision}-{higher_revision}.svn, e.g. backup-156-162.svn, backup-163-170.svn.

This means that I have quite a lot of small dump files, which I guess is fine (better than a lot of 6GB dump files), but I'm a little bit worried about how much work it would be to restore from these backups should I need to.

To reduce the total number of files, I've taken to doing a full dump on the first of each month, but still, should I need to restore on the 30th, that's going to be 30 dump files, which could take a while.

What I have been considering is:

  • Manual:
    svnadmin load c:\myRepo < backup-1-10.svn
    wait
    svnadmin load c:\myRepo < backup-11-24.svn
    wait
    etc...
  • Batch file to make the above process a bit less tedious (something like the sketch after this list)
  • Appending each of the files together and doing one load (if that is even possible?)
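
For the batch file option, something like this untested sketch is what I have in mind. Note it assumes the file names sort lexicographically in revision order, which names like backup-9-12.svn vs backup-100-110.svn would break unless the numbers are zero-padded:

@echo off
:: restore.bat (sketch, untested) -- load every dump file in name order
for /f "delims=" %%f in ('dir /b /o:n backup-*.svn') do (
  echo Loading %%f ...
  svnadmin load c:\myRepo < %%f || goto :failed
)
echo Restore complete.
goto :eof

:failed
echo Load failed -- stopping so later dumps aren't applied out of order.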

What would be the best way to work with these files, should I need to restore?

PS: the OS is Windows

+1  A: 

Regardless of the solution you come up with, I would definitely recommend doing a trial restore. That way you can verify that the process does what you really want it to, and that you will be able to complete it successfully when you need to use it in anger.

I would try your process as you have it right now, and if the process is tolerable as is, then simpler is better and don't mess with it. If it seems like a lot of work, then by all means look for optimisation opportunities.

Greg Hewgill
well the problem is that I don't actually have a process yet - I'm trying to start with the best :)
nickf
Right, you describe how your backup files are created (and that you have a lot of them), so you must have the process at least half working.
Greg Hewgill
+2  A: 

You should rename your files by just numbering the day [01, 02, ... 31] so they can easily be sorted. For the dump it is not important to know which revisions are inside.

I follow a different approach, because loading back a huge repo like this takes some time, so you should consider the following:
You can use svnadmin hotcopy to hot-copy the repository every week/every month. Each day you make an incremental dump to pick up the latest revisions. To find the revision numbers, you just have to call:

svnadmin youngest [live_repo] -> gives you the most current revision of your live repository

svnadmin youngest [copied_repo] -> gives you the last revision you backed up with the weekly hotcopy

Now you can run a dump from your live repo using both revision numbers (see the sketch after the list below).
Advantages:

  • much faster to get your backed up repository up and running again(dumping takes hours!)
  • less dump files
  • less scripting effort
  • extendable to "per-commit"-backups via post-commit-hook, so you will nerver lose any revision
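
A minimal sketch of this scheme as a daily batch job (repo paths here are placeholders, and note that the revision query command is actually svnlook youngest, as pointed out in a later answer):

@echo off
:: daily-backup.bat -- sketch only; both repository paths are placeholders
set LIVE=c:\svn\myRepo
set COPY=c:\backup\myRepo-hotcopy

:: head revision of the live repo and of the last hotcopy
for /f %%r in ('svnlook youngest %LIVE%') do set HEAD=%%r
for /f %%r in ('svnlook youngest %COPY%') do set LAST=%%r

:: dump only the revisions the hotcopy doesn't have yet
set /a FIRST=LAST+1
if %FIRST% LEQ %HEAD% (
  svnadmin dump %LIVE% -r %FIRST%:%HEAD% --incremental > c:\backup\daily-%FIRST%-%HEAD%.svn
)

To restore, you start from the hotcopy (which is already a working repository) and load only the newest dump file, instead of replaying a month of dumps.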
Peter Parker
ahh that's not bad!
nickf
A: 

I would suggest running a full dump every day and just keeping the last 5 dumps. At ~6GB each, that's 30 gigs for you.

Here's the script I use to run automated dumps, though I delete old backups manually:

::This script backs up the subversion repository.

::Reset temp backup storage
rmdir /S /Q C:\SVNBACKUP
mkdir C:\SVNBACKUP

::Dump the full repository into the temp folder
svnadmin dump C:\svn\myProj1 > C:\SVNBACKUP\myProj1Bak

::Parse "date /t" output (e.g. "Mon 06/15/2009") into month/day/year
for /f "tokens=2-4 delims=/ " %%g in ('date /t') do (
  set mm=%%g
  set dd=%%h
  set yy=%%i
)

::Replace today's backup folder on the network share if it already exists
if exist "\\networkdrive\Clients\SVN\%mm%-%dd%-%yy%" (
  rd /S /Q "\\networkdrive\Clients\SVN\%mm%-%dd%-%yy%"
)

xcopy "C:\SVNBACKUP\myProj1Bak" "\\networkdrive\Clients\SVN\%mm%-%dd%-%yy%" /s /i
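
To avoid deleting old backups by hand, something like this could prune anything older than five days (a sketch, assuming forfiles is available and accepts the UNC path; if not, map a drive letter with net use first):

::Prune dated backup folders older than 5 days (sketch, untested)
forfiles /P "\\networkdrive\Clients\SVN" /D -5 /C "cmd /c if @isdir==TRUE rd /S /Q @path"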
A: 

Peter's commands are actually: *svnlook youngest [live_repo]* and *svnlook youngest [copied_repo]*

PuO2