Backing up a Subversion repository using svnsync

Why sync an SVN repository?

In this article, I will discuss how to back up a remote Subversion repository to a local one. The use case is a (dedicated) backup server that retrieves a full copy of a repository at scheduled times and initiates the backup on its own.

Simply checking out the code on another machine does not give you the full history: a Subversion checkout only contains the latest version of each file.

A full backup can of course be made by rsyncing or rdiff-backup’ing the SVN server’s repository directories, but with svnsync all you need is Subversion access to any network-reachable repository.

On automatically syncing to a remote backup server on-commit

Another way which gives you instant backups is to run svnsync from the source server in a commit hook.

There are a few drawbacks to this: you need administrative access to the source repository to install a commit hook, you need to make your backup server’s repository available over http or ssh from the source server, and you may encounter some issues committing in case the sync fails (server down?).

I have not personally tested this, but you can likely find some useful hints on how to do this on Not Really a blog.
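As a rough, untested sketch of what such a hook could look like: the function below would go into hooks/post-commit on the source server and push new revisions to the backup repository in the background, so that a slow or unreachable backup server does not hold up the commit. The function name push_to_backup and the backup URL in the usage comment are my own placeholders, and the backup repository is assumed to have been initialized with svnsync init already.

```shell
#!/bin/sh
# Sketch of a post-commit hook body for the SOURCE repository (untested assumption).

# push_to_backup BACKUP_URL
# Starts svnsync in the background and detaches its output, so the
# committing client does not have to wait for the sync to finish.
push_to_backup() {
    backup_url="$1"
    svnsync synchronize --non-interactive \
        --sync-username svnsync "$backup_url" >/dev/null 2>&1 &
}

# In the real hook you would end the file with a call such as
# (hypothetical URL):
# push_to_backup "svn+ssh://backup.example.com/home/subversion/backups/svn.example.com"
```

Because the sync runs detached, a failed sync no longer gets in the way of commits, but it also goes unnoticed unless you log it somewhere.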

Prerequisites

All we need for this is the subversion package on the backup server. Instructions for Debian and derivatives:

apt-get install subversion

I’ll assume you have a working Subversion repository reachable over the network already.

Initializing the backup repository

On your backup server, create a new repository:

svnadmin create /home/subversion/backups/svn.example.com

We need to allow revision property (revprop) changes on this repository, so we can correctly sync all revision attributes (committer, time stamp, etc.), which are normally read-only, from the remote server. Create a pre-revprop-change hook at /home/subversion/backups/svn.example.com/hooks/pre-revprop-change:

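#!/bin/sh
# Subversion calls this hook with: REPOS-PATH REVISION USER PROPNAME ACTION
USER="$3"

# Allow revprop changes only for the user svnsync authenticates as.
if [ "$USER" = "svnsync" ]; then exit 0; fi

echo "Only the svnsync user can change revprops" >&2
exit 1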

This script allows revprop changes for the svnsync user only; by default, when no such hook is installed, all revprop changes are denied.

The ‘svnsync’ user mentioned in the script is the one used below when running the synchronization. It doesn’t need to exist on the remote repository. Make sure to chmod +x this file, otherwise Subversion cannot run the hook and the synchronization will fail.
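To convince yourself the hook behaves as intended before involving svnsync, you can call it by hand with the same five arguments Subversion passes (repository path, revision, user, property name, action). The sketch below writes a throwaway copy of the script to a temporary directory; the user name alice and the path /tmp/repo are made-up values for illustration only.

```shell
#!/bin/sh
# Sketch: exercise the hook logic outside of Subversion, using a throwaway copy.
HOOK_DIR=$(mktemp -d)
HOOK="$HOOK_DIR/pre-revprop-change"

cat > "$HOOK" <<'EOF'
#!/bin/sh
USER="$3"
if [ "$USER" = "svnsync" ]; then exit 0; fi
echo "Only the svnsync user can change revprops" >&2
exit 1
EOF
chmod +x "$HOOK"

# Arguments: REPOS-PATH REVISION USER PROPNAME ACTION (M = modify)
if "$HOOK" /tmp/repo 1 svnsync svn:log M; then
    allowed_svnsync=yes
else
    allowed_svnsync=no
fi

if "$HOOK" /tmp/repo 1 alice svn:log M 2>/dev/null; then
    allowed_other=yes
else
    allowed_other=no
fi

echo "svnsync: $allowed_svnsync, alice: $allowed_other"
rm -rf "$HOOK_DIR"
```

The svnsync user should be let through and everyone else rejected. To test the real hook, point HOOK at the file in your backup repository instead of writing a copy.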

Preparing the repository to be backed up

As mentioned in the intro, we don’t need to do anything on the backup source. However, it may be advisable to create a separate user for the synchronization. I see two valid reasons for this. First, you can make it a read-only user, while your regular user possibly has read-write access. Second, as you’re likely creating a cron job for this sync, you’ll need to hardcode the password somewhere, so it is best not to use the one from your regular Subversion account.

Running the synchronization

To initialize the sync, run the following command. The last parameter is the remote SVN URL:

svnsync init --source-username SOURCE_USER --source-password SOURCE_PASSWORD --sync-username svnsync file:///home/subversion/backups/svn.example.com http://svn.example.com/project/trunk

Yes, the destination of the sync comes before the source, so don’t swap them! There is no : between the server hostname and the remote path. You could also use Subversion access mechanisms other than http, e.g. svn+ssh:// or file://. The --sync-username parameter should be the user you granted revprop access in the hook above.
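To check that the initialization took effect, you can look at the bookkeeping svnsync keeps as revision properties on revision 0 of the backup repository. A small sketch; show_sync_state is a helper name I made up:

```shell
#!/bin/sh
# Sketch: print the sync bookkeeping stored on revision 0 of the mirror.

# show_sync_state DEST_URL
show_sync_state() {
    dest_url="$1"
    for prop in svn:sync-from-url svn:sync-from-uuid svn:sync-last-merged-rev; do
        printf '%s = ' "$prop"
        # Read an unversioned revision property from revision 0.
        svn propget "$prop" --revprop -r 0 "$dest_url"
    done
}

# Usage:
# show_sync_state file:///home/subversion/backups/svn.example.com
```

If svn:sync-from-url shows the remote URL you passed to svnsync init, the backup repository is ready for its first synchronization.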

Now, perform a commit on the source repository (or wait for someone else to commit something), then run the synchronization command:

svnsync synchronize --source-username SOURCE_USER --source-password SOURCE_PASSWORD --sync-username svnsync file:///home/subversion/backups/svn.example.com

The change should be synced to the backup repository. As we already initialized the sync, we don’t need to specify the source repository any more.

If this works to your satisfaction, you can add the full command to cron. Run crontab -e and add:

0 * * * * /usr/bin/svnsync synchronize --source-username SOURCE_USER --source-password SOURCE_PASSWORD --sync-username svnsync file:///home/subversion/backups/svn.example.com

The above snippet will run the synchronization every hour.
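If you would rather not keep the password in the crontab itself, a small wrapper script can read the credentials from a file and skip a run while the previous one is still busy. This is only a sketch under my own assumptions: the function name run_sync, the credentials file (two lines: username, then password) and the lock directory are all hypothetical.

```shell
#!/bin/sh
# Sketch of a cron wrapper around svnsync (names and paths are assumptions).

# run_sync DEST_URL CRED_FILE LOCK_DIR
# CRED_FILE holds two lines: the source username, then the source password.
run_sync() {
    dest_url="$1"
    cred_file="$2"
    lock_dir="$3"

    # mkdir either creates the directory or fails, so it works as a simple lock.
    if ! mkdir "$lock_dir" 2>/dev/null; then
        echo "previous sync still running, skipping this run" >&2
        return 0
    fi

    { read -r src_user; read -r src_pass; } < "$cred_file"

    svnsync synchronize --non-interactive \
        --source-username "$src_user" --source-password "$src_pass" \
        --sync-username svnsync "$dest_url"
    status=$?

    rmdir "$lock_dir"
    return "$status"
}

# Hypothetical call at the end of the wrapper script:
# run_sync file:///home/subversion/backups/svn.example.com \
#     /home/subversion/.svnsync-credentials /tmp/svnsync-backup.lock
```

Make the credentials file readable by the backup user only (chmod 600). Note that the password is still visible in the process list while svnsync runs, and that a run killed halfway leaves the lock directory behind, which you would then have to remove by hand.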

Conclusion

The backup server now automatically fetches a copy of the Subversion repository on the remote server. In case the remote server blows up or mistakes are made, the source code including all history is safely stored off-site.

Alternatives

Apart from the rsync/rdiff-backup method mentioned in the introduction, which syncs the entire repository directory, you could use “svnadmin dump” on the source and “svnadmin load” on the destination. This means you have to copy the dump created by the former command to the backup machine first, necessitating some SSH keys or similar if you want to do it from a cron job.
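A variation on this that avoids the intermediate dump file is to pipe the dump through ssh straight into svnadmin load, started from the backup machine. The sketch below assumes key-based SSH access and shell access on the source server; the function name pull_dump, the SSH account and the repository paths in the usage comment are hypothetical. The local repository must be a freshly created, empty one, since svnadmin load appends the loaded revisions to whatever is already there.

```shell
#!/bin/sh
# Sketch: stream a full dump from the source server into an EMPTY local repository.

# pull_dump SSH_TARGET REMOTE_REPO_PATH LOCAL_REPO_PATH
pull_dump() {
    ssh_target="$1"
    remote_repo="$2"
    local_repo="$3"
    # The dump runs on the source server; its output is loaded locally.
    ssh "$ssh_target" "svnadmin dump --quiet '$remote_repo'" \
        | svnadmin load --quiet "$local_repo"
}

# Hypothetical usage:
# svnadmin create /home/subversion/backups/svn.example.com-dump
# pull_dump backup@svn.example.com /var/svn/project \
#     /home/subversion/backups/svn.example.com-dump
```

Unlike svnsync, this transfers the complete history on every run, so it is better suited to small repositories or occasional full snapshots.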
