Subversion 1.5 isn’t out yet — and it won’t be out for a while — but there’s a neat little new feature that has the potential to make repository administration for FSFS repositories a little bit easier, especially when it comes to backups.
The new feature is that…
svnadmin recover now does something for FSFS repositories.
If you haven’t come across it,
svnadmin recover is one of those little
oddities that was invented for Berkeley DB repositories1.
It literally is nothing more than a wrapper around BDB’s
functionality, which performs ‘normal’ recovery of the database after an
unclean shutdown: effectively nothing more than a journal replay. In the past
it was necessary when BDB databases got ‘wedged’, and it’s still needed if
you want to change some of the options in the
Subversion 1.1 introduced FSFS, the filesystem-based Subversion filesystem, and Subversion 1.2 made it the default option, so any repositories created with Subversion 1.2 or later will be in FSFS format. FSFS has some interesting properties, one of which is that the locking model is very simple: the filesystem is always in a consistent state for readers, so writers only block other writers. This property means that “recovery” in the BDB sense isn’t necessary for FSFS.
But recovery also has another use: it can be used to fix up some types
of missing data. In fact, the
svnadmin hotcopy mechanism uses Berkeley
DB’s catastrophic recovery to create a set of empty log files after
copying the database to a new location. And it’s in that context that
we’ve implemented recovery for FSFS.
So, another interesting property of FSFS is that all revision files are immutable. This makes operations people very happy, because they can just back up the whole repository using a simple incremental backup strategy… almost.
The sticking point is a little file called
db/current, which stores
a very small amount of information about the filesystem — the largest
revision number and the next unique node and copy ids.
Due to the way that FSFS makes sure that readers always see a consistent
state, the current file is the last thing to be updated. This means
that the backup may not be consistent if you follow a naïve strategy
of just backing up the files in any old order: by the time db/current
is backed up, it might be pointing to a revision that doesn’t exist in
your backup copy, or worse, one that has only been partly written.
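If you want to know whether a backup has been bitten by this, a rough check along the following lines might help. It's only a sketch: the helper names are made up for illustration, and it assumes that the first whitespace-separated field of db/current is the youngest revision and that each revision lives in a file named db/revs/<N> (a flat layout, which may not match your repository format).

```python
from pathlib import Path
from typing import List


def youngest_in_current(repo: Path) -> int:
    """Return the revision number recorded in db/current."""
    # Assumption: the first whitespace-separated field is the youngest
    # revision; any further fields (node and copy ids) are ignored here.
    fields = (repo / "db" / "current").read_text().split()
    return int(fields[0])


def missing_revisions(repo: Path) -> List[int]:
    """List revisions up to the one in db/current that have no file."""
    # Assumption: flat layout, one file per revision at db/revs/<N>.
    youngest = youngest_in_current(repo)
    revs = repo / "db" / "revs"
    return [n for n in range(youngest + 1) if not (revs / str(n)).is_file()]
```

An empty list means every revision that db/current knows about is at least present in the copy. It can't tell you whether a revision file was only partly written, so treat it as a smoke test rather than a guarantee.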
The easy solution to the ordering problem is to make sure to back up the
current file first — that way, you’ll always get a consistent copy
of the filesystem up to whichever revision was current at that point.
However, this can be quite awkward in some cases, especially if you have
a lot of repositories or an inflexible backup solution.
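For a single repository, the current-first ordering could look something like the sketch below. The function name is hypothetical, and it's a plain full copy rather than an incremental one; the only point is that db/current is snapshotted before anything else and never overwritten afterwards.

```python
import shutil
from pathlib import Path


def backup_fsfs(repo: Path, dest: Path) -> None:
    """Copy a repository directory, taking db/current before the rest."""
    current = Path("db") / "current"
    (dest / "db").mkdir(parents=True, exist_ok=True)
    # Copy db/current first: it is the last file a commit updates, so the
    # revision it names should already be complete on disk.
    shutil.copy2(repo / current, dest / current)

    for src in sorted(repo.rglob("*")):
        rel = src.relative_to(repo)
        if rel == current:
            continue  # keep the early snapshot, not a newer value
        target = dest / rel
        if src.is_dir():
            target.mkdir(parents=True, exist_ok=True)
        elif src.is_file():
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, target)
```

On a busy repository, files can appear or vanish while the loop runs, so a real script would need to tolerate errors from files that disappear mid-copy.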
There’s another problem if you’re trying to implement a disaster recovery solution using a warm standby. Typically, you’ll do this by copying the revision files as they appear on the disk, either by using a post-commit trigger, or direct support from the storage device.
If you have a very high commit throughput, or very large files (or both), you might find that some revisions take a lot longer to copy than others, so you may decide that it’d be worthwhile to run the copies in parallel. Again, this might be something that your storage device supports natively.
But now you’ve got a really big problem: there’s no synchronisation
point at which it’s safe to copy
current. The best you can do is
copy it just before you kick off the copy for each revision, but now you
have all your parallel copying jobs serialising against the same file,
which is hardly ideal.
Anyway, I think you can see where this is going. As of Subversion 1.5,
svnadmin recover will recreate the
db/current file in an FSFS
repository from the existing revision files, so if you fit into one of
the scenarios above, you won’t have to worry about your backups quite
so much — just run recovery after you restore.
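If your restores are scripted, the recovery step can be bolted on at the end. Here's a minimal sketch, assuming svnadmin is on the PATH and the repository has already been restored to the given directory; the function names and the injectable run parameter are illustrative only (the latter just makes the wrapper easy to test without a real repository).

```python
import subprocess
from pathlib import Path
from typing import Callable, List


def recover_command(repo: Path) -> List[str]:
    """Build the svnadmin invocation for a restored repository."""
    return ["svnadmin", "recover", str(repo)]


def recover_after_restore(
    repo: Path, run: Callable[..., object] = subprocess.run
) -> None:
    """Run recovery on a freshly restored repository."""
    # check=True makes subprocess.run raise CalledProcessError if
    # svnadmin exits with a non-zero status.
    run(recover_command(repo), check=True)
```

Calling recover_after_restore(Path("/path/to/restored/repo")) as the last step of a restore script would then run the recovery for you.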
Of course, if you can back up
db/current in the correct way, I’d
recommend you continue to do so. By definition, recovery is not a
fast process: it has to read through all the revision files in the
repository to find out what the next unique node and copy ids are,
and that can take quite a while.
Update: I had a chance to test the speed on the ASF’s repository — using my under-powered fileserver, recovery completed in just over an hour, for just over half a million revisions. I guess I over-estimated the amount of time it’d take.