by Malcolm Rowe

mod_dav_svn improvements: SVNActivitiesDB

Submerged, the CollabNet Subversion blog, just posted a nice article describing the new merge-tracking features that will be included in Subversion 1.5, so I thought I’d post[1] some more about one of the other ‘new in 1.5’ (not yet out) features (and there are a lot to choose from!).

I’ve already mentioned some improvements we’ve made to FSFS (and a little bit about svnadmin), but those are both behind-the-scenes changes, so this time I thought I’d cover one of the new Apache configuration directives for mod_dav_svn — one that allows administrators to control the location of the activities database.

mod_dav_svn provides a bridge between the WebDAV world, which works in terms of ‘activities’ and ‘version resources’, and the Subversion world, which works with ‘transactions’ and ‘nodes’. One of the many differences between these worlds is the way an activity (or transaction) is named: in WebDAV, the client names the activity, while in Subversion, the server names the transaction[2].

So one of the things mod_dav_svn needs to track is the relationship between WebDAV activity names and the corresponding Subversion transactions. Before 1.5, the repository contained a file called dav/activities which stored this mapping, implemented using APR-util’s simple database (dbm) interface. One problem with this approach was that the on-disk format depended on which of the many supported dbm formats was chosen as the default when APR-util was compiled. This made it hard to reason about the safety of storing a Subversion repository on NFS, for example.

For 1.5, we switched to a simpler scheme where we create one file for each activity (storing it under the directory dav/activities.d/); each file contains the name of the corresponding Subversion transaction. This is much easier to reason about, and shouldn’t cause any scaling problems, since there are typically only a small number of transactions open at any one time.
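To make the one-file-per-activity idea concrete, here is a minimal Python sketch. It is not mod_dav_svn’s actual code (which is written in C): the function names are hypothetical, and hashing the client-chosen activity name to get a safe filename is my assumption for illustration, not something described above.

```python
import hashlib
import os
import tempfile


def _activity_path(db_dir, activity_id):
    # Assumption: hash the client-supplied name so that it is always
    # safe to use as a filename, whatever characters the client chose.
    digest = hashlib.sha256(activity_id.encode("utf-8")).hexdigest()
    return os.path.join(db_dir, digest)


def store_activity(db_dir, activity_id, txn_name):
    """Record that a WebDAV activity maps to a Subversion transaction."""
    os.makedirs(db_dir, exist_ok=True)
    # Write to a temporary file in the same directory, then rename it
    # into place, so a reader never sees a half-written file.
    fd, tmp_path = tempfile.mkstemp(dir=db_dir)
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        f.write(txn_name + "\n")
    os.replace(tmp_path, _activity_path(db_dir, activity_id))


def lookup_activity(db_dir, activity_id):
    """Return the transaction name for an activity, or None if unknown."""
    try:
        with open(_activity_path(db_dir, activity_id), encoding="utf-8") as f:
            return f.read().strip()
    except FileNotFoundError:
        return None


def delete_activity(db_dir, activity_id):
    """Forget an activity; deleting an unknown activity is not an error."""
    try:
        os.remove(_activity_path(db_dir, activity_id))
    except FileNotFoundError:
        pass
```

With something like this, each open activity costs one tiny file, and a lookup is a single file read with no dbm library involved.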

So, to finally get to the point, one of the new configuration directives is SVNActivitiesDB. It allows the administrator to override the location of this activities ‘database’.

This directive behaves differently depending on whether repositories are being served with an SVNPath or SVNParentPath directive. If the relevant section of the configuration file looks like this:

SVNPath /home/svn/myrepo
SVNActivitiesDB /tmp/activities

then /tmp/activities/ will be the directory used to store the files mapping activities to transactions for the repository at /home/svn/myrepo/. On the other hand, if the configuration file looks like the following (where the repository name is inferred from the client’s URL):

SVNParentPath /home/svn
SVNActivitiesDB /tmp/activities

then the path supplied to SVNActivitiesDB will be used to store the activities databases for all repositories available through the parent path, and so the database for the myrepo repository will be at /tmp/activities/myrepo/.
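The two cases can be summarised in a few lines of Python. This is only a sketch of the behaviour described here, not the module’s real logic; the function name and its parameters are hypothetical.

```python
import posixpath


def activities_dir(repo_path, activities_db=None, parent_path=False):
    """Return the directory holding the activity files for one repository.

    repo_path      -- filesystem path of the repository
    activities_db  -- value of SVNActivitiesDB, or None if not configured
    parent_path    -- True if the repository is served via SVNParentPath
    """
    if activities_db is None:
        # No override: the files live inside the repository itself.
        return posixpath.join(repo_path, "dav", "activities.d")
    if parent_path:
        # SVNParentPath: one subdirectory per repository, named after it.
        repo_name = posixpath.basename(posixpath.normpath(repo_path))
        return posixpath.join(activities_db, repo_name)
    # SVNPath: the configured directory is used directly.
    return activities_db
```

For example, activities_dir("/home/svn/myrepo", "/tmp/activities") gives /tmp/activities, while passing parent_path=True gives /tmp/activities/myrepo.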

Why would you want to move the activity database away from the main repository? The main reason is that when the repository is stored on a network filesystem, you ideally want to be able to move all the transient and non-shared file operations onto local storage. While the activities database isn’t updated that much compared to the transaction data itself, it’s a start (and we’d like to be able to move the transaction data to local storage as well; we just haven’t done it yet :-)).

Of course, if you have more than one Apache server active at the same time, you’ll also need to ensure some form of session or host affinity so that the client returns to the same server that originally established the activity ↦ transaction mapping.

There are more new features in mod_dav_svn than just this, but they’ll have to wait until later.

  1. A more sensible post, as promised :-)

  2. Subversion transaction naming is actually delegated to the filesystem: in FSFS, for example, transactions are named “<base rev>-<unique>”.