metadata for subprojects within an svn repository

Basically I'm looking for a way to find things more easily within a large/complex SVN repository.

(I frequently work on small tool applications, and will shortly be moving them into a common SVN repository. So I'm thinking of the case where there may be dozens or even hundreds of little tool applications in one place. I've got a dozen or two, and already I sometimes lose track of where I used some particular feature, library or build technique, or even which tool does what.)

Has anyone made a lot of use of metadata in an svn repository? What's worked and what hasn't?

I'm talking not only of how to store metadata, but what you do with it, like generating an HTML index. For storage, the way I see it there are 3 basic possibilities:

1) Put your metadata in a plain file that is checked into the svn repository (e.g. an xml file with a special naming convention, such as svn-metadata.xml). The metadata is then versioned, yet independent of svn.
2) Use svn properties to store metadata. This works ok and is versioned, but then you are tied to svn. The plus side is that you can tie metadata specifically to individual files.
3) Store metadata in an external location like a database or a wiki. This integrates more directly with that storage location's features, but the metadata wouldn't be versioned and is tied to that kind of storage.
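
To make the svn-properties option concrete, here is a minimal Python sketch that reads one property by shelling out to the `svn propget` command. The property name `tool:summary` and the path are made up for illustration, and the `run` parameter exists only so the function can be exercised without a working copy.

```python
import subprocess


def propget_command(name, path):
    """Build the argument list for `svn propget NAME PATH`."""
    return ["svn", "propget", name, path]


def read_property(name, path, run=subprocess.run):
    """Return the property value, or None if svn fails or the value is empty.

    `run` defaults to subprocess.run; a stand-in can be passed for testing.
    """
    result = run(propget_command(name, path), capture_output=True, text=True)
    if result.returncode != 0:
        # svn reported an error (bad path, not a working copy, property unset, ...)
        return None
    value = result.stdout.strip()
    return value or None


# Hypothetical usage inside a checked-out working copy:
#   summary = read_property("tool:summary", "tools/logparser")
```

A property like this would be set beforehand with `svn propset tool:summary "..." tools/logparser` and committed like any other change.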

I'm thinking of maybe using RDF + RSS as metadata in a plain file, and then writing something that periodically scans the SVN repositories for metadata, indexes it in a database, and generates an easy-to-use web app to make it easier to find.
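
A rough sketch of the scanning and index-generating part might look like the following. It assumes a checked-out working copy, a plain XML file per tool named svn-metadata.xml with hypothetical `<name>` and `<summary>` elements (plain XML is used here instead of RDF/RSS to keep the example short), and it skips the database step by going straight to a static HTML page.

```python
import html
import os
import xml.etree.ElementTree as ET

METADATA_FILENAME = "svn-metadata.xml"  # file name from the example above


def collect_metadata(root_dir):
    """Walk a working copy and gather one entry per metadata file found."""
    entries = []
    for dirpath, dirnames, filenames in os.walk(root_dir):
        # Do not descend into svn's administrative directory.
        if ".svn" in dirnames:
            dirnames.remove(".svn")
        if METADATA_FILENAME in filenames:
            root = ET.parse(os.path.join(dirpath, METADATA_FILENAME)).getroot()
            entries.append({
                "path": os.path.relpath(dirpath, root_dir),
                # <name> and <summary> are assumed element names, not a standard.
                "name": root.findtext("name", default=os.path.basename(dirpath)),
                "summary": root.findtext("summary", default=""),
            })
    return sorted(entries, key=lambda e: e["name"].lower())


def render_index(entries):
    """Render the collected entries as a very plain HTML list."""
    rows = "\n".join(
        "<li><b>{}</b> ({}): {}</li>".format(
            html.escape(e["name"]), html.escape(e["path"]), html.escape(e["summary"]))
        for e in entries)
    return "<html><body><ul>\n{}\n</ul></body></html>".format(rows)
```

Run periodically (e.g. from cron after an `svn update`), `render_index(collect_metadata("/path/to/working-copy"))` would give a page to publish; the path is a placeholder.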

Actually, I would mix both: metadata in svn properties and (versioned) plain xml-like files.

1) Everything related to the server can conveniently be stored in svn properties, if you need that at all, which might not be the case here. I mean properties that trigger something special for a file or a directory on commits, checkouts/exports, and so on. For example, you might use hook scripts to update some external documentation each time a particular file is touched.

Using hook scripts like that to keep separate information up to date usually avoids more time-consuming procedures that scan the whole repository, so it is lighter on the server.
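
As an illustration only, a post-commit hook along these lines could work out which tools a commit touched and refresh just those entries. Subversion calls the post-commit hook with the repository path and the revision number, and `svnlook changed` lists the paths changed in that revision. The sketch assumes each top-level directory of the repository is one tool, and that the path starts at the fifth column of each `svnlook changed` line; the "refresh" step is a placeholder.

```python
import subprocess


def parse_changed(output):
    """Parse `svnlook changed` output into (status, path) pairs.

    Assumes status flags in the first columns and the path from column 5.
    """
    changes = []
    for line in output.splitlines():
        if len(line) > 4:
            changes.append((line[:3].strip(), line[4:]))
    return changes


def touched_projects(changes, depth=1):
    """Return the set of leading directories (assumed to be tools) that changed."""
    return {"/".join(path.split("/")[:depth]) for _, path in changes}


def main(argv):
    """Entry point for the hook: argv[1] is the repository path, argv[2] the revision."""
    repos, rev = argv[1], argv[2]
    out = subprocess.run(
        ["svnlook", "changed", "-r", rev, repos],
        capture_output=True, text=True, check=True,
    ).stdout
    for project in sorted(touched_projects(parse_changed(out))):
        # Placeholder: a real hook would update the external index or docs here.
        print("would refresh index entry for", project)
```

The actual hook file would end with a call such as `main(sys.argv)` after importing `sys`.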

2) Metadata used by scripts that process your repository is better placed in repository files (xml, or whatever you feel most at ease with). A typical example is a script that compiles all or some of your tools and builds an installer: storing the specifics of each tool in an easily readable, manageable file makes sense. And as you pointed out, it should be as server-independent as possible (though you could still have some links, such as including the revision in the final application to keep track of versions).

This is how I work now, and it works well (though I haven't done much with the hook scripts yet). Keeping the two separate has helped.

I'm just not sure yet which language would be best for implementing hook scripts. Python (with pysvn) is great, but it forces a reload of the interpreter each time and is dynamically typed; I haven't checked the impact. I haven't yet found a reliable API for C# that also works on Linux with Mono, so maybe C or C++. I suppose it mostly depends on what has to be done.
