Before I begin: I have spent a long time on many forums (including Stack Overflow - and yes, there are a lot of SO questions on organizing svn), searching Google, and reading documents (I own a few Subversion books). I still have not found a good way to organize our code base in Subversion. We currently use RCS as our revision control system, and everything is stored in one RCS directory - ugly, I know - which is why I am working towards something better. I have also used Subversion a lot, so I know its capabilities and how it works. I have hesitated to ask this question for months, since it is not completely programming related, but since I haven't been able to come to a solution, what better place to ask!

What complicates things in my head is the Subversion term "Project". If I want to manage a Java project in Subversion, this makes perfect sense to me: all of the Java files that get combined into a jar file could be considered a "Project" - they all belong together. However, in our environment, I do not see an easy way to define what a "project" is. We have over 4,000 programs, and all of them are pretty much independent of each other. Many of them are shell scripts or Perl scripts. Some of our scripts use generic "utility" or "library" scripts, but for the most part, all code objects are independent.

One "Project" in our environment could involve programs A, B, and C, and config file AA. Another project could use programs C, D, and E, and config file BB. Yet another project could just be changing config file AA, or maybe program B. There is no way to classify which programs or files belong in a group. Because of this, I have no idea how to organize our code in Subversion. I could put everything into a master project trunk, but then checking out a working copy means checking out all 4,000+ elements.

To give some context, this is for a Data Warehouse. All 4,000+ code elements are needed to make the warehouse function. Maybe one business requirement comes in that requires a change to a column that is accessed in a few of the elements, and another business requirement requires changes to a few other elements (maybe including some of the same elements as the first).

Maybe Subversion isn't the best fit for us, although I have to believe it can work. We already have a Subversion server for our web code and our Java programs, and it works great, because there are easily defined projects. I just can't figure out how to organize our main code library.

Hopefully some of that made sense... Thanks in advance for your wisdom!

A: 

I would try to organize the folder structure of the files before simply dumping them into a Subversion repository.

I think your problem mainly lies in the disorganization of the existing files. If you can find a way to logically divide your system into segments, it would be easier to let people check out only chunks of files (which would be in logical groupings).

Subversion really mirrors a filesystem, so if it doesn't look pretty in a filesystem, it won't look pretty in subversion either.

If you want to avoid reorganizing files, perhaps you can find a version control system that lets you check in/out things based on tags, instead of where they are in the file system.

Kekoa
I agree that our current library is disorganized. The only problem is that there is no easy way to divide the files up. There are some divisions that could be made, but some of our projects will end up working with maybe a single element from each group, and then I think we're back in the same boat...
BrianH
Group by common closure. This has been well described in Object Mentor's principles of object-oriented design. Things that change together need to be grouped together.
tottinge
+1  A: 

You might look at the svn:externals property. When it is set on a directory, checking out that directory also checks out other locations within the repository into subdirectories of that directory.

So you might create a "real" directory for each component, and then a separate directory for each project that uses externals to check out the required components.
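
A rough sketch of how such a project directory could be set up, in Python. The `^/components` layout, the component names, and the helper names are all hypothetical, and the `URL LOCALDIR` line format assumes Subversion 1.5 or later. The functions only build the property text and the `svn propset` argument list; nothing is executed.

```python
# Sketch only: the ^/components layout and all names here are assumptions.

def externals_value(components, base="^/components"):
    """Return svn:externals text, one 'URL LOCALDIR' line per component."""
    return "".join(f"{base}/{name} {name}\n" for name in sorted(components))


def propset_command(project_dir, components):
    """Return the argv that would attach the externals to project_dir."""
    return ["svn", "propset", "svn:externals",
            externals_value(components), project_dir]
```

The returned list could be passed to `subprocess.run` inside a working copy, followed by a commit, so that checking out the project directory pulls in each listed component.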

Komat
That does look somewhat promising. The only problem is that the same group of elements (grouped by an "externals" property) may never need to be worked on as a group again. Elements from that group may be worked on along with elements from another group. Maybe "one-time externals" are what we need, so we can ad-hoc check out a few elements as a "project", but then never use that "project" again...
BrianH
It depends on how one-time the thing is. If each developer needs their own unique combination each time, externals are not a good match. If several developers will use the same configuration, the directory can be created at the start of the project and deleted when it is no longer needed.
Komat
Most of the time, each developer will need their own unique combination every time... It is rare that the same group of elements would be needed over and over again.
BrianH
In that situation I would probably create a standalone program in which the developer selects what they need in a way that fits your project (e.g. by choosing some tags). The program would then check out all required parts of the repository.
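
A minimal sketch of what such a selector might look like, in Python. The tag index, the repository URL passed in, and the assumption that every element lives in its own directory are all hypothetical. The functions only build the `svn checkout` commands; running them is left to the caller.

```python
# Sketch only: TAG_INDEX and the one-directory-per-element layout are assumptions.

TAG_INDEX = {
    "progA": {"customer", "nightly"},
    "progB": {"orders"},
    "progC": {"customer", "orders"},
    "configAA": {"customer"},
}


def select_elements(tag_index, wanted_tags):
    """Return the sorted names of elements carrying at least one wanted tag."""
    wanted = set(wanted_tags)
    return sorted(name for name, tags in tag_index.items() if tags & wanted)


def checkout_commands(repo_url, elements, workdir):
    """Return one 'svn checkout' argv per selected element directory."""
    return [["svn", "checkout", f"{repo_url}/{name}", f"{workdir}/{name}"]
            for name in elements]
```

Each developer would end up with an ad-hoc set of small working copies under one directory, which matches the "unique combination every time" case better than a shared externals definition.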
Komat
Yep, I have thought of that idea too, but I am unaware of how to check out a single element from the repository - I think svn makes you check out a directory. If I could do single elements, I would probably go this way. Are single-element checkouts possible with svn?
BrianH
As far as I know, the smallest thing you can check out from svn is a directory. While it might be possible to put each single-file element into a separate svn directory, check those directories out into some "commit" directory, and then construct a working directory using hardlinks to the files inside this "commit" directory, such a layout appears somewhat clumsy. Maybe an asset management system with revision control would be a better fit.
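
One possible alternative, assuming Subversion 1.5 or later: a sparse checkout. The directory is checked out with `--depth empty`, and `svn update` is then used to bring in only the named files. The Python sketch below just builds those two commands; the function name, URL, and file names are hypothetical, and the behavior is worth verifying against your server and client versions.

```python
# Sketch only: assumes Subversion 1.5+ sparse directories; names are hypothetical.

def sparse_checkout_commands(dir_url, files, workdir):
    """Return argv lists: an empty-depth checkout, then an update of chosen files."""
    commands = [["svn", "checkout", "--depth", "empty", dir_url, workdir]]
    if files:
        # 'svn update' accepts several target paths in one invocation.
        commands.append(["svn", "update"] + [f"{workdir}/{name}" for name in files])
    return commands
```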
Komat