Using SVN or Git, it occurred to me to use the directory structure as a metadata placeholder: each team would have a directory in a shared repository for its code. The more I think about it, though, the worse the idea seems, because teams are not static entities. Teams can vanish, split up, recombine, and even get a new name. I know I am looking at using a unique identifier and probably an external database. I am probably even facing a per-team repository to manage access rights appropriately. In a situation with 200 or more teams, am I best off maintaining ownership in an external database, or are there tools and practices I could learn from?

+2  A: 

Here's how I've seen it done. Align the directory structure along related technology lines. This is probably team-based right now. Use a database or something to map binaries (or source files) to team members. Then when things move around, all you have to do is update the database.

If you make the root pretty broad, you'll have less reorganization to do later than if you make it small.
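To make the mapping idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `Team` record, the in-memory `teams` and `ownership` tables (stand-ins for whatever external database you pick), and the example paths. The point it tries to show is that ownership hangs off a stable team ID, so a rename or reshuffle is a data update rather than a directory move.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Team:
    team_id: int  # stable identifier; never changes
    name: str     # display name; free to change


# Hypothetical tables; in practice these would live in an external database.
teams = {
    1: Team(1, "Graphics"),
    2: Team(2, "Networking"),
}

# Maps a repository path prefix to the owning team's stable ID.
ownership = {
    "engine": 1,
    "engine/render": 1,
    "engine/net": 2,
}


def owner_of(path: str) -> Optional[Team]:
    """Return the team owning the longest matching path prefix, if any."""
    parts = path.strip("/").split("/")
    # Try the full path first, then progressively shorter prefixes.
    for end in range(len(parts), 0, -1):
        prefix = "/".join(parts[:end])
        if prefix in ownership:
            return teams[ownership[prefix]]
    return None
```

Renaming a team then means changing `teams[2].name`; nothing in the repository or in `ownership` has to move.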

Steve Rowe
A: 

Reading your question gave me an idea that may or may not be useful to you (it is just a quick thought that came to mind ;-):

  1. As with the Linux 'devtodo' tool, you can create a file in each directory for the metadata. (For devtodo, the file is called '.todo'.)
  2. That metadata file could be a nosql database file listing the members of the team at that moment.
  3. That metadata file could be under version control.
  4. That metadata file could be joined (as in the SQL join operation) with other files, e.g. one listing all members of your teams.
  5. A set of scripts or shell aliases could be defined for the most frequently used nosql commands.

In short: relational database files holding your metadata in your git or svn repository, and no database servers.
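A rough sketch of the join step, written in Python with plain text and CSV files rather than the nosql command-line tool. The file name `.team`, the `team_id,member` CSV layout, and all function names are assumptions made up for this example, not part of devtodo or nosql.

```python
import csv
import os

OWNER_FILE = ".team"  # hypothetical per-directory metadata file holding a team ID


def read_directory_teams(root):
    """Map each directory under root that has a metadata file to its team ID."""
    result = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        if OWNER_FILE in filenames:
            rel = os.path.relpath(dirpath, root).replace(os.sep, "/")
            with open(os.path.join(dirpath, OWNER_FILE)) as handle:
                result[rel] = handle.read().strip()
    return result


def parse_members(lines):
    """Parse 'team_id,member' CSV lines into a team ID -> members mapping."""
    members = {}
    for row in csv.DictReader(lines):
        members.setdefault(row["team_id"], []).append(row["member"])
    return members


def join_owners(directory_teams, members):
    """Join directories with members on team ID, like a SQL inner join."""
    return {
        directory: members[team_id]
        for directory, team_id in directory_teams.items()
        if team_id in members
    }
```

Since both files are ordinary text, they can be versioned alongside the code, e.g. `join_owners(read_directory_teams("repo"), parse_members(open("members.csv")))`.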

Banengusk
Having data on a file share rather than in a database makes it hard to read and update... I don't see that idea scaling well.
ojblass
Right, but it may be in git/svn...
Banengusk
The solution becomes unmanageable because different versions of the programs would belong to different teams, and that is likely not what you want. I think the ownership must live outside the repository.
ojblass
You just answered your own question with that comment. Using directories as metadata does the same thing, so if that's a problem, the only thing you can do is save it externally.
Sander Rijken
A: 

The metadata you want to store is information about the files and about the team members who worked on them. Assuming your repository already has mature resource-handling semantics, such as check-in comments, one strategy is to identify the metadata you care about and index it using an indexing tool or API. A lot of information is available from the repository metadata and can be retrieved and indexed. If you need finer control, you can use customized pre-commit hooks to force team members to supply the required information when checking in. The resulting metadata repository would give you good control for querying and searching for users and resources. The indexing can be performed by a script on a daily or weekly basis.
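As an illustration of the hook idea, here is a small Python sketch of the check such a hook script could run on a check-in comment. The `Team: <id>` line is a convention invented for this example, and `check_message` and `extract_team` are hypothetical helper names; wiring the check into an actual SVN or Git hook is left out.

```python
import re

# Hypothetical convention: every check-in comment carries a "Team: <id>" line.
TEAM_LINE = re.compile(
    r"^Team:[ \t]*(?P<team_id>[A-Za-z0-9_-]+)[ \t]*$", re.MULTILINE
)


def extract_team(message):
    """Return the team ID named in the message, or None if there is none."""
    match = TEAM_LINE.search(message)
    return match.group("team_id") if match else None


def check_message(message, known_teams):
    """Return an error string if the check-in should be rejected, else None."""
    team_id = extract_team(message)
    if team_id is None:
        return "check-in comment must contain a 'Team: <id>' line"
    if team_id not in known_teams:
        return f"unknown team id: {team_id}"
    return None
```

The same `extract_team` function could be reused by the periodic indexing script to group past check-ins by team.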

+2  A: 

Git doesn't do access rights very well. To even start to handle it, you would need to use gitosis and then write some code into the git hooks. I had a question regarding this.

Then you would need, as you said, a database to handle information on which user can access which repository. I.e. you would need to split the stuff into separate repositories in git since you can't protect single files or directories in git. I wrote a short blog note about this some time ago. Superprojects might help in the future but this concept is still under construction and not very well defined.

In fancier (and more expensive) SCM tools like Rational ClearCase you have better tools for this. You can, for instance, create components and define user rights per component. Unfortunately even these tools require extensive updates by hand if things change a lot. But if user rights are all you need, it would be much more cost-effective to write your own tools rather than buy one of these monsters.

Makis
+4  A: 

Here's my best attempt at pointing you in the right directions. Generally speaking:

Organize your source code according to the components of your system in the usual way. Make sure that each component can have access control configured independently (e.g. in git, this means separate repositories for each component, as Makis said).

Managing access rights is commonly done in conjunction with LDAP these days. Configure your tool to leverage your organization's LDAP directory in the usual way. (For git, this is more commonly done using the existing accounts on the system, which may or may not be backed by LDAP.)

Use your version control system's built-in access control mechanisms in the usual way to allow each team to read or write to each component as appropriate. If this matrix of permissions becomes too cumbersome to manage manually, build a custom mechanism to control it at a higher level and/or automate large-scale updates.
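One possible shape for such a higher-level mechanism, sketched in Python: keep team-level grants in one place and expand them into per-component, per-user access lists that a script could then push into the VCS configuration. The grant tuples, the `"r"`/`"rw"` access levels, and `expand_permissions` are all assumptions for the sake of the example.

```python
from collections import defaultdict

# Access levels ordered by strength; a stronger grant wins over a weaker one.
RANK = {"r": 1, "rw": 2}


def expand_permissions(grants, team_members):
    """Expand (team, component, access) grants into per-user access lists.

    Returns a mapping: component -> {user: access}.
    """
    acl = defaultdict(dict)
    for team, component, access in grants:
        for user in team_members.get(team, []):
            current = acl[component].get(user)
            # Keep the strongest access a user receives through any team.
            if current is None or RANK[access] > RANK[current]:
                acl[component][user] = access
    return dict(acl)
```

When a team splits or merges, only `team_members` changes, and the per-component lists are regenerated.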

If you don't have a component architecture, centralized user accounts, or access control for your VCS, then give up on this problem until you fix the more important ones.

Zac Thompson
A: 

Give each team their own repository.

Chris McCall
A: 

OK so if I've understood your question correctly then you have
1. Multiple programs
2. Multiple teams that work on the source code
3. A need to track which teams are working on which code

You don't specify whether each team works on just one program or whether a program could be worked on collaboratively by multiple teams.

I'm not a Git or SVN user; my main experience is with Perforce, but I'll try to outline some best practices that should work for any large enterprise-level SCM system.

  1. Have a main line code branch - this would be your 'clean' area.
  2. Create a sandbox branch for each team
  3. Use clear naming conventions

Each team would have full access rights to its own sandbox only, not to any other team's sandbox.

So in terms of repository layout, you would have something like this:

  • code
    • mainline
      • program1
      • program2
    • team
      • teamA
        • program1
      • teamB
        • program2

Each team would then branch the code for the stuff they want to work in into their team sandbox and be free to check-in whatever they like. If they soil their own branch then fine, it's their problem to sort out amongst themselves and run how they see fit.

Once a team is happy that it has a good next iteration of its code, only the team leader has permission to promote and integrate the changes up to the mainline.
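The access rules for this layout can be sketched as a small Python check. This is only an illustration of the rules described above, not how Perforce or any other SCM expresses protections; the `User` record and `can_write` function are hypothetical, and the paths follow the `code/mainline` and `code/team/<team>` layout from the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    name: str
    team: str
    is_lead: bool = False


def can_write(user, path):
    """Apply the sandbox rules: own sandbox for members, mainline for leads."""
    parts = path.strip("/").split("/")
    if parts[:2] == ["code", "mainline"]:
        # Only team leaders may promote changes into the mainline.
        return user.is_lead
    if parts[:2] == ["code", "team"] and len(parts) >= 3:
        # Members may write only inside their own team's sandbox.
        return parts[2] == user.team
    return False
```

For example, a regular member of teamA can write under `code/team/teamA/` but not under `code/team/teamB/` or `code/mainline/`.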

Any decent SCM will show you integration history so for every change promoted into the mainline then you'll know which team integrated and when.

If a team is disbanded, you can abandon the branch if you wish.
If a team name changes, create a new sandbox branch for the new team and then integrate the changes from the old team sandbox into the new sandbox.

On Perforce you can do some funky stuff like reverse integrates, which is really nice because you can pull the mainline down into your branch, do any merges, then integrate back to the mainline. As long as nothing else has changed in the code, the integrate just becomes a straight copy. This has the advantage of pushing any breakage down into the sandbox and keeping the mainline clean.

Anyway, those are some thoughts from me. SCM management is a whole area in itself, and I'm sure many other people out there will have better solutions, but the above approach has worked for me.

Some useful links on this complex area :)
Software Life Cycle Modelling
High-level best practices in SCM
SCM intro - repositories
SCM intro - branches

zebrabox
A: 

If it is a separate project, just give it a repository of its own.

ironfroggy