Should we have one single repository for the whole company, containing many development projects, or one repository per project? Any experience or best practices to share?
If you are using Subversion, you could have multiple repositories under the same domain with multiple projects each.
So it'd look like this
Project1: svn://svn.company.com/project1
Project2: svn://svn.company.com/project2
Project1 might hold the front-end code and contain sub-projects such as admin/ and web/. Project2 would be the backend and contain various tools and monitoring applications.
If you use something like Git, each repository should be a single project.
I would have a repository per project, just for the sake of the revision numbers being per-project. You can set up SVN so that all of the projects are reachable from a single access point, using Apache and mod_dav_svn (with the SVNParentPath directive).
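To make the SVNParentPath idea a bit more concrete: with that directive, the first URL segment after the served location names a repository directory under the parent path, and the remainder is a path inside that repository. The sketch below only models that mapping in Python. The location `/svn` and the parent directory `/var/svn` are assumptions for illustration, not values from this thread.

```python
from pathlib import PurePosixPath


def resolve_parent_path(url_path, location="/svn", parent_dir="/var/svn"):
    """Model how a URL under an SVNParentPath location maps to a repository.

    location and parent_dir are hypothetical values chosen for illustration.
    Returns (repository directory on disk, path inside the repository).
    """
    loc_parts = PurePosixPath(location).parts
    parts = PurePosixPath(url_path).parts
    # The URL must start with the location and name a repository after it.
    if parts[:len(loc_parts)] != loc_parts or len(parts) == len(loc_parts):
        raise ValueError(f"{url_path!r} does not name a repository under {location!r}")
    repo_name = parts[len(loc_parts)]
    inner = "/" + "/".join(parts[len(loc_parts) + 1:])
    return str(PurePosixPath(parent_dir) / repo_name), inner


print(resolve_parent_path("/svn/project1/trunk/web"))
print(resolve_parent_path("/svn/project2"))
```

Each project still gets its own repository and revision sequence; only the entry point is shared.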
I've found that having a single Subversion repository aids in:
- Transparency: it is easier to follow what is going on, and find code even in projects you may not be directly involved in.
- Maintenance: it is not necessary to create repositories every time you wish to create a new project, and you can delete entire projects without fear of losing the record from Subversion.
- Maintenance: it is only necessary to have one repository backed up.
EDIT: Additional reasons:
- Global revision ids: because revision ids are global, it is:
  - Easier to communicate (e.g. in a code review request, just specify the revision id, without a need to specify which project).
  - Easier to guarantee atomicity when projects have dependencies on each other.
  - Easier to see the order of commits to different projects.
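A toy model may help show what "global revision ids" buys you. This is not how Subversion is implemented, just a sketch of the numbering behaviour; the class and the file paths are made up for illustration.

```python
from itertools import count


class Repository:
    """Toy model: every commit to a repository bumps one shared counter."""

    def __init__(self):
        self._next_rev = count(1)
        self.log = []  # list of (revision, sorted paths touched)

    def commit(self, *paths):
        rev = next(self._next_rev)
        self.log.append((rev, sorted(paths)))
        return rev


# One repository for everything: revisions give a total order across projects,
# and a single commit can touch two projects atomically.
shared = Repository()
r1 = shared.commit("/project1/trunk/api.py")
r2 = shared.commit("/project2/trunk/client.py")
r3 = shared.commit("/project1/trunk/api.py", "/project2/trunk/client.py")

# One repository per project: each counter starts at 1, so "r1" is ambiguous
# unless the project is named as well.
repo_a, repo_b = Repository(), Repository()
a1 = repo_a.commit("/trunk/api.py")
b1 = repo_b.commit("/trunk/client.py")
```

In the shared case the third commit changes both projects under one revision number, which is the atomicity point above; with separate repositories that change would have to be two unrelated commits.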
I strongly recommend not using a repository per project. Within a single Subversion repository, you can move and copy data and it will retain its history. If you decide that some piece of code that started as a backend tool is being merged into the front end, and those are in different repositories, this isn't an operation that Subversion will know anything about. It will be as if you added something to the front end from nowhere. This also means that you can't do merges if you need to temporarily maintain two versions of related code.
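As a rough sketch of the backend-to-front-end case: a server-side `svn move` between two URLs keeps history, but only when both URLs live in the same repository. The helper below just builds the command line and refuses the cross-repository case. The repository root `svn://svn.company.com/code` and the paths under it are hypothetical, and the root is passed in explicitly rather than looked up.

```python
def server_side_move(repo_root, src, dst, message):
    """Build the argv for a history-preserving `svn move` between two URLs.

    repo_root is the repository root URL (what `svn info` reports as
    "Repository Root"); it is passed in explicitly to keep the sketch simple.
    """
    root = repo_root.rstrip("/") + "/"
    for url in (src, dst):
        if not url.startswith(root):
            raise ValueError(f"{url} is outside {repo_root}; history cannot follow it")
    return ["svn", "move", src, dst, "-m", message]


ROOT = "svn://svn.company.com/code"  # hypothetical single repository
cmd = server_side_move(
    ROOT,
    ROOT + "/backend/trunk/tool",
    ROOT + "/frontend/trunk/tool",
    "Move tool into the front end",
)
print(" ".join(cmd[:4]))
```

Passing `cmd` to `subprocess.run(cmd, check=True)` would perform the move against a real server; with separate repositories the only option is to export and re-add the files, which is where the history is lost.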
You will note that all the examples in the svn book assume everything is in one repository.
There's a separate question of whether to use /trunk/project1, /trunk/project2, /branches/project1-r1 or to use /project1/trunk, /project1/branches/r1, /project2/trunk. Generally, the second is easier to keep track of.
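To see the two layouts side by side, here is a small sketch that lists the top-level directories each one produces. The `tags` directory is an assumption taken from the usual trunk/branches/tags convention; it is not mentioned above.

```python
def layout_paths(projects, per_project=True):
    """Return the top-level directories for the two layouts discussed above."""
    kinds = ("trunk", "branches", "tags")  # "tags" is assumed by convention
    if per_project:
        # /project1/trunk, /project1/branches, ...
        return [f"/{p}/{k}" for p in projects for k in kinds]
    # /trunk/project1, /trunk/project2, ...
    return [f"/{k}/{p}" for k in kinds for p in projects]


for path in layout_paths(["project1", "project2"]):
    print(path)
```

With the per-project layout, everything belonging to one project sits under a single directory, which is probably why it tends to be easier to keep track of.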
The best reason to not put everything in one repository is that access control is hard to do more finely grained than at the repository level. Possible, yes, but not as easy. We currently have two main repositories:
- svn/code for all software development, requires jira ticket to commit
- svn/ops for any config, set up scripts, third party tars, whatever, no jira required
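For what it's worth, the "possible, but not as easy" route inside one repository is usually a path-based authz file. The sketch below renders rules in that file format. It assumes the repositories are served under a common parent, so sections are written as `[repository:/path]`. The user name `alice` and the path `/project1` are made up; `code` is the repository named above.

```python
def render_authz(rules):
    """Render path-based rules in the Subversion authz file format.

    rules maps (repository name, path) to {principal: access}, where access
    is "r", "rw", or "" (no access). All names used below are hypothetical.
    """
    sections = []
    for (repo, path), grants in rules.items():
        lines = [f"[{repo}:{path}]"]
        lines += [f"{who} = {access}" for who, access in grants.items()]
        sections.append("\n".join(lines))
    return "\n\n".join(sections) + "\n"


text = render_authz({
    ("code", "/project1"): {"alice": "rw", "*": "r"},
})
print(text)
```

Every path that needs its own policy needs its own section, which is the maintenance cost compared with simply granting access per repository.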
The decision may be based on project complexity as well. If you're starting a huge effort that will be separate from most everything else you'll work on, and it will have a large dev team working on it, it may make sense to give it a separate repo.
For teams working on a number of smaller projects at once, a single repository provides a simple mechanism to manage everything in one place (backup, search etc). The central repo also makes the repository itself seem a bit more "concrete" since it won't be discarded once a project is complete or abandoned.
It would depend on how interdependent your applications/projects are in the organization, and also how big the codebases are. So it boils down to how one defines a "project" in your organization. Shared codebase, shared components, shared teams, and shared release cycles could all influence how your repositories need to be structured.
For simplicity, I define a project as having an independent codebase, release timeline, and/or belonging to one tool/application. If this is the case, I prefer having one repository per project. This way there is more freedom to define and implement policies/access restrictions per project.
If I had a single repository bloating over time, I might also need to worry about workspace checkout time etc., especially if the code for the projects is all mixed together. If an application/tool/project is completed/shelved/obsoleted/migrated, there can be more control over the administration aspects if there is separation at project level.
Structuring a project repository based on its unique needs can also be easier if there is one repository per project. Each project may have specific needs, which may affect the configuration management policies (branching, tagging, naming conventions, CI etc included) followed for each project. This is not to say you cannot do all of this folder-wise within a single repository. You can. It just keeps things simpler to have a cleaner separation - that is all.
You may want to consider all these factors in order to evaluate the administrative overheads you may end up with, based on your specific setup.
This said, if the projects are all small in size and are interdependent, requiring cross references, cross tracking, movement within each other etc., then you are better off with a single repo and one folder per project.
Personally I would definitely prefer a separate repository per project. There are several reasons:
Revision numbers. Each project repository will have its own revision sequence.
Granularity. With a repository per project you simply can't make a commit that touches different projects under the same revision number. I consider this an advantage, though some would say it is a flaw.
Repository size. How large is your project? Does it have binaries under source control? I bet it has. Therefore, size is important: each revision of a binary file increases the size of the repository. Eventually it becomes clumsy and hard to support. A fine-grained policy for storing binary files has to be maintained, and additional administration provided. As for me, I still can't find out how I could completely delete a binary file (committed by some stupid user) and its content history from a repository. With a repository per project it would be easier.
Inner repository organization. I prefer fine-grained, very organized, self-contained, structured repositories. There is a diagram illustrating the general (ideal) approach to the repository maintenance process. I think you would agree that this is just NOT POSSIBLE with the 'all projects in one repo' approach. For example, the initial structure that every project repository should have is:
/project
    /trunk
    /tags
        /builds
            /PA
            /A
            /B
        /releases
            /AR
            /BR
            /RC
            /ST
    /branches
        /experimental
        /maintenance
            /versions
            /platforms
        /releases
Repo administration. 'Repository per project' offers more possibilities for configuring user access. It is more complex, though. But there is also a helpful feature: you can configure repositories to use the same config file.
Repo support. I prefer backing up repositories separately. Some say that in this case it is not possible to merge info from one repo into another. Why on earth would you need that? A case where such a merge is required shows that the initial approach to source control was wrong. Evolution of a project assumes subsequent separation of the project into submodules, not the other way round. And I know there is a way to do that.
svn:externals. The 'repository per project' approach encourages svn:externals usage. It is a healthy situation when dependencies between a project and its submodules are established via soft links, which is what svn:externals are.
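For readers who have not used svn:externals: the property value is a list of lines, each linking a URL to a local directory, optionally pinned to a revision. The sketch below renders such lines in the "URL local-dir" order used since Subversion 1.5. The URL path `trunk/tools`, the local directory names, and the pinned revision are made up for illustration.

```python
def externals_definition(entries):
    """Render svn:externals lines in the Subversion 1.5+ 'URL local-dir' order.

    entries: iterable of (url, local_dir, revision or None).
    """
    lines = []
    for url, local_dir, revision in entries:
        # An optional -rN prefix pins the external to a fixed revision.
        pin = f"-r{revision} " if revision is not None else ""
        lines.append(f"{pin}{url} {local_dir}")
    return "\n".join(lines) + "\n"


definition = externals_definition([
    ("svn://svn.company.com/project2/trunk/tools", "vendor/tools", None),
    ("svn://svn.company.com/project2/trunk/tools", "vendor/tools-pinned", 148),
])
print(definition)
```

Saved to a file, this text could be applied with `svn propset svn:externals -F <file> .` in the working copy of the depending project. Pinning a revision is one way to keep the dependency stable when the two projects live in different repositories.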
Conclusion. If you want to keep source control simple, use one repository. If you want to make software configuration management RIGHT:
- Use the 'repository per project' approach
- Introduce a separate role of software configuration manager and assign a team member to it
- Hire a good administrator who can cope with all the Subversion tricks :)
PS. By the way, although I work with SVN all the time and I like it, the 'repository per project' approach is why I find DVCS systems more attractive from the repository organization point of view. In a DVCS the repo is the project by default. The 'single vs multiple' question is not even possible; it would just be nonsense.
Very good, insightful answers so far, but I just want to add my two cents:
I think it depends on the size of your project. We have tried both approaches, but finally settled on one large repository.
Cons:
- Branching can be difficult, as many folders and files can be involved
- The repository can become HUGE
- You have to check out the entire repository (or at least specific branches) to build your solution
- A little less control of project architecture (see pros)
Pros:
- All projects share the same repository history
- Branching is for the entire repository
- Much less administration (individual developers can create new projects without the help of a sysadmin)
- Better overview and monitoring options of repository commits
IMHO (and regarding our particular project) the pros outweigh the cons.
It all depends on how many projects you have, the size of your organisation, how related the projects are, and so on. If you're part of a large team with a couple of dozen projects, a repository per project is going to be pretty unwieldy to administer.
Depending on what type of work your organisation does, another option is to have one repository per client. Where I work, we currently have four repositories - one for projects that are completely internal to us, one for the software we sell, and two that are completely specific to particular clients for whom we produce custom software applicable only to them. This is an attempt at a 'best of both worlds' solution - we don't have one massive repository that contains everything and has a revision number in the billions (I exaggerate obviously!), but nor do we have dozens of piddly little repositories lying around with one project in each.
For my organization I went midway, creating several repositories for different coding "areas". I knew in advance that each repository would contain projects that would be fairly separate from each other.
I've done both and in both cases I sometimes wish I had chosen the other method...
I've settled on one repository as a good solution. Note that if I were at a larger company I'd reconsider.