One of the things I like about my Subversion setup is that I can have a single main repository containing multiple projects. When I want to work on a project, I can check out just that project. Like this:

\main
    \ProductA
    \ProductB
    \Shared

then

svn checkout http://.../main/ProductA

As a new Git user, I want to explore best practice in the field before committing to a specific workflow. From what I've read so far, Git stores everything in a single .git folder at the root of the project tree. So I could do one of two things.

  1. Set up a separate project for each Product.
  2. Set up a single massive project and store products in subfolders.

There are dependencies between the products, so the single massive project seems appropriate. We'll be using a server where all the developers can share their code. I've already got this working over SSH and HTTP, and that part I love. However, the repositories in SVN are already many GB in size, so dragging the entire repository around on each machine seems like a bad idea, especially since we're billed for excessive network bandwidth.

I'd imagine that the Linux kernel project repositories are equally large, so there must be a proper way of handling this with Git, but I just haven't figured it out yet.

Are there any guidelines or best practices for working with very large multi-project repositories?

+3  A: 

The guideline is simple, with regard to Git's limits:

  • one repo per project
  • a main project with submodules.

The idea is not to store everything in one giant Git repo, but to build a small repo as the main project, which references the right commits of the other repos, each one representing a project or a common component of its own.
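
As a rough illustration of that layout, here is a minimal sketch. It assumes local throwaway repositories created under a temporary directory; the names `ProductA`, `Shared` and `main` are borrowed from the question, and the `make_repo` helper, the identity settings and the paths are hypothetical. On a real setup the submodule URLs would point at the shared server instead of local paths.

```shell
#!/bin/sh
# Sketch only: all paths, names and identities below are hypothetical.
set -e

work=$(mktemp -d)
cd "$work"

# Helper (illustrative): create a tiny standalone repository with one commit.
make_repo() {
    git init -q "$1"
    (
        cd "$1"
        git config user.email "dev@example.com"
        git config user.name "Dev"
        echo "$1" > README
        git add README
        git commit -q -m "Initial commit of $1"
    )
}

# One repository per product or shared component.
make_repo ProductA
make_repo Shared

# A small main project that only records which commit of each
# sub-repository belongs together.
make_repo main
cd main

# protocol.file.allow is set only because this demo adds submodules from
# local paths; it should not be needed for SSH or HTTP URLs.
git -c protocol.file.allow=always submodule add "$work/ProductA" ProductA
git -c protocol.file.allow=always submodule add "$work/Shared" Shared
git commit -q -m "Reference ProductA and Shared as submodules"

# List the commit recorded for each submodule.
git submodule status
```

The main repository stays small because it stores only a `.gitmodules` file plus one commit reference per submodule; the history of each product lives in its own repository.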

VonC
Also worth noting that if you include submodules in the main project, each submodule is its own Git repository, so you're free to include particular versions of the submodules, certain tags, etc.
Damien Wilson
@VonC: This sounds similar to the "externals" support provided by Subversion. We tried this and found it extremely cumbersome to constantly update the version references in the externals, since the projects are developed concurrently and depend on each other. Is there another option?
Paul Alexander
@Paul: yes. Instead of updating the version from the main project, you either develop your subprojects directly from within the main project (see http://stackoverflow.com/questions/1979167/git-submodule-update/1979194#1979194), or you give the sub-repo an origin that points to the same sub-repo being developed elsewhere; from there you just have to pull into that sub-repo the changes made elsewhere. In both cases, you must not forget to commit the main project to record the new configuration. There is no "external" property to update. The whole process is much more natural.
VonC
Honestly, this sounds like a real pain, and anything that requires developers to do something manually each time is just going to be a regular source of bugs and maintenance. I suppose I'll look into automating this with some scripts in the super project.
Paul Alexander
@Paul: honestly, you may have been right... that is, until the latest Git release, 1.7.1 (http://www.kernel.org/pub/software/scm/git/docs/RelNotes-1.7.1.txt): `git diff` and `git status` both learned to take submodule states into account even when executed from the main project. You simply cannot miss a submodule modification.
VonC
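
The workflow discussed in the comments above (pull changes into a sub-repo, let the main project notice them, then commit the main project) might look roughly like the sketch below. It is an illustration under assumptions, not a prescribed procedure: the repositories are local throwaways in a temporary directory, and the names `Shared` and `main`, the file `lib.txt` and the identity settings are hypothetical.

```shell
#!/bin/sh
# Sketch only: repository names, files and paths are hypothetical.
set -e

work=$(mktemp -d)
cd "$work"

# Stand-in for a sub-repository that is developed elsewhere.
git init -q Shared
cd Shared
git config user.email "dev@example.com"
git config user.name "Dev"
echo "v1" > lib.txt
git add lib.txt
git commit -q -m "Shared v1"

# Main project that references Shared as a submodule.
cd "$work"
git init -q main
cd main
git config user.email "dev@example.com"
git config user.name "Dev"
# protocol.file.allow is set only because this demo uses a local path.
git -c protocol.file.allow=always submodule add "$work/Shared" Shared
git commit -q -m "Add Shared submodule"

# Someone commits new work to Shared elsewhere.
cd "$work/Shared"
echo "v2" > lib.txt
git commit -q -a -m "Shared v2"

# Pull those changes from inside the submodule checkout.
cd "$work/main/Shared"
git pull -q origin HEAD

# From the main project, status and diff now report the submodule change.
cd "$work/main"
git status --short
git diff --submodule

# Record the new submodule commit in the main project.
git add Shared
git commit -q -m "Record new Shared commit"
```

The last two commands are the manual step the comments worry about: until the main project is committed, other developers still get the old `Shared` commit.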