After successfully converting an SVN repository to Git, I now have a very large Git repository that I want to break down into multiple smaller repositories while preserving history.

So, can someone help with breaking up a repo that might look like this:

MyHugeRepo/
   .git/
   DIR_A/
   DIR_B/
   DIR_1/
   DIR_2/

Into two repositories that look like this:

MyABRepo/
   .git
   DIR_A/
   DIR_B/

My12Repo/
   .git
   DIR_1/
   DIR_2/

I've tried following the directions in this previous question, but they don't really fit when trying to put multiple directories into a separate repo (http://stackoverflow.com/questions/359424/detach-subdirectory-into-separate-git-repository).

+4  A: 

This will set up MyABRepo; you can do My12Repo similarly, of course.

git clone MyHugeRepo/ MyABRepo.tmp/
cd MyABRepo.tmp
git filter-branch --prune-empty --index-filter 'git rm --cached --ignore-unmatch DIR_1/* DIR_2/*' HEAD 

A backup reference to the original history remains at .git/refs/original/refs/heads/master. You can clean that up by making a fresh clone:

cd ..
git clone MyABRepo.tmp MyABRepo

If all went well you can then remove MyABRepo.tmp.
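
Before deleting anything, it can help to rehearse the steps on a throwaway repository. The following is only a sketch under stated assumptions: the toy repository layout, the mktemp working directory, and the demo identity are made up for illustration, and FILTER_BRANCH_SQUELCH_WARNING is set only to skip the warning pause that newer Git versions add to filter-branch.

```shell
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1   # newer Git only; harmless on older versions
work=$(mktemp -d)                        # scratch area (illustrative)
cd "$work"

# Toy stand-in for MyHugeRepo: one commit per directory.
git init -q MyHugeRepo
cd MyHugeRepo
git config user.name "Demo User"         # hypothetical identity for the demo commits
git config user.email "demo@example.com"
for d in DIR_A DIR_B DIR_1 DIR_2; do
  mkdir "$d"
  echo "content of $d" > "$d/file.txt"
  git add "$d"
  git commit -q -m "Add $d"
done
cd ..

# Same steps as in the answer above.
git clone -q MyHugeRepo MyABRepo.tmp
cd MyABRepo.tmp
git filter-branch --prune-empty \
  --index-filter 'git rm -q --cached --ignore-unmatch DIR_1/* DIR_2/*' HEAD
cd ..
git clone -q MyABRepo.tmp MyABRepo

# Inspect the result in the final clone.
cd MyABRepo
commit_count=$(git rev-list --count HEAD)            # commits left after pruning
tracked_files=$(git ls-files)                        # files in the latest commit
removed_history=$(git log --oneline -- DIR_1 DIR_2)  # should be empty
backup_refs=$(git for-each-ref refs/original)        # should be empty in the clone
echo "commits kept: $commit_count"
echo "$tracked_files"
```

In this toy case, only the two commits that touch DIR_A and DIR_B should survive, and the fresh clone should carry no refs/original entries.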


If for some reason you get an error regarding .git-rewrite, you can try this:

git clone MyHugeRepo/ MyABRepo.tmp/
cd MyABRepo.tmp
git filter-branch -d /tmp/git-rewrite.tmp --prune-empty --index-filter 'git rm --cached --ignore-unmatch DIR_1/* DIR_2/*' HEAD 
cd ..
git clone MyABRepo.tmp MyABRepo

This will create and use /tmp/git-rewrite.tmp as the temporary directory instead of .git-rewrite. You can substitute any other path for /tmp/git-rewrite.tmp, as long as you have write permission there and the directory does not already exist.

unutbu
The 'git filter-branch' manpage recommends creating a fresh clone of the rewritten repository instead of the last step mentioned above.
Jakub Narębski
@Jakub: Thanks for the correction.
unutbu
I tried this and got an error when it was trying to delete the .git-rewrite folder at the end.
MikeM
A: 

You could use git filter-branch --index-filter with git rm --cached to delete the unwanted directories from clones/copies of your original repository.

For example:

trim_repo() { : trim_repo src dst dir-to-trim-out...
  : uses printf %q: needs bash, zsh, or maybe ksh
  git clone "$1" "$2" &&
  (
    cd "$2" &&
    shift 2 &&

    : mirror original branches &&
    git checkout HEAD~0 2>/dev/null &&
    d=$(printf ' %q' "$@") &&
    git for-each-ref --shell --format='
      o=%(refname:short) b=${o#origin/} &&
      if test -n "$b" && test "$b" != HEAD; then 
        git branch --force --no-track "$b" "$o"
      fi
    ' refs/remotes/origin/ | sh -e &&
    git checkout - &&
    git remote rm origin &&

    : do the filtering &&
    git filter-branch \
      --index-filter 'git rm --ignore-unmatch --cached -r -- '"$d" \
      --tag-name-filter cat \
      --prune-empty \
      -- --all
  )
}
trim_repo MyHugeRepo MyABRepo DIR_1 DIR_2
trim_repo MyHugeRepo My12Repo DIR_A DIR_B

You will need to manually delete each repository’s unneeded branches or tags (e.g. if you had a feature-x-for-AB branch, then you probably want to delete that from the “12” repository).
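
A rough sketch of that manual cleanup, run here against a throwaway repository so it is safe to try. The branch name feature-x-for-AB comes from the example above; the tag name v1.0-AB, the scratch directory, and the demo identity are hypothetical.

```shell
set -e
work=$(mktemp -d)                        # scratch area (illustrative)
cd "$work"

# Toy stand-in for the filtered "12" repository.
git init -q My12Repo
cd My12Repo
git config user.name "Demo User"         # hypothetical identity for the demo commit
git config user.email "demo@example.com"
echo "placeholder" > file.txt
git add file.txt
git commit -q -m "Initial commit"
git branch feature-x-for-AB              # a branch that only matters to the AB repo
git tag v1.0-AB                          # hypothetical tag that only matters to the AB repo

# List every branch and tag so you can decide what to keep.
git for-each-ref --format='%(refname:short)' refs/heads refs/tags

# Delete the ones that do not belong in this repository.
git branch -D feature-x-for-AB           # -D also deletes unmerged branches
git tag -d v1.0-AB

leftover=$(git for-each-ref refs/heads/feature-x-for-AB refs/tags/v1.0-AB)
```

In a real repository, review the listing first; `git branch -D` discards the branch even if its commits are not merged anywhere else.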

Chris Johnsen
`:` is not a comment character in bash. You should use `#` instead.
Daenyth
@Daenyth, `:` is a traditional built-in command ([also specified in POSIX](http://www.opengroup.org/onlinepubs/009695399/utilities/colon.html)). It is included in *bash*, but it is not a comment. I specifically used it in preference to `#` because not all shells take `#` as a comment introducer in all contexts (e.g. interactive *zsh* without the INTERACTIVE_COMMENTS option enabled). Using `:` makes the whole text suitable for pasting into any interactive shell as well as saving in a script file.
Chris Johnsen
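
A tiny sketch of the `:` behaviour discussed in these comments. The variable name `greeting` is hypothetical and exists only for the demonstration.

```shell
# The null command discards its arguments and always succeeds.
: this text is passed as arguments and discarded
status=$?

# Caveat: the arguments are still expanded before being discarded,
# so expansions with side effects do take effect.
unset greeting
: "${greeting:=hello}"   # assigns a default value to greeting

echo "exit status: $status, greeting: $greeting"
```

Because the arguments are expanded, text used this way as a pseudo-comment should avoid command substitutions, redirections, and unbalanced quotes.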
A: 

Thanks for your answers, but I ended up just copying the repository twice and then deleting the files I didn't want from each copy. I am going to use filter-branch at a later date to strip out all the commits for the deleted files, since they are already version controlled elsewhere.

cp -R MyHugeRepo MyABRepo
cp -R MyHugeRepo My12Repo

cd MyABRepo/
rm -Rf DIR_1/ DIR_2/
git add -A
git commit -a

This worked for what I needed.

EDIT: Of course, the same thing was done in My12Repo for the DIR_A and DIR_B directories. This gave me two repos with identical history up to the point where I deleted the unwanted directories.
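
To make concrete what this copy-and-delete approach leaves behind, here is a small sketch on a throwaway repository. The toy layout, scratch directory, and demo identity are assumptions for illustration only.

```shell
set -e
work=$(mktemp -d)                        # scratch area (illustrative)
cd "$work"

# Toy stand-in for MyHugeRepo with one wanted and one unwanted directory.
git init -q MyHugeRepo
cd MyHugeRepo
git config user.name "Demo User"         # hypothetical identity for the demo commits
git config user.email "demo@example.com"
for d in DIR_A DIR_1; do
  mkdir "$d"
  echo "content of $d" > "$d/file.txt"
  git add "$d"
  git commit -q -m "Add $d"
done
cd ..

# Copy the repository and delete the unwanted directory in a new commit.
cp -R MyHugeRepo MyABRepo
cd MyABRepo
rm -Rf DIR_1/
git add -A
git commit -q -m "Remove DIR_1"

total=$(git rev-list --count HEAD)                          # all old commits plus the removal
touching=$(git log --oneline -- DIR_1 | wc -l | tr -d ' ')  # commits that touched DIR_1
old_content=$(git show HEAD~1:DIR_1/file.txt)               # deleted file is still retrievable
echo "total commits: $total, commits touching DIR_1: $touching"
```

All earlier commits remain, including those for the deleted directory, so the copies share the full history but do not get any smaller until a history rewrite such as filter-branch is run.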

MikeM
This does not preserve commit history.
Daenyth
How so? I still have all the history, even for the deleted files.
MikeM