I'm resurrecting a rather old code project, from when I was using CVS regularly, as a component in a new project that I've already been working on using git. I still have access to the CVS archive the old project's module is in, so I was just going to use git-cvsimport to get the commit history and go from there. However, this is just creating a new git repository inside of the current one. It's entirely possible I need to do this as a multistep process where I go CVS -> fresh git repository and then use something else to get it into the existing git repository.

Running this in newproj/newsubdir ($CVSROOT is already correctly set in my shell configuration):

git cvsimport -k -o master -u -s \- -A ~/Documents/cvs-authors.txt oldproj

gets me a brand new repository newproj/newsubdir/.git/ with all of the correct commits (comments, timestamps, history), and with HEAD where I want it.

What I want is for the historical CVS commits to be as if they were always in newproj/newsubdir/oldproj-file1, newproj/newsubdir/oldproj-file2, etc. In my experience, git has the magic to do this kind of thing, but I couldn't find an obvious fit to my situation.

+2  A: 

You have three options. All of them start with doing the clean cvsimport, so go ahead and do that.

  1. Reference that repo as a submodule.
  2. Fetch the repo into the existing repo and do a subtree merge to join the histories.
  3. Do something similar to #2, and then regraft the tree so as to interleave the commits chronologically throughout history.

Number one makes the outer project depend on the inner one as a separate repository, which is probably not what you want.

Number two is explained in this subtree merge howto. It might be good enough for you.
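For reference, the subtree merge in option 2 can be sketched roughly as follows. This is a minimal sketch rather than the howto's exact text: the repository names, file names and the newsubdir/ prefix are hypothetical stand-ins, and the throwaway repositories exist only so the commands run end to end. The --allow-unrelated-histories flag is an addition needed on Git 2.9 and later.

```shell
#!/bin/sh
set -e
# Fixed identity so the sketch runs without any global git config.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid

work=$(mktemp -d)
cd "$work"

# Hypothetical stand-in for the repository produced by git cvsimport.
git init -q oldproj-import
cd oldproj-import
echo 'legacy code' > oldproj-file1
git add oldproj-file1
git commit -q -m 'Imported from CVS'

# Hypothetical stand-in for the existing project.
cd "$work"
git init -q newproj
cd newproj
echo 'new code' > README
git add README
git commit -q -m 'Existing work'

# Subtree merge: fetch the import, record a merge without touching our
# own files, then read the imported tree in under newsubdir/.
git fetch -q ../oldproj-import HEAD
git merge -q -s ours --no-commit --allow-unrelated-histories FETCH_HEAD
git read-tree --prefix=newsubdir/ -u FETCH_HEAD
git commit -q -m 'Merge oldproj history into newsubdir/'
```

Note that with this approach the imported commits still record their files at the repository root; only the merge commit places them under newsubdir/.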


But if you like a nice clean linear history, you can do #3 and tangle them up for good. I did something similar in a cleanup project a while back and have a lot of the documentation and tools still there.

The basic idea was to separate all of the changes into a patch history that would reconstruct the changes. By default, this history is in a sort of repository order, but running the script I mentioned in the post will rearrange the patches into a new sequence in chronological order.

If the tree hash of the final commit is unchanged, you know the rewrite didn't break anything other than the lineage.
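As an illustration of that check, here is a small sketch that builds the same content through two different histories and compares the tree hashes. The repository, file names and the branch name rewritten are made up for the example; the only real point is the `git rev-parse <commit>^{tree}` comparison.

```shell
#!/bin/sh
set -e
# Fixed identity so the sketch runs without any global git config.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid

repo=$(mktemp -d)
cd "$repo"
git init -q .

# Original lineage: two separate commits.
echo one > a.txt
git add a.txt
git commit -q -m 'Add a'
echo two > b.txt
git add b.txt
git commit -q -m 'Add b'
original=$(git rev-parse HEAD)

# Rewritten lineage: the same final content in a single root commit.
# An orphan branch keeps the index and working tree as they are.
git checkout -q --orphan rewritten
git commit -q -m 'Add a and b in one commit'
rewritten=$(git rev-parse HEAD)

# The commit hashes differ, but the trees they point at should match.
original_tree=$(git rev-parse "$original^{tree}")
rewritten_tree=$(git rev-parse "$rewritten^{tree}")
echo "original tree:  $original_tree"
echo "rewritten tree: $rewritten_tree"
```

If the two tree hashes are equal, the rewritten history ends at exactly the same file contents as the original one.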

Were I to do this again, I'd possibly just emit a grafts file and do a filter-branch.

Dustin
D'oh, you answered as I was answering myself. Curse my impatience!
UltraNurd
A: 

I figured out how to do what I want, based on this answer for combining git repositories: use git filter-branch to make it look as if the module imported from CVS had been committed directly into the desired subdirectory of the existing git repository.

Starting from the directory containing newproj, the existing git repository:

% git cvsimport -k -u -s \- -A ~/Documents/cvs-authors.txt \
    -C newproj-sibling oldproj
% cd newproj-sibling
% git filter-branch --index-filter \
    'git ls-files -s | gsed "s-\t-&subdir/of/newproj/-" |
     GIT_INDEX_FILE=$GIT_INDEX_FILE.new \
     git update-index --index-info &&
     mv $GIT_INDEX_FILE.new $GIT_INDEX_FILE' HEAD
% cd ../newproj
% git pull ../newproj-sibling master

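Assuming the target subdirectory in the git repository is completely new, or at least contains no files that share names with those in the CVS module, the merge should go off without a hitch. On Git 2.9 and later, the pull refuses to merge unrelated histories by default, so it needs the --allow-unrelated-histories option.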
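To try the whole procedure without a CVS server, the following sketch replays it on throwaway repositories. It makes several assumptions: a small hand-made repository stands in for the git cvsimport output, the file names are invented, the tab character comes from printf instead of gsed, and the pull carries --allow-unrelated-histories and --no-edit for newer versions of Git.

```shell
#!/bin/sh
set -e
# Fixed identity so the sketch runs without any global git config.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid
# Skip the startup notice that newer versions of filter-branch print.
export FILTER_BRANCH_SQUELCH_WARNING=1

work=$(mktemp -d)
cd "$work"

# Hypothetical stand-in for the git cvsimport output.
git init -q newproj-sibling
cd newproj-sibling
echo 'v1' > oldproj-file1
git add oldproj-file1
git commit -q -m 'First CVS commit'
echo 'v2' > oldproj-file1
git commit -q -am 'Second CVS commit'

# Rewrite every commit so its files live under subdir/of/newproj/.
git filter-branch --index-filter '
  git ls-files -s |
    sed "s-$(printf "\t")-&subdir/of/newproj/-" |
    GIT_INDEX_FILE="$GIT_INDEX_FILE.new" git update-index --index-info &&
  mv "$GIT_INDEX_FILE.new" "$GIT_INDEX_FILE"' HEAD >/dev/null

# Hypothetical stand-in for the existing repository.
cd "$work"
git init -q newproj
cd newproj
echo 'new code' > README
git add README
git commit -q -m 'Existing work'

# Merge the rewritten history into the existing repository.
git pull -q --no-rebase --no-edit --allow-unrelated-histories \
    ../newproj-sibling HEAD
```

Unlike a plain subtree merge, every imported commit now records its files under subdir/of/newproj/, so path-limited commands such as git log -- subdir/of/newproj/oldproj-file1 can see the old commits.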

One caveat: I use gsed (GNU sed) above because the BSD sed that comes with OS X doesn't understand character escapes like \t, and I haven't bothered to alias it yet.
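If GNU sed isn't available, one possible workaround is to let printf produce the tab and hand it to sed as a literal character. The sketch below applies the same substitution to a single made-up line in git ls-files -s format; the blob hash and file name are only sample data.

```shell
#!/bin/sh
# Produce a literal tab without relying on sed understanding \t.
tab=$(printf '\t')

# Sample line in "git ls-files -s" format: mode, hash, stage, TAB, path.
line="100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0${tab}oldproj-file1"

# Same substitution as in the index filter, with the tab passed literally.
rewritten=$(printf '%s\n' "$line" | sed "s-${tab}-&subdir/of/newproj/-")
printf '%s\n' "$rewritten"
```

The path after the tab comes out as subdir/of/newproj/oldproj-file1, and this form should behave the same with both BSD and GNU sed.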

UltraNurd