I'm going to be working with other people on code from a project that uses CVS. We want to use a distributed VCS for our work and, when we finish (or perhaps every once in a while), commit our code and all of our revision history to CVS. We don't have write access to the project's CVS repo, so we can't commit very frequently. What tool can we use to export our revision history to CVS? We are currently thinking of using git or Mercurial, but we could use another distributed VCS if it would make the export easier.

+69  A: 

Fortunately for those of us who are still forced to use CVS, git provides pretty good tools to do exactly what you're wanting to do. My suggestions (and what we do here at $work):

Creating the Initial Clone

Use git cvsimport to clone the CVS revision history into a git repository. I use the following invocation:

% git cvsimport -d $CVSROOT -C dir_to_create -r cvs -k \
  -A /path/to/authors/file cvs_module_to_checkout

The -A option is optional but it helps to make your revision history that's imported from CVS look more git-like (see man git-cvsimport for more info on how this is set up).
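
For illustration, the authors file is a plain-text list with one `login=Full Name <email>` mapping per line. The sketch below writes such a file and checks that each line has that shape; the file name, logins, names and addresses are all placeholders made up for the example. (Note the comment further down this page: the man page advises against -A if you plan to export back with git-cvsexportcommit.)

```shell
# Write a hypothetical authors file for git cvsimport -A.
# Each line maps a CVS login to "Full Name <email>"; the logins,
# names and addresses below are placeholders, not real accounts.
authors_file="${AUTHORS_FILE:-./cvs-authors.txt}"

cat > "$authors_file" <<'EOF'
jdoe=Jane Doe <jdoe@example.com>
bsmith=Bob Smith <bsmith@example.com>
EOF

# Sanity check: grep -v prints (and succeeds on) any line that does
# NOT look like login=Name <email>.
if grep -v -E '^[^=]+=.+ <[^<>]+>$' "$authors_file"; then
  echo "malformed line(s) in $authors_file" >&2
else
  echo "authors file OK: $authors_file"
fi
```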

Depending on the size and history of the CVS repository, this first import will take a VERY long time. You can add a -v to the above command if you want the peace of mind that something is in fact happening.

Once this process is completed, you will have a master branch that should reflect CVS's HEAD (with the exception that git cvsimport by default ignores the last 10 minutes worth of commits to avoid catching a commit that is half-finished). You can then use git log and friends to examine the entire history of the repository just as if it had been using git from the beginning.

Configuration Tweaks

There are a few configuration tweaks that will make incremental imports from CVS (as well as exports) easier in the future. These are not documented on the git cvsimport man page so I suppose they could change without notice but, FWIW:

% git config cvsimport.module cvs_module_to_checkout
% git config cvsimport.r cvs
% git config cvsimport.d $CVSROOT

All of these options can be specified on the command line so you could safely skip this step.

Incremental Imports

Subsequent git cvsimport runs should be much faster than the first invocation. Each run does, however, do a cvs rlog on every directory (even those that have only files in Attic), so it can still take a few minutes. If you've specified the suggested configs above, all you need to do is execute:

% git cvsimport

If you haven't set up your configs to specify the defaults, you'll need to specify them on the command line:

% git cvsimport -r cvs -d $CVSROOT cvs_module_to_checkout

Either way, two things to keep in mind:

  1. Make sure you're in the root directory of your git repository. If you're anyplace else, it will try to do a fresh cvsimport that will again take forever.
  2. Make sure you're on your master branch so that the changes can be merged (or rebased) into your local/topic branches.
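
Both precautions can be checked mechanically before the import runs. The following is only a sketch: `ready_for_cvsimport` and `safe_cvsimport` are hypothetical helper names, and the wrapper assumes a git version that supports `git symbolic-ref --short`.

```shell
# Pure check: succeeds only when the current directory is the repository
# root and the checked-out branch is master.
ready_for_cvsimport() {
  toplevel=$1; cwd=$2; branch=$3
  if [ "$toplevel" != "$cwd" ]; then
    echo "not in repository root ($toplevel)" >&2
    return 1
  fi
  if [ "$branch" != "master" ]; then
    echo "on branch '$branch', expected master" >&2
    return 1
  fi
  return 0
}

# Hypothetical wrapper: gather the real values from git, then run the
# import only if both checks pass. Extra arguments go to git cvsimport.
safe_cvsimport() {
  ready_for_cvsimport \
    "$(git rev-parse --show-toplevel)" \
    "$(pwd -P)" \
    "$(git symbolic-ref --short HEAD)" &&
  git cvsimport "$@"
}
```

With these defined, running `safe_cvsimport` in place of `git cvsimport` refuses to start from a subdirectory or a topic branch.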

Making Local Changes

In practice, I recommend always making changes on branches and only merging to master when you're ready to export those changes back to the CVS repository. You can use whatever workflow you like on your branches (merging, rebasing, squashing, etc) but of course the standard rebasing rules apply: don't rebase if anyone else has been basing their changes on your branch.

Exporting Changes to CVS

The git cvsexportcommit command allows you to export a single commit out to the CVS server. You can specify a single commit ID (or anything that describes a specific commit as defined in man git-rev-parse). A diff is then generated, applied to a CVS checkout and then (optionally) committed to CVS using the actual cvs client.

You could export each micro commit on your topic branches but generally I like to create a merge commit on an up-to-date master and export that single merge commit to CVS. When you export a merge commit, you have to tell git which commit parent to use to generate the diff. Also, this won't work if your merge was a fast-forward (see the "HOW MERGE WORKS" section of man git-merge for a description of a fast-forward merge) so you have to use the --no-ff option when performing the merge. Here's an example:

# on master
% git merge --no-ff --log -m "Optional commit message here" topic/branch/name
% git cvsexportcommit -w /path/to/cvs/checkout -u -p -c ORIG_HEAD HEAD

You can see what each of those options means on the man page for git-cvsexportcommit. You do have the option of setting the -w option in your git config:

% git config cvsexportcommit.cvsdir /path/to/cvs/checkout

If the patch fails for whatever reason, my experience is that you'll (unfortunately) probably be better off copying the changed files over manually and committing using the cvs client. This shouldn't happen, however, if you make sure master is up-to-date with CVS before merging your topic branch in.
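
If you do end up copying files by hand, a small helper can at least keep the directory layout intact. This is a rough sketch, not part of git: `copy_changed_files` is a hypothetical function that reads relative paths on stdin and copies each one from the git working tree into the CVS checkout.

```shell
# Copy files listed on stdin (paths relative to src_root) into dest_root,
# creating intermediate directories as needed.
copy_changed_files() {
  src_root=$1; dest_root=$2
  while IFS= read -r path; do
    [ -n "$path" ] || continue            # skip blank lines
    mkdir -p "$dest_root/$(dirname "$path")"
    cp "$src_root/$path" "$dest_root/$path"
    echo "copied $path"
  done
}
```

One way to feed it, assuming ORIG_HEAD still points at the pre-merge master, is `git diff --name-only --diff-filter=AM ORIG_HEAD HEAD | copy_changed_files . /path/to/cvs/checkout`. Newly added files would still need a `cvs add`, and deleted files a `cvs remove`, before committing.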

If the commit fails for whatever reason (network/permissions issues, etc), you can take the command printed to your terminal at the end of the error output and execute it in your CVS working directory. It usually looks something like this:

% cvs commit -F .msg file1 file2 file3 etc

The next time you do a git cvsimport (waiting at least 10 minutes) you should see the patch of your exported commit re-imported into your local repository. They will have different commit IDs since the CVS commit will have a different timestamp and possibly a different committer name (depending on whether you set up an authors file in your initial cvsimport above).

Cloning your CVS clone

If you have more than one person needing to do the cvsimport, it would be more efficient to have a single git repository that performs the cvsimport and have all the other repositories created as a clone. This works perfectly and the cloned repository can perform cvsexportcommits just as described above. There is one caveat, however. Due to the way CVS commits come back through with different commit IDs (as described above), you don't want your cloned branch to track the central git repository. By default, this is how git clone configures your repository but this is easily remedied:

% git clone [CENTRAL_REPO_HERE]
% cd [NEW_GIT_REPO_DIR_HERE]
% git config --unset branch.master.remote
% git config --unset branch.master.merge

After you've removed these configs, you will have to explicitly say where and what to pull from when you want to pull in new commits from the central repository:

% git pull origin master


Overall, I've found this workflow to be quite manageable and the "next best thing" when migrating completely to git isn't practical.

Brian Phillips
Thank you. This has helped me immensely, especially the tip about using a non-fastforward merge to bundle up changes for CVS into a single commit.
skiphoppy
For the record, the -A option you suggest using for git-cvsimport is mentioned in the manpage as being "not recommended ... if you intend to export changes back to CVS again later with git-cvsexportcommit(1)." In my case, I actually like the way the authors come out as is.
skiphoppy
+1 for concise answer, usable to us.
Thorbjørn Ravn Andersen
One thing is a little unclear about the instructions: "Make sure you're on your master branch so that the changes can be merged (or rebased) into your local/topic branches." Doesn't the -r option ensure that the changes get put into the cvs/master branch? If so, why does it matter which branch I'm currently on when I do more cvsimports? The docs for cvsimport are a bit thin; they don't give any hint that the current directory at the time of import makes a difference. I wonder what other important info I might be missing.
Dave Abrahams
The reason one wants to be on the master branch when doing a cvsimport is that it's analogous to doing a `git pull` (vs. a `git fetch`). It not only fetches revisions from CVS, it merges them into the current branch. Importing CVS revisions into the master branch allows easy exporting later when you merge your topic branch and do a CVS export of the merge commit as described.
Brian Phillips
Thanks for the tips. One handy thing I've done is add "cvs = !git cvsimport -k -a" under [alias] in my .gitconfig. This makes it so that "git cvs" will DTRT (from top of tree).
bstpierre
+2  A: 

In addition to Brian Phillips' answer: there is also git-cvsserver, which functions like a CVS server but actually accesses a git repository... but it has some limitations.

Jakub Narębski
+4  A: 

You should not trust cvsimport blindly; check that the imported tree matches what is in the CVS repo. I did this by sharing the new project using Eclipse's CVS plug-in and found that there were inconsistencies.

Two commits that were made less than a minute apart with the same commit message (in order to restore a wrongfully deleted file) were grouped into one big commit, which resulted in a file missing from the tree.

I was able to solve this problem by modifying the 'fuzz' parameter to less than a minute.

example:

% git cvsimport -d $CVSROOT -C dir_to_create -r cvs -k \
  -A /path/to/authors/file cvs_module_to_checkout -z 15

bottom line: check your tree after importing
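
One way to do that check without Eclipse is a recursive diff between a fresh CVS checkout and the git working tree. This is only a sketch: `compare_trees` is a hypothetical helper, and it assumes a diff that supports `-x` (GNU and BSD diff both do). Since the import above used -k, expanded CVS keywords in the checkout may show up as differences; checking out with `cvs checkout -kk` might reduce that noise.

```shell
# Compare a fresh CVS checkout with the git working tree, ignoring each
# tool's bookkeeping directories. Returns diff's exit status:
# 0 = trees identical, 1 = differences found.
compare_trees() {
  cvs_checkout=$1; git_worktree=$2
  diff -r -q -x CVS -x .git "$cvs_checkout" "$git_worktree"
}
```

For example, `compare_trees /path/to/cvs/checkout /path/to/git/repo` lists any file that differs or exists on only one side.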

shil88