I've got two different exports of our CVS repository into git. They diverge at some point, and I'm doing some investigation into why. The development line goes back several years and over tens of thousands of commits.

At the beginning of the development line, the SHA1 IDs for each commit are identical, telling me that git-cvsimport is very consistent about what it is doing when it reads the results of cvsps and imports.

But sometime between the first commit and yesterday, the SHA1 IDs begin to diverge. I'd like to find out where this is by comparing a list of commit IDs from each repository and looking to see what's missing. Are there any good tools or techniques for doing this?

+3  A: 

The obvious way would be to clone one repository, fetch the tip of the main branch of the other repository into the clone and use git merge-base on the two tips to find the common ancestor.
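A minimal sketch of that approach, assuming the other export is reachable as a local path or URL and that the main branch is called master in both. The function name `find_common_ancestor` is made up for illustration.

```shell
# find_common_ancestor CLONE OTHER_URL [BRANCH]
# Fetches BRANCH of the other export into CLONE and prints the best
# common ancestor of the two branch tips.
find_common_ancestor() {
    clone=$1
    other_url=$2
    branch=${3:-master}   # assumption: same branch name in both exports

    # Fetch the other export's branch tip; FETCH_HEAD then points at it.
    git -C "$clone" fetch --quiet "$other_url" "$branch" || return 1

    # Print the most recent commit reachable from both tips.
    git -C "$clone" merge-base "$branch" FETCH_HEAD
}
```

Called as `find_common_ancestor /path/to/clone-of-a /path/to/export-b`, this should print the last commit the two imports still agree on; the commits right after it in each history are the ones to inspect.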

Charles Bailey
+3  A: 

Why not just dump each repo's git log to its own file and compare the two? Since git log lists the newest commits first, the difference closest to the bottom of the files is where the histories started diverging.

Jason Punyon
This is actually what I wound up doing yesterday, although today I'm getting a little more advanced with some suggestions from the other answers.
skiphoppy
+9  A: 

Since the two git repositories start out the same, you can fetch both into the same working repository and compare them there.

$ git remote add cvsimport-a git://.../cvsimport-a.git
$ git remote add cvsimport-b git://.../cvsimport-b.git
$ git remote update
$ git log cvsimport-a/master..cvsimport-b/master  # in B, not in A?
$ git log cvsimport-b/master..cvsimport-a/master  # in A, not in B?
ephemient
You're probably going to need cvsimport-a/master and cvsimport-b/master in those commands. Also, you can do "git remote update" instead of piping to xargs git fetch.
Brian Campbell
I was originally going to write `git fetch cvsimport-a; git fetch cvsimport-b` (not updating any other remotes), but then I changed my mind... done. Git complains a little if you leave off /master but it works. I'll change it for clarity.
ephemient
+5  A: 

In one of your repos, do the following:

$ git remote add other-repo git://.../other-repo.git
$ git remote update
$ git log other-repo/master...master
  # or maybe:
$ gitk master...other-repo/master

The three-dot range is the symmetric difference of the two histories: it shows every commit that is reachable from one branch but not from the other.
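To tell which side each of those commits belongs to, the `--left-right` option may help. The sketch below assumes it is run inside the repository that has both histories available; the two function names are made up for illustration.

```shell
# count_unique_commits LEFT RIGHT
# Prints "<only-in-LEFT><TAB><only-in-RIGHT>" for the symmetric difference.
count_unique_commits() {
    git rev-list --left-right --count "$1...$2"
}

# list_unique_commits LEFT RIGHT
# One line per commit; "<" marks commits only in LEFT, ">" only in RIGHT.
list_unique_commits() {
    git log --left-right --oneline "$1...$2"
}
```

For example, `count_unique_commits master other-repo/master` gives a quick sense of how far the two imports have drifted before reading the full log.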

Brian Campbell
A: 

You can fetch both repositories into a single repository, either a new one or one of the existing ones, as in the responses by ephemient, Charles Bailey and Brian Campbell. Or you can use a trick from the Tips and Tricks page on the Git Wiki, or to be more exact, the "How to compare two local repositories" question there.

But it might be simpler to generate a list of the commits' SHA-1 IDs in topological order using "git rev-list" (or "git log" with appropriate options) for each repository, and look for the first revision at which the two lists differ.
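A rough sketch of that comparison, assuming both exports are available as local paths and use the same branch name. The function name `first_divergence` and the output labels are made up for illustration, and the line-by-line pairing is only meaningful while both histories list their shared commits in the same order.

```shell
# first_divergence REPO_A REPO_B [BRANCH]
# Walks both histories oldest-first and prints the last shared commit ID
# followed by the first pair of IDs that differ.
first_divergence() {
    repo_a=$1
    repo_b=$2
    branch=${3:-master}   # assumption: same branch name in both exports
    list_a=$(mktemp) || return 1
    list_b=$(mktemp) || return 1

    # Oldest commit first, parents always listed before their children.
    git -C "$repo_a" rev-list --reverse --topo-order "$branch" > "$list_a"
    git -C "$repo_b" rev-list --reverse --topo-order "$branch" > "$list_b"

    # Pair the lists up line by line and stop at the first mismatch.
    paste "$list_a" "$list_b" | awk -F'\t' '
        $1 != $2 { print "last common: " prev
                   print "first in A:  " $1
                   print "first in B:  " $2
                   found = 1; exit }
        { prev = $1 }
        END { if (!found) print "no divergence found" }'

    rm -f "$list_a" "$list_b"
}
```

An empty value after "first in A:" or "first in B:" would mean that one history is simply a prefix of the other.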

Jakub Narębski