I have a project that has existed in two SVN repositories. The second SVN repo was created simply by adding the repos from a checkout of the old SVN repository without SCM info stripped. The contents of the files are byte-identical, but there is no associated SCM metadata.

I have taken the new SVN repo and ported it into a Git repo via git-svn. Now I would like to import the old repo and somehow get it to link to the new repo so I can see the history across both. Is there a simple way to do this without hand-stitching the two repos together?

+5  A: 

First, create a graft point to attach the two histories. Then run git filter-branch over the repository to make the change permanent. Note that this will change the commit IDs of all commits downstream of the graft.

bdonlan
To be clear, you make two different git-svn repos, add both as a remote so all of the commits end up in one place. The graft point makes the parent of the oldest commit of the newer series the newest commit of the older series. You'll see it in log before you filter-branch.
Dustin
kan U plz show me the codez? (just trying to make the answers better)
shemnon
see my reply, and reply in similar question: http://stackoverflow.com/questions/1457248/how-do-i-re-play-my-commits-of-a-local-git-repo-on-top-of-a-project-i-forked-on/1458725#1458725
Jakub Narębski
+9  A: 

See also: How do I re-play my commits of a local git repo, on top of a project I forked on github.com? question (and my answer there), although the situation is slightly different, I think.


You have at least three possibilities:

  • Use grafts to join the two histories, but do not rewrite history. This means that you (and anybody who has the same grafts) would have the full history, while other users would have a smaller repository. This also avoids the problems with rewritten history if somebody has already started working on top of the converted repository with the shorter history.

  • Use grafts to join the two histories, check that the result is correct using "git log" or "gitk" (or another git history browser/viewer), then rewrite history using git filter-branch; after that you can remove the grafts file. This means that everybody who clones (fetches) from the rewritten repository would get the full, joined history. But rewriting history is a big no-no if somebody has already based work on the converted short-history repository (this case might not apply to you).

  • Use git replace to join the two histories. This would allow people to select whether they want the full history or just the current history, by choosing to fetch refs/replace/ (then they get the full history) or not (then they get the short history). Unfortunately, this currently requires a not yet released version of git: either the development ('master') version or one of the release candidates for 1.6.5. The refs/replace/ hierarchy is planned for the upcoming git version 1.6.5.


Below are step-by-step instructions for all of those methods: grafts (local), rewriting history using grafts, and refs/replace/.

In all cases I assume that you have both the current and the historical history in a single repository (you can add history from another repository using git remote add). I also assume that (one of) the branches in the short-history repository is named 'master', and that the branch (commit) of the historical repository where you want to attach the current history is called 'history'. You would have to substitute your own branch names (or commit IDs).
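A minimal sketch of getting both histories into one repository, run against two throwaway repositories. The directory names, the remote name 'old' and the demo identity are assumptions made for illustration; in practice the two repositories would be your two git-svn conversions.

```shell
set -e
work=$(mktemp -d)
# Throwaway identity so the demo commits can be created anywhere.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical stand-ins for the two git-svn conversions.
for name in old new; do
    git init -q "$work/$name"
    git -C "$work/$name" symbolic-ref HEAD refs/heads/master
    echo "$name" > "$work/$name/file.txt"
    git -C "$work/$name" add file.txt
    git -C "$work/$name" commit -q -m "commit in $name repository"
done

# Fetch the old history into the new repository under the remote name 'old'.
cd "$work/new"
git remote add old "$work/old"
git fetch -q old

# Give the tip of the old history the local name 'history'.
git branch history old/master
git rev-parse --verify history^0
```

After this, 'master' and 'history' live side by side in one repository, still as two unrelated histories.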

Finding commit to attach (root of short history)

First you have to find (the SHA-1 identifier of) the commit in the short history that you want to attach to the full history. It would be the first commit in the short history, i.e. the root commit (the commit without any parents).

There are two ways of finding it. If you are sure that you do not have any other root commit, you can find the last (bottommost) commit in topological order, using:

$ git rev-list --topo-order master | tail -n 1

(where tail -n 1 is used to get the last line of the output; if you don't have tail, just read the last line of the output yourself).

If there is a possibility of multiple root commits, you can find all parentless commits using the following one-liner:

$ git rev-list --parents master | grep -v ' '

(where grep -v ' ', with a single space between the quotes, filters out every commit that has at least one parent). If more than one commit is listed, inspect each of them (using e.g. "git show <commit>") and select the one that you want to attach to the earlier history.

Let's call this commit TAIL. You can save it in a shell variable using (assuming that the simpler method works for you):

$ TAIL=$(git rev-list --topo-order master | tail -n 1)

In the description below I use $TAIL to mean that you have to substitute the SHA-1 of the bottommost commit in the current (short) history... or let the shell do the substitution for you.
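To see that the two ways of finding the root commit agree, here is a small sketch on a throwaway repository with three linear commits (the repository, file name and commit messages are made up for the demo):

```shell
set -e
work=$(mktemp -d)
# Throwaway identity so the demo commits can be created anywhere.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/master
for n in 1 2 3; do
    echo "$n" > file.txt
    git add file.txt
    git commit -q -m "commit $n"
done

# Method 1: last commit in topological order.
TAIL=$(git rev-list --topo-order master | tail -n 1)
# Method 2: every commit whose --parents line has no parent column.
ROOTS=$(git rev-list --parents master | grep -v ' ')

echo "TAIL=$TAIL"
echo "ROOTS=$ROOTS"
# Show which commit was picked.
git show -s --format='%h %s' "$TAIL"
```

With a single root, both variables hold the same SHA-1; with several roots, ROOTS would contain one line per root and you would have to choose among them.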

Finding commit to attach to (top of historical repository)

This part is simple: we have to convert the symbolic name of a commit into its SHA-1 identifier. We can do this using "git rev-parse":

$ git rev-parse --verify history^0

(where 'history^0' is used in place of 'history' just in case 'history' is a tag; we need the SHA-1 of a commit, not of a tag object). As with the commit to attach, let's name this commit ID TOP. You can save it in a shell variable using

$ TOP=$(git rev-parse --verify history^0)

Joining history using grafts file

The grafts file, located in .git/info/grafts (you need to create this file if it doesn't exist and you want to use this mechanism), is used to replace the parent info for a commit. It is a line-based format, where each line contains the SHA-1 of the commit we want to modify, followed by a space-separated list of zero or more commits we want that commit to have as parents; this is the same format that "git rev-list --parents <revision>" outputs.

We want the $TAIL commit, which doesn't have any parents, to have $TOP as its single parent. So the info/grafts file should contain a line with the SHA-1 of the $TAIL commit, followed by a space and the SHA-1 of the $TOP commit. You can use the following one-liner for this (see also the examples in the git filter-branch documentation):

$ echo "$TAIL $TOP" >> .git/info/grafts

Now you should check, using "git log", "git log --graph", "gitk" or other history browser that you joined histories correctly.
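The graft can be tried out safely on a throwaway repository first. The sketch below builds two unrelated histories (a two-commit 'history' branch and a one-commit 'master'; all names and contents are made up) and then checks the joined result. Newer Git versions print a deprecation hint for the grafts file, which this sketch assumes is still honored.

```shell
set -e
work=$(mktemp -d)
# Throwaway identity so the demo commits can be created anywhere.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q "$work/repo"
cd "$work/repo"

# Old history: two commits on 'history'.
git symbolic-ref HEAD refs/heads/history
for n in 1 2; do
    echo "old $n" > file.txt
    git add file.txt
    git commit -q -m "old $n"
done

# New history: an unrelated root commit on 'master'.
git checkout -q --orphan master
echo "new 1" > file.txt
git add file.txt
git commit -q -m "new 1"

TAIL=$(git rev-list --topo-order master | tail -n 1)
TOP=$(git rev-parse --verify history^0)

# Record the graft: $TAIL gets $TOP as its only parent.
mkdir -p .git/info
echo "$TAIL $TOP" >> .git/info/grafts

# The root of 'master' now lists a parent, and the log runs through both histories.
git rev-list --parents -n 1 "$TAIL"
git log --oneline master
```

Nothing has been rewritten at this point: deleting the line from .git/info/grafts brings back the two separate histories.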

Rewriting history according to grafts file

Please note that this would change history!

To make the history recorded in the grafts file permanent, it is enough to use "git filter-branch" to rewrite the branches you need. If only a single branch needs to be rewritten ('master'), it can be as simple as:

$ git filter-branch $TOP..master

(this processes only the minimal set of commits). If more branches are affected by joining the histories, you can simply use

$ git filter-branch --all

Now you can delete the grafts file. Check that everything is as you wanted, and remove the backup in refs/original/ (see the documentation for "git filter-branch" for details).
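Here is a sketch of the whole rewrite on a throwaway repository, to check that the join survives once the grafts file is gone. The repository layout is made up, and FILTER_BRANCH_SQUELCH_WARNING is only there to skip the long warning that newer Git versions print before filter-branch starts.

```shell
set -e
work=$(mktemp -d)
# Throwaway identity so the demo commits can be created anywhere.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
export FILTER_BRANCH_SQUELCH_WARNING=1

git init -q "$work/repo"
cd "$work/repo"

# Old history on 'history', unrelated new history on 'master'.
git symbolic-ref HEAD refs/heads/history
for n in 1 2; do
    echo "old $n" > file.txt
    git add file.txt
    git commit -q -m "old $n"
done
git checkout -q --orphan master
echo "new 1" > file.txt
git add file.txt
git commit -q -m "new 1"

TAIL=$(git rev-list --topo-order master | tail -n 1)
TOP=$(git rev-parse --verify history^0)
mkdir -p .git/info
echo "$TAIL $TOP" >> .git/info/grafts

# Bake the grafted parent into real commit objects.
git filter-branch "$TOP"..master

# The graft is no longer needed; the join is now part of the commits themselves.
rm .git/info/grafts
git log --oneline master

# Once the result has been checked, drop the backup ref left by filter-branch.
git update-ref -d refs/original/refs/heads/master
```

The rewritten 'master' has new commit IDs from the former root commit onward, which is exactly why this should not be done to history others have already built on.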

Using refs/replace/ mechanism

This is an alternative to the grafts file. It has the advantage that it is transferable, so if you published the short history and cannot rewrite it (because others based their work on the short history), then using refs/replace/ might be a good solution... well, at least when git version 1.6.5 gets released.

The refs/replace/ mechanism operates differently from the grafts file: instead of modifying parent information, you replace objects. So first you have to create a commit object which has the same properties as $TAIL, but has $TOP as a parent.

We can use

$ git cat-file commit $TAIL > TAIL_COMMIT

(the name of the temporary file is only an example). Now you need to edit the 'TAIL_COMMIT' file, which would look like this:

tree 2b5bfdf7798569e0b59b16eb9602d5fa572d6038
author Joe R Hacker  1112911993 -0700
committer Joe R Hacker  1112911993 -0700

Initial revision of "project", after moving to new repository

Now you need to add $TOP as parent, by putting line with "parent $TOP" (where $TOP has to be expanded to SHA-1 id!) between 'tree' header and 'author' header. After editing 'TAIL_COMMIT' should look like this:

tree 2b5bfdf7798569e0b59b16eb9602d5fa572d6038
parent 0f6592e3c2f2fe01f7b717618e570ad8dff0bbb1
author Joe R Hacker  1112911993 -0700
committer Joe R Hacker  1112911993 -0700

Initial revision of "project", after moving to new repository

If you want, you can edit commit message.

Now you need to use git hash-object to create the new commit in the repository. You need to save the result of this command, which is the SHA-1 of the new commit object, for example like this:

$ NEW_TAIL=$(git hash-object -t commit -w TAIL_COMMIT)

(where '-w' option is here to actually write object to repository).

Finally use git replace to replace $TAIL by $NEW_TAIL:

$ git replace $TAIL $NEW_TAIL

Now what is left is to check (using "git log" or some other history viewer) that the history is correct.
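If you would rather not edit the commit file by hand, the same steps can be scripted. This is only a sketch on a throwaway repository (layout and names are made up); it uses awk to insert the "parent" line after the first line, assuming that the first line of a commit object is always the 'tree' header.

```shell
set -e
work=$(mktemp -d)
# Throwaway identity so the demo commits can be created anywhere.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q "$work/repo"
cd "$work/repo"

# Old history on 'history', unrelated new history on 'master'.
git symbolic-ref HEAD refs/heads/history
for n in 1 2; do
    echo "old $n" > file.txt
    git add file.txt
    git commit -q -m "old $n"
done
git checkout -q --orphan master
echo "new 1" > file.txt
git add file.txt
git commit -q -m "new 1"

TAIL=$(git rev-list --topo-order master | tail -n 1)
TOP=$(git rev-parse --verify history^0)

# Copy the root commit and add "parent $TOP" right after the 'tree' header.
git cat-file commit "$TAIL" |
    awk -v parent="$TOP" 'NR == 1 { print; print "parent " parent; next } { print }' > TAIL_COMMIT

# Store the edited commit and make it stand in for the original root.
NEW_TAIL=$(git hash-object -t commit -w TAIL_COMMIT)
git replace "$TAIL" "$NEW_TAIL"
rm TAIL_COMMIT

git log --oneline master
```

Newer Git releases also offer git replace --graft "$TAIL" "$TOP", which should build an equivalent replacement commit in one step; whether it is available depends on your Git version.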

Now anybody who wants to have full history needs to add '+refs/replace/*:refs/replace/*' as one of pull refspecs.
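A sketch of what that looks like from the side of somebody cloning the repository. The 'origin' repository here is a throwaway built for the demo, and the replacement is created with git replace --graft, which assumes a Git version newer than the one this answer was written for.

```shell
set -e
work=$(mktemp -d)
# Throwaway identity so the demo commits can be created anywhere.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical published repository with both histories and a replacement.
git init -q "$work/origin"
cd "$work/origin"
git symbolic-ref HEAD refs/heads/history
for n in 1 2; do
    echo "old $n" > file.txt
    git add file.txt
    git commit -q -m "old $n"
done
git checkout -q --orphan master
echo "new 1" > file.txt
git add file.txt
git commit -q -m "new 1"
TAIL=$(git rev-list --topo-order master | tail -n 1)
TOP=$(git rev-parse --verify history^0)
git replace --graft "$TAIL" "$TOP"

# A plain clone does not fetch refs/replace/, so it sees the short history.
git clone -q "$work/origin" "$work/clone"
cd "$work/clone"
before=$(git rev-list --count master)

# Opt in to the full history by adding the refspec and fetching again.
git config --add remote.origin.fetch '+refs/replace/*:refs/replace/*'
git fetch -q origin
after=$(git rev-list --count master)

echo "commits on master before: $before, after: $after"
```

People who never add the refspec keep seeing only the short history, and nobody's commit IDs change.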

Final note: I have not checked this solution, so YMMV

Jakub Narębski
also there's the possibility to use `git rebase` instead of grafting and then filtering
knittl
The solution using `git rebase --root --onto` is possible only if the fragment of new history you want to attach (append) to older history does not contain any merges, and is feasible only if said new history is not too long.
Jakub Narębski
I gave this one the check because it included code samples. #showMeTehCodez
shemnon