I have a repository that I have already cloned from SVN. I've been doing some work in this repository in its Git form, and I would hate to lose that structure by cloning again. However, when I originally cloned the repository, I failed to correctly specify the svn.authors property (or a semantically similar option). Is there any way I can specify the SVN author mappings now that the repository is fully Git-ified? Preferably, I would like to correct all of the old commit authors so that they show the Git author rather than the raw SVN username.

+2  A: 

You probably want to look into git-filter-branch, specifically the --commit-filter option. This command is a powerful chainsaw that can rewrite your entire repository history, changing whatever you might want to change.

Note that once you do this, everyone should make fresh clones from the updated repository, since the SHA-1 hash of every rewritten commit (and of every commit descended from one) will have changed.

Greg Hewgill
+3  A: 

git filter-branch can be used to rewrite large chunks of history.

In this case, you would probably do something like (totally untested):

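git filter-branch --env-filter '
    # Anchor each pattern so that only an exact SVN username is replaced.
    GIT_AUTHOR_NAME=$(echo "${GIT_AUTHOR_NAME}" | sed -e "s/^svnname1\$/Right Name/; s/^svnname2\$/Correct Name/")
    GIT_COMMITTER_NAME=$(echo "${GIT_COMMITTER_NAME}" | sed -e "s/^svnname1\$/Right Name/; s/^svnname2\$/Correct Name/")
    # Replace the whole address (assumes the old email has the form svnname@something).
    GIT_AUTHOR_EMAIL=$(echo "${GIT_AUTHOR_EMAIL}" | sed -e "s/^svnname1@.*/[email protected]/; s/^svnname2@.*/[email protected]/")
    GIT_COMMITTER_EMAIL=$(echo "${GIT_COMMITTER_EMAIL}" | sed -e "s/^svnname1@.*/[email protected]/; s/^svnname2@.*/[email protected]/")
    export GIT_AUTHOR_NAME GIT_COMMITTER_NAME GIT_AUTHOR_EMAIL GIT_COMMITTER_EMAIL
'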
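
Since the command is untested, it may be worth trying the substitution on its own before rewriting any history. The following is only a sketch: map_name is a hypothetical helper, not part of Git, and the usernames are the same placeholders as above. It uses anchored patterns so that a name which merely contains an SVN username is left alone.

```shell
#!/bin/sh
# Hypothetical helper (not part of Git): applies the same kind of sed
# substitution the env-filter would, so the mapping can be checked
# without touching the repository.
map_name() {
    printf '%s\n' "$1" |
        sed -e 's/^svnname1$/Right Name/; s/^svnname2$/Correct Name/'
}

map_name "svnname1"       # prints: Right Name
map_name "svnname2"       # prints: Correct Name
map_name "someone-else"   # prints: someone-else (no mapping, left unchanged)
map_name "svnname10"      # prints: svnname10 (anchors prevent a partial match)
```

If the output looks right for every name you care about, the same sed expression can go into the filter.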

As always, the following applies: in order to rewrite history, you need a conspiracy.

Jörg W Mittag
+15  A: 

Start out by seeing what you've got to clean up:

git shortlog -s

For each one of those names, create an entry in a script that looks like this (assuming you want all the authors and committers to be the same):

#!/bin/sh

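git filter-branch --env-filter '

# Default to the existing values so that unlisted authors keep their identity.
n=$GIT_AUTHOR_NAME
m=$GIT_AUTHOR_EMAIL

# One entry per name reported by git shortlog.
case "$GIT_AUTHOR_NAME" in
        user1) n="User One" ; m="[email protected]" ;;
        "User Two") n="User Two" ; m="[email protected]" ;;
esac

# The committer is deliberately set to the same identity as the author.
export GIT_AUTHOR_NAME="$n"
export GIT_AUTHOR_EMAIL="$m"
export GIT_COMMITTER_NAME="$n"
export GIT_COMMITTER_EMAIL="$m"
'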

That's basically the script I used for a large rewrite recently that was very much as you described (except I had large numbers of authors).
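
With a large number of authors, typing the case entries by hand gets tedious. If you already have (or can write) a git-svn style authors file with lines of the form "svnname = Full Name <email>", the entries could be generated from it. This is only a sketch: authors_to_case is a hypothetical helper, the example.com addresses are made up, and it assumes names contain no double quotes.

```shell
#!/bin/sh
# Hypothetical helper: reads authors-file lines of the form
#   svnname = Full Name <email>
# from the given files (or stdin) and prints one case entry per line.
# Lines that do not match that form are skipped.
authors_to_case() {
    sed -n 's/^\([^=]*[^ =]\) *= *\(.*[^ ]\) *<\([^>]*\)>.*$/"\1") n="\2" ; m="\3" ;;/p' "$@"
}

# Example with made-up data:
printf '%s\n' \
    'user1 = User One <one@example.com>' \
    'user2 = User Two <two@example.com>' |
    authors_to_case
# prints:
# "user1") n="User One" ; m="one@example.com" ;;
# "user2") n="User Two" ; m="two@example.com" ;;
```

The printed lines can be pasted between the case and esac lines of the script above.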

Edit: User π pointed out a quoting problem in my script. Thanks!

Dustin
Should be export GIT_AUTHOR_NAME="$n" or only the author's first name will end up in the commit!
pi