I received some source code and decided to put it under git, since my co-worker had been versioning it with the mkdir $VERSION approach. While the past of the code currently seems unimportant, I'd still like to put it under git control as well, to better understand the development process. So:

What is a convenient way to put those past versions into my already existing git repo? There is currently no remote repo, so I don't mind rewriting history, but a solution that takes remote repositories into account would of course be preferred unless it is much more complicated. Bonus points for a script that needs no further interaction, working from either a directory-based or an archive-file-based history.

+1  A: 

The easiest approach is of course creating a new git repo, committing the history to prepend first, and then reapplying the patches of the old repo. But I'd prefer an automated solution that is less time-consuming.

Tobias Kienzler
+1  A: 

For importing the old snapshots, you might find some of the tools in Git's contrib/fast-import directory useful. Or, if you already have each old snapshot in a directory, you might do something like this:

# Assumes the v* glob will sort in the right order
# (i.e. zero padded, fixed width numeric fields)
# For v1, v2, v10, v11, ... you might try:
#     v{1..23}     (1 through 23)
#     v?{,?}       (v+one character, then v+two characters)
#     v?{,?{,?}}   (v+{one,two,three} characters)
#     $(ls -v v*)  (GNU ls has "version sorting")
# Or, just list them directly: ``for d in foo bar baz quux; do''
(git init import)
for d in v*; do
    if mv import/.git "$d/"; then
        (cd "$d" && git add --all && git commit -m"pre-Git snapshot $d")
        mv "$d/.git" import/
    fi
done
(cd import && git checkout HEAD -- .)
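
If the snapshot directories are not zero padded, the order can also be computed explicitly instead of relying on the glob. The following is only a sketch: it assumes every directory is named v followed by a plain integer with no whitespace, and the directory names and the temporary location are made up for illustration.

```shell
#!/bin/sh
set -e
# Throwaway directory with hypothetical snapshot names
tmp=$(mktemp -d)
cd "$tmp"
mkdir v1 v2 v10 v11

# Strip the leading "v", sort the remainder numerically, put the "v" back
sorted=$(printf '%s\n' v* | sed 's/^v//' | sort -n | sed 's/^/v/')

# Word splitting is intentional here: one directory name per word
for d in $sorted; do
    echo "would import $d"
done
```

The resulting list can then drive the import loop above, i.e. ``for d in $sorted; do'' in place of ``for d in v*; do''.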

Then fetch the old history into your working repository:

cd work && git fetch ../import master:old-history

Once you have both the old history and your Git-based history in the same repository, you have a couple of options for the prepend operation: grafts and replacements.

Grafts are a per-repository mechanism to (possibly temporarily) edit the parentage of various existing commits. Grafts are controlled by the $GIT_DIR/info/grafts file (described under “info/grafts” of the gitrepository-layout manpage).

INITIAL_SHA1=$(git rev-list --reverse master | head -1)
TIP_OF_OLD_HISTORY_SHA1=$(git rev-parse old-history)
echo $INITIAL_SHA1 $TIP_OF_OLD_HISTORY_SHA1 >> .git/info/grafts

With the graft in place (the original initial commit did not have any parents; the graft gives it one), you can use all the normal Git tools to search through and view the extended history (e.g. git log should now show you the old history after your commits).

The main problem with grafts is that they are limited to your repository. But if you decide that they should be a permanent part of the history, you can use git filter-branch to make them so (make a tar/zip backup of your .git dir first; git filter-branch will save the original refs, but sometimes it is just easier to use a plain backup).

git filter-branch --tag-name-filter cat -- --all
rm .git/info/grafts

The replacement mechanism is newer (Git 1.6.5+). Replacements can be disabled on a per-command basis (git --no-replace-objects …) and they can be pushed for easier sharing. Replacement works on individual objects (blobs, trees, commits, or annotated tags), so the mechanism is also more general. The replace mechanism is documented in the git replace manpage. Because of this generality, the “prepending” setup is a little more involved (we have to create a new commit instead of just naming the new parent):

INITIAL_SHA1=$(git rev-list --reverse master | head -1)
# detach HEAD at the end of the old history
git checkout old-history~0
# make a new commit that looks like the old initial commit
git rm -r -- .
git checkout $INITIAL_SHA1 -- .
git commit -C $INITIAL_SHA1
# replace the old initial commit with the newly created commit
git replace $INITIAL_SHA1 HEAD
# return to the previous branch (reattach HEAD)
git checkout -
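
Later Git releases add a --graft option to git replace that bundles the manual steps above into one command; it is not part of the 1.6.5 release mentioned earlier, so whether your installation has it is an assumption to check. A minimal sketch in a throwaway repository, where the file contents, commit messages and identity are all hypothetical:

```shell
#!/bin/sh
set -e
# Throwaway repository; everything in it is made up for illustration
tmp=$(mktemp -d)
cd "$tmp"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

git init -q work
cd work

# Two "pre-Git" snapshots on a branch named old-history
git symbolic-ref HEAD refs/heads/old-history
echo v1 > file; git add file; git commit -q -m "pre-Git snapshot v1"
echo v2 > file; git add file; git commit -q -m "pre-Git snapshot v2"

# An unrelated root commit on master, standing in for the Git-based history
git checkout -q --orphan master
echo v3 > file; git add file; git commit -q -m "first Git commit"

INITIAL_SHA1=$(git rev-list --reverse master | head -1)

# Create a replacement for the initial commit whose parent is the
# tip of old-history, and register it under refs/replace
git replace --graft "$INITIAL_SHA1" old-history

# Count the commits reachable from master with and without the replacement
with_replace=$(git rev-list --count master)
without_replace=$(git --no-replace-objects rev-list --count master)
echo "with replacement: $with_replace, without: $without_replace"
```

With the replacement active, master reaches all three commits; with --no-replace-objects only the original root commit is visible, which shows that nothing has been rewritten yet.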

Sharing this replacement is not automatic. You have to push part of (or all of) refs/replace to share the replacement.

git push some-remote 'refs/replace/*'
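
The push has a fetch-side counterpart: a plain clone does not pick up refs/replace, so collaborators would need to fetch those refs explicitly. The sketch below illustrates this with a local bare repository standing in for some-remote; the repository names, commit messages and identity are hypothetical.

```shell
#!/bin/sh
set -e
tmp=$(mktemp -d)
cd "$tmp"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

# A local bare repository plays the role of the shared remote
git init -q --bare shared.git
git --git-dir=shared.git symbolic-ref HEAD refs/heads/master

git init -q work
cd work
git symbolic-ref HEAD refs/heads/master
echo a > file; git add file; git commit -q -m "original"
orig=$(git rev-parse HEAD)

# A stand-in replacement: same tree, different commit message
new=$(git commit-tree -m "replacement" "$orig^{tree}")
git replace "$orig" "$new"

# Publish the branch and the replacement refs
git push -q ../shared.git master 'refs/replace/*'
cd ..

# A fresh clone sees the original commit until it fetches refs/replace
git clone -q shared.git other
cd other
before=$(git log -1 --format=%s master)
git fetch -q origin 'refs/replace/*:refs/replace/*'
after=$(git log -1 --format=%s master)
echo "before fetch: $before, after fetch: $after"
```

A collaborator who wants this permanently could add the same refspec to the remote's fetch configuration instead of passing it on every fetch.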

If you decide to make the replacement permanent, use git filter-branch (same as with grafts; make a tar/zip backup of your .git directory first):

git filter-branch --tag-name-filter cat -- --all
git replace -d $INITIAL_SHA1
Chris Johnsen
thanks, this works great for a small test subset, now off to the complete one :) (I used the replacement option)
Tobias Kienzler
This is not an issue for me at the moment, but I'll ask anyway: Using the replace-option up to the point before `git filter-branch`ing does not rewrite history and is therefore easier to share, right?
Tobias Kienzler
Without *git filter-branch*, neither grafts nor replacements actually rewrite history (they just produce an effect on the commit DAG as if they had rewritten history). The benefits of replacements are 1) they can be disabled by command line argument or environment variable, 2) they can be pushed/fetched, 3) they work on any object, not just the parent “attributes” of commits. The ability to push replacements makes them easy to share via the normal Git protocols (you can share graft entries, but you have to use some "out of band" mechanism (i.e. not push/fetch) to propagate them).
Chris Johnsen
@Chris I just noticed that a file from the old version, which I did not possess and which is therefore not in my history, got deleted. Is it possible to undelete the file? Basically I am looking for the inverse of [How do I remove sensitive files from git’s history](http://stackoverflow.com/questions/872565/how-do-i-remove-sensitive-files-from-gits-history). Sidenote: using grafts, the deletion occurs at the original initial commit; using replace, at the second original commit...
Tobias Kienzler
I asked this as a separate question ( [How to undelete a file previously deleted in git’s history?](http://stackoverflow.com/questions/3150394/how-to-undelete-a-file-previously-deleted-in-gits-history) ), just in case someone else wants to know
Tobias Kienzler
I just can't stop asking more questions... But [Can tags be automatically moved after a git rebase?](http://stackoverflow.com/questions/3150685/can-tags-be-automatically-moved-after-a-git-rebase) Rewriting history worked fine, but now my tags are on another timeline...
Tobias Kienzler
You can rewrite the tags with: `git filter-branch --tag-name-filter cat --original refs/original-tags-too -- --all`, but that will not completely do what you want if you have also done a [rebase in the interim](http://stackoverflow.com/questions/3150394/how-to-undelete-a-file-previously-deleted-in-gits-history/3150528#3150528) (it will only move the tags to the post-filter-branch commits, not to the post-rebase commits). I would suggest identifying and fixing the cause of the missing file in the original commits and then re-doing the replace/graft+filter-branch (this time also filtering tags).
Chris Johnsen
Reading your “undelete” question, I see the file in question was in the snapshots, but not in your Git history. If you have tags to bits of your Git history, this is what I would do: start with your original Git repository (before any filtering or rewriting; see `refs/original/` if you do not have a plain backup/clone), use `filter-branch --tag-name-filter cat --tree-filter … -- --all` (or `--index-filter`) to add the file to your history while rewriting its tags, then do the graft/replace and `git filter-branch --tag-name-filter cat -- --all` to permanently establish the graft/replacement.
Chris Johnsen
+1  A: 

If you don't want to change the commits in your repository, you can use grafts to override the parent information for a commit. This is what the Linux Kernel repo does to get history from before they started using Git.

This message: http://marc.info/?l=git&m=119636089519572 seems to have the best documentation that I can find.

You'd create a sequence of commits relating to your pre-git history, then use the .git/info/grafts file to make Git use the last commit in that sequence as the parent of the first commit you generated using Git.

Andrew Aylett
+1 ah yes, I see, thank you. This is detailed as the graft-option in [Chris Johnsen's answer](http://stackoverflow.com/questions/3147097/how-to-prepend-the-past-to-a-git-repository/3148117#3148117)
Tobias Kienzler