ansaurus

Question

Is it possible to keep an unversioned file in a git repository?

Answer 1

+3 A:

Short answer: no.

More useful answer: Git tracks snapshots of the whole tree rather than individual files, so throwing away the history of a single file would mean rewriting the entire history on every commit, and that leads to all kinds of ugly problems.

You can store a file in an annotated tag, but that's not very convenient. It basically goes like this:

ID=$(git hash-object -w yourfile.sqlite)
git tag -a -m "Tag database file" mytag "$ID"

That does not update (or even create) the database file in the working tree for you; you'd have to use hook scripts to emulate that.
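As a rough sketch of what such a hook script could call, the two helpers below store a file as a tagged blob and write it back into the working tree. The function names are hypothetical, and using `git tag -f` to move the tag on every store is an assumption about how you would want updates to behave; it is not part of the recipe above.

```shell
#!/bin/sh
# Hypothetical helpers; the names store_snapshot and restore_snapshot are made up.

store_snapshot() {
    # $1 = file to store, $2 = tag name
    # Write the file into the object database and capture its blob ID.
    id=$(git hash-object -w "$1") || return 1
    # -f moves the tag if it already exists (assumption: only the latest copy matters).
    git tag -f -a -m "Tag database file" "$2" "$id"
}

restore_snapshot() {
    # $1 = file to write, $2 = tag name
    # Asking cat-file for a blob dereferences the annotated tag to the blob it points at.
    git cat-file blob "$2" > "$1"
}
```

For example, `store_snapshot yourfile.sqlite mytag` before pushing and `restore_snapshot yourfile.sqlite mytag` from a post-checkout hook.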

Full disclosure: I'm not completely sure whether it's actually possible to push tagged blobs that aren't covered by the normal history. I suspect that it isn't, in which case this recipe would be a lot less than useful.

Jan Krüger 2010-02-12 13:21:59

Answer 2

+2 A:

You can always use a .gitignore file for this, from the beginning.

And ... (from this thread: http://n2.nabble.com/purging-unwanted-history-td1507638.html kudos to Björn Steinbrink!)
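A minimal sketch of the ignore approach, assuming the file should stay on disk but never be committed. The function name is hypothetical, and note that this keeps the file out of the repository entirely rather than storing a copy in it.

```shell
#!/bin/sh
# Hypothetical helper; ignore_file is a made-up name.

ignore_file() {
    # $1 = path relative to the repository root
    # Add the path to .gitignore so Git stops offering it for commit.
    printf '%s\n' "$1" >> .gitignore
    # If the file was already tracked, remove it from the index but keep it on disk.
    git rm --cached --quiet --ignore-unmatch -- "$1"
}
```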

Use filter-branch to drop the parents on the first commit you want to keep, and then drop the old cruft.

Let's say $drop is the hash of the latest commit you want to drop. To keep things sane and simple, make sure the first commit you want to keep, i.e. the child of $drop, is not a merge commit. Then you can use:
git filter-branch --parent-filter "sed -e 's/-p $drop//'" \
    --tag-name-filter cat -- \
    --all ^$drop
The above rewrites the parents of all commits that come "after" $drop.

Check the results with gitk.

Then clean out all the old cruft.

First, the backup references from filter-branch:
git for-each-ref --format='%(refname)' refs/original | \
    while read ref
    do
        git update-ref -d "$ref"
    done
Then clean your reflogs:
git reflog expire --expire=0 --all
And finally, repack and drop all the old unreachable objects:
git repack -ad
git prune # For objects that repack -ad might have left around

At that point, everything leading up to and including $drop should be gone.
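One way to check that claim is to ask Git whether the commit object still exists. This is only a sketch: the function name is hypothetical, and it reports on the local object database only, so a commit that is still reachable from a stash or another ref will still be present.

```shell
#!/bin/sh
# Hypothetical helper; commit_is_gone is a made-up name.

commit_is_gone() {
    # $1 = commit hash, e.g. the value used as $drop
    # cat-file -e exits non-zero when the object cannot be found.
    ! git cat-file -e "$1^{commit}" 2>/dev/null
}
```

Usage: `commit_is_gone "$drop" && echo "old history removed"`.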

Zsolt Botykai 2010-02-12 13:23:56

I am looking for a solution which *keeps* a copy of the db in the repository.

Benoit Vidis 2010-02-12 13:30:12

Then you can create a script which removes the history after every commit.

Zsolt Botykai 2010-02-12 13:48:52

Answer 3

+2 A:

It sounds like you're looking for the solution to the wrong problem.

Large binary files do often need to be stored in repositories, but I don't think a SQLite database is something you would really need to store in its binary form in a repository.

Rather, you should keep the schema in version control, and if you need to keep data too, serialize it (to XML, JSON, YAML...) and version that too. A build script can create the database and unserialize the data into it when necessary.
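As a sketch of that build-script idea, the helpers below use the sqlite3 command-line tool's `.dump` command, which writes the schema and data as plain SQL. SQL text is swapped in here for the XML/JSON/YAML formats mentioned above, and the function and file names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers; assumes the sqlite3 command-line tool is installed.

dump_db() {
    # $1 = database file, $2 = text dump to keep under version control
    # .dump emits the schema and the data as SQL statements.
    sqlite3 "$1" .dump > "$2"
}

load_db() {
    # $1 = database file to create, $2 = text dump to read
    # Start from an empty file so stale tables do not survive the rebuild.
    rm -f "$1"
    sqlite3 "$1" < "$2"
}
```

You would commit the dump (say, `app.sql`) and add the binary database to .gitignore, running `load_db` after checkout.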

Because a text-based serialization format can be tracked efficiently by Git, you won't worry about the space overhead of keeping past versions even if you don't think you need access to them.

Ben James 2010-02-12 13:25:39

Doing so would allow git to apply its usual compression and diffing techniques, making this much less painful. The only thing to take care of would be creating a properly sorted serialization format that minimizes the size of the diff.

David Schmitt 2010-02-12 13:31:13

I do not agree. If you look at the SQLite format, it is not that binary. Git is perfectly able to generate some usable diffs with it. The only benefit would be that diffs would be easier to read in case of conflict. Having to handle a text serialization layer is far too much work if you ask me.

Benoit Vidis 2010-02-12 13:50:49
