You could indeed use the subdirectory filter followed by an index filter to put the contents back into a subdirectory, but why bother, when you could just use the index filter by itself?
Here's an example from the man page:
git filter-branch --index-filter 'git rm --cached --ignore-unmatch filename' HEAD
This removes just one file; what you want is to remove everything except a given subdirectory. If you want to be cautious, you could explicitly list each path to remove, but if you want to go all-in, you can do something like this:
git filter-branch --index-filter 'git ls-tree --name-only --full-tree $GIT_COMMIT | grep -v "^directory-to-keep$" | xargs git rm --cached -r' -- --all
I expect there's probably a more elegant way; if anyone has something please suggest it!
A few notes on that command:
- filter-branch internally sets GIT_COMMIT to the current commit SHA1
- I wouldn't have expected --full-tree to be necessary, but apparently filter-branch runs the index-filter from the .git-rewrite/t directory instead of the top level of the repo.
- grep is probably overkill, but I don't think it's a speed issue.
- --all applies this to all refs; I figure you really do want that. (The -- separates it from the filter-branch options.)
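To see what the grep stage hands to git rm without touching a repository, here is a minimal sketch. The list of top-level entries is made up and only stands in for the ls-tree output; the second variant (grep -vxF) is an alternative I'm suggesting, not part of the command above.

```shell
# Stand-in for the output of `git ls-tree --name-only --full-tree $GIT_COMMIT`;
# these top-level entries are hypothetical.
entries='README
directory-to-keep
directory-to-keep-old
src'

# Same filter as in the index filter: the ^...$ anchors drop only the exact name,
# so a sibling like directory-to-keep-old is still scheduled for removal.
to_remove=$(printf '%s\n' "$entries" | grep -v "^directory-to-keep$")
printf '%s\n' "$to_remove"

# Equivalent without a regex: -x matches whole lines, -F treats the name literally.
to_remove_fixed=$(printf '%s\n' "$entries" | grep -vxF "directory-to-keep")
```

The -F form may be the safer choice if the directory name contains characters such as a dot, which the regex version would treat as a wildcard.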
Edit: thanks to Thomas, here's a commit filter to remove the now-empty commits. It can be used in the same command (just place it between the index filter and the --):
--commit-filter 'if [ "$1" = "$(git rev-parse "$3^{tree}")" ]; then skip_commit "$@"; else git commit-tree "$@"; fi' --remap-to-ancestor
The --remap-to-ancestor option keeps you from losing refs which pointed to skipped commits. (For example, if the tag v2.0 pointed to a commit which didn't touch this subdirectory, you'd probably want it remapped to the nearest ancestor which did, instead of just removing it.)
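If you'd rather try the whole thing on something disposable first, here is a rough sketch that builds a throwaway repository and runs both filters on it. The file and directory names other than directory-to-keep are made up for the demo, and FILTER_BRANCH_SQUELCH_WARNING is only there to skip the warning newer Git versions print before filter-branch starts.

```shell
# Throwaway demo: build a tiny repo, then run the index and commit filters on it.
# other/, a.txt, b.txt and c.txt are hypothetical names used only for illustration.
export FILTER_BRANCH_SQUELCH_WARNING=1
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

# First commit touches both directories.
mkdir directory-to-keep other
echo a > directory-to-keep/a.txt
echo b > other/b.txt
git add .
git commit -q -m "add both directories"

# Second commit touches only other/, so it ends up empty after filtering.
echo c > other/c.txt
git add .
git commit -q -m "touch only other/"

git filter-branch \
  --index-filter 'git ls-tree --name-only --full-tree $GIT_COMMIT | grep -v "^directory-to-keep$" | xargs git rm --cached -r' \
  --commit-filter 'if [ "$1" = "$(git rev-parse "$3^{tree}")" ]; then skip_commit "$@"; else git commit-tree "$@"; fi' \
  --remap-to-ancestor -- --all

# Inspect the result: how many commits are left, and what sits at the top level.
commit_count=$(git rev-list --count HEAD)
top_level=$(git ls-tree --name-only HEAD)
echo "commits: $commit_count, top level: $top_level"
```

The root commit has no parent, so the rev-parse inside the commit filter complains on stderr there; the comparison then fails and the commit is kept, which is what you want. One caveat I'd check on a real repo: with GNU xargs, a commit that contains nothing but the kept directory gives xargs empty input, and git rm may then be run with no paths and fail.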