I've got a repository that effectively contains a bunch of different modules. I'd like to split it out into separate repositories, keeping the version history of the files in those repositories.
A simple approach to this problem would just involve cloning the repo and then doing something like
git filter-branch \
    --tree-filter 'find . -type f |
        grep -vxF -e ./file1 -e ./file2 ... |
        xargs rm -f' \
    --prune-empty -- --all
but that will (assuming that my untested script was written correctly) keep only the files that have the given names in each commit, and delete everything else.
What I really want to do is to walk the commit history, finding and deleting files which have not become any of those files. So if file_a was renamed to file_b 14 commits ago, and (the current) file_b is supposed to be part of this repo, those old file_a's should be kept in the repo as well.
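To make the rename case concrete, here is a minimal sketch of how the earlier names of one file could be collected. It assumes that git's rename detection (which is heuristic and based on content similarity) recognises the rename, and it only looks at history reachable from HEAD. The function name historical_names is made up for illustration; git log --follow only accepts a single path.

```shell
# Hypothetical helper: print every path the given file has had in the
# history reachable from HEAD, one path per line.
historical_names() {
    # --follow continues past renames of a single path;
    # --name-only prints the path touched in each commit;
    # the empty format suppresses the commit headers.
    git log --follow --name-only --pretty=format: -- "$1" |
        sed '/^$/d' |
        sort -u
}
```

Running historical_names file_b in the example above would be expected to list both file_a and file_b, provided the rename is detected. Those names could then be fed into the keep-list of the filter.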
This should extend in both directions; i.e. if there is another branch in which file_a was never renamed, it should really... err, actually that's a bit ambiguous. The definition of file_a is dependent on a particular branch. What I want (I think...) is to specify a set of blobs, e.g. HEAD:file_b, and have the filter eliminate all blobs that are not part of the history of one of those blobs.