I have a repository that I work in. There's one folder in there where I put all the stuff I want to open source, so it's separate from the private parts. Is there a way to get git to automatically push anything committed to that folder to a github repository, without me having to remember to push the newly changed files up there every time? I still want to push the whole repository to a different github location.

A: 

Use a git hook to push on commit.
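
A minimal sketch of such a hook, tried out in a throwaway repository. The remote name origin, the scratch paths, and the placeholder identity for the demo commit are assumptions for illustration; a local bare repository stands in for github.

```shell
#!/bin/sh
set -eu

# Scratch area: a local bare repository stands in for the github remote (assumption).
scratch=$(mktemp -d)
git init -q --bare "$scratch/remote.git"
git init -q "$scratch/work"
cd "$scratch/work"
git remote add origin "$scratch/remote.git"

# post-commit runs after every successful commit; here it pushes the
# current branch to the remote named "origin".
mkdir -p .git/hooks
cat > .git/hooks/post-commit <<'EOF'
#!/bin/sh
git push --quiet origin HEAD
EOF
chmod +x .git/hooks/post-commit

# Demo commit with a placeholder identity; the hook pushes it right away.
echo "hello" > README
git add README
git -c user.name="Example" -c user.email="example@example.com" \
    commit -q -m "First commit"
```

As written, this pushes the whole current branch, not a single folder.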

databyte
Wouldn't that push the whole repository? How can I extract that folder and its contents from there?
uliwitness
Make the folder a submodule and do the push from a post-commit hook.
databyte
A: 

If you have a single repository where parts are intended to be public and parts are intended to be private, then you need to fundamentally change something in your repository setup. git tracks complete repositories, so you can either make a repository public or private, but not partly this and partly that.

If you have both a repo with the "public" files and a repo with the "private" files, you can add a git hook to the "public" repo to automatically push commits, and just keep the private repo private.

However, you write that you have a single repo containing both "public" and "private" files, so you need to split that into something "public" and something "private" in some way.

You have two options to solve this situation:

  1. Split out the "public" folder into its own repository which you will push to github. This will rewrite the history of the "public" folder a little. I will outline this in more detail below.

  2. Create a branch that only concerns the "public" folder, and only publish that branch. This risks accidentally pushing (i.e. publishing) private stuff, and it is outright impossible, or at least quite difficult, if you have any commits that touch both "public" and "private" files. I would therefore advise against this option and will not write more about it.

To split off the "public" folder into its own repository, create a new "public" branch off your "combined" branch, and use git filter-branch on it so that the new "public" branch contains only the contents of the "public" folder. (The "Examples" section of the git filter-branch documentation shows just the right --subdirectory-filter example.) You will then have both your old "combined" branch with both the "public" folder and the private stuff in it, and the new "public" branch with only the "public" folder.
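
A sketch of that split in a throwaway repository. The folder name opensource, the file names, and the placeholder identity are assumptions for illustration; the branch names combined and public follow the text above.

```shell
#!/bin/sh
set -eu
# Skip the advisory pause that recent git versions add to filter-branch.
export FILTER_BRANCH_SQUELCH_WARNING=1
# Placeholder identity for the demo commits (assumption).
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

# Throwaway "combined" repository: one private file, one public folder.
repo=$(mktemp -d)
cd "$repo"
git init -q
git checkout -q -b combined
mkdir opensource
echo "secret" > private.txt
echo "shared" > opensource/lib.txt
git add .
git commit -q -m "Add private and public files"
echo "more secrets" >> private.txt
git commit -q -a -m "Private-only change"

# Branch off, then rewrite only the new branch so that the contents of
# opensource/ become its root; commits that never touched it are dropped.
git branch public
git filter-branch --subdirectory-filter opensource -- public

git ls-tree -r --name-only public
```

The combined branch is left untouched; only the public branch gets new commit IDs.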

Be aware that the commit messages in the new "public" branch might still contain "private" information. So you should go through all the commit messages, scan them for private information, and redact it where necessary, e.g. with git rebase -i.
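
One way to do that scan is git log with --grep. This is only a sketch: the search term acme, the commit messages, and the placeholder identity are made up for the demo.

```shell
#!/bin/sh
set -eu
# Placeholder identity for the demo commits (assumption).
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

# Throwaway repository whose only branch plays the role of "public".
repo=$(mktemp -d)
cd "$repo"
git init -q
git checkout -q -b public
echo "one" > lib.txt
git add lib.txt
git commit -q -m "Add library"
echo "two" >> lib.txt
git commit -q -a -m "Fix crash reported by ACME Corp"

# List every commit on the branch whose message mentions the term,
# ignoring case; each hit is a candidate for rewording with git rebase -i.
hits=$(git log -i --grep="acme" --format="%h %s" public)
echo "$hits"
```

A keyword search only finds what you think to search for, so reading through the full log is still worthwhile.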

If you needed to do any redactions, you may want to remove the old unredacted revs from the repo using git gc (probably with the --prune=0 and --aggressive options, but I can't find the SO answer with more info about that).

Update: The subsequent push of only the "public" branch and nothing else will probably not transfer any other information, so this repo cleanup is probably not needed.
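
If you want the cleanup anyway, here is a sketch of it. It follows the cleanup checklist in the git filter-branch documentation, which uses --prune=now rather than the --prune=0 mentioned above. The commit messages and the placeholder identity are assumptions for the demo.

```shell
#!/bin/sh
set -eu
# Placeholder identity for the demo commits (assumption).
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q
echo "shared" > lib.txt
git add lib.txt
git commit -q -m "Fix crash reported by ACME Corp"
old=$(git rev-parse HEAD)

# Redact the message; the old commit stays reachable through the reflog.
git commit -q --amend -m "Fix crash on startup"

# Delete the backup refs that filter-branch leaves under refs/original/
# (there are none in this demo, so the loop does nothing here).
git for-each-ref --format="%(refname)" refs/original/ |
while read -r ref; do git update-ref -d "$ref"; done

# Drop all reflog entries, then delete every unreachable object.
git reflog expire --expire=now --all
git gc --quiet --prune=now --aggressive
```

Afterwards the unredacted commit stored in $old no longer exists in the object database.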

Now your "public" branch is ready for publication. To make sure it only contains the "public" information, you can push it into a new empty local bare repo and examine that repo's contents to verify that no ref carries private information. Once you are satisfied, you can push the "public" branch to a new empty repo on github. The repo on github will then ONLY contain the "public" branch, which you should probably name "master" there.
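
The verification step could look roughly like this. The folder name opensource, the scratch paths, and the placeholder identity are assumptions; the final github push is left as a comment with placeholder USER/REPO values.

```shell
#!/bin/sh
set -eu
export FILTER_BRANCH_SQUELCH_WARNING=1
# Placeholder identity for the demo commits (assumption).
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

# Throwaway repository with a "combined" branch and a filtered "public" branch.
scratch=$(mktemp -d)
git init -q "$scratch/work"
cd "$scratch/work"
git checkout -q -b combined
mkdir opensource
echo "secret" > private.txt
echo "shared" > opensource/lib.txt
git add .
git commit -q -m "Add private and public files"
git branch public
git filter-branch --subdirectory-filter opensource -- public

# Push only the public branch into an empty bare repo, naming it master there.
git init -q --bare "$scratch/check.git"
git push -q "$scratch/check.git" public:master

# Inspect what actually arrived: which refs exist and which files they hold.
refs=$(git -C "$scratch/check.git" for-each-ref --format="%(refname)")
files=$(git -C "$scratch/check.git" ls-tree -r --name-only master)
echo "$refs"
echo "$files"

# Once satisfied, the real push would have the same shape (placeholders):
# git push git@github.com:USER/REPO.git public:master
```

Only the objects reachable from the pushed branch are transferred, so private.txt never reaches the bare repo.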

Your local repo with the "combined" branch still contains both public and private information directly, and has no connection to the new "public" github repo whatsoever.

Now you could rewrite the "combined" branch's history to just contain the non-public bits, but that would sacrifice all connections between the state of "public" and "private" files during all of the history, so repeatable builds of old stuff would become close to impossible. Therefore, I suggest leaving the "combined" branch's history alone and just removing the "public" folder from it in a new commit.
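
Removing the folder in a new commit is a plain git rm. In this sketch the folder name opensource, the file names, and the placeholder identity are assumptions.

```shell
#!/bin/sh
set -eu
# Placeholder identity for the demo commits (assumption).
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q
mkdir opensource
echo "secret" > private.txt
echo "shared" > opensource/lib.txt
git add .
git commit -q -m "Add private and public files"

# Drop the folder in a new commit; all earlier commits still contain it,
# so old states can be checked out and rebuilt as before.
git rm -r -q opensource
git commit -q -m "Move opensource/ into its own repository"

git ls-tree -r --name-only HEAD
```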

If the integration between your private and your public files is very tight and version dependent, you can use git submodule to add a specific version of the "public" repo from github into your private repo. Giving the new submodule folder the same name as the former "public" folder will minimize the changes to your private stuff, as all "public" files will then be at their old paths. Note that the submodule folder will not update automatically when something has been pushed to github. You might work around that by adding a git hook to your local submodule folder which would update the submodule information in the "combined" repo.
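
A sketch of the submodule setup, with a local repository standing in for the one on github. The folder name opensource, the scratch paths, and the placeholder identity are assumptions. The protocol.file.allow setting is only there because the demo clones from a local path, which recent git versions refuse for submodules by default; with a github URL it would not be needed.

```shell
#!/bin/sh
set -eu
# Placeholder identity for the demo commits (assumption).
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"

scratch=$(mktemp -d)

# Stand-in for the published repository (a github URL in real use).
git init -q "$scratch/public"
echo "shared" > "$scratch/public/lib.txt"
git -C "$scratch/public" add lib.txt
git -C "$scratch/public" commit -q -m "Add library"

# The private repository that depends on it.
git init -q "$scratch/private"
cd "$scratch/private"
echo "secret" > private.txt
git add private.txt
git commit -q -m "Add private file"

# Add the public repo at the old folder path and commit the pinned version.
git -c protocol.file.allow=always submodule --quiet add "$scratch/public" opensource
git commit -q -m "Add public repo as submodule at opensource/"

# The private repo records the exact commit of the submodule.
pinned=$(git ls-tree HEAD opensource | awk '{print $3}')
echo "$pinned"
```

To move to a newer public version later, update the checkout inside opensource/ and commit the changed submodule entry in the private repo.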

If the integration between your private and your public files is looser, you can also treat the public files like any external third-party project and integrate them into your private stuff the way anybody else would, i.e. like any external piece of software your "private" software depends on.

ndim
Yes, the tight integration between the public and private stuff is why I wanted to keep them in one repo. The public stuff mostly gets changed when I work on the private stuff that uses it, so I don't wanna have to pull. Also, I want to be able to tag the state of the private repo, and include the current version of the public stuff.
uliwitness
Git submodules record the SHA of the commits of the submodule. If you tag the private repo, that tag will include all submodules' commit SHA. So you can get the desired tight coupling.
ndim