First of all: this is (hopefully) not a duplicate of this or this.

The current status: I committed a file with credentials for an internal database to my Git repository. This was fine as long as I was the only one using it. Then my group started to clone, push and pull around in this project. We now have several Git repositories (one central repository and several developer clones).

The problem: We now want to give public access to the source code and to the Git repository, or at least let Git manage the details of others contributing to the code.

The question: What would be a good strategy to

a) remove the file with the credentials from the central or from all repositories, or

b) set up a new Git repository as kind of 'interface' to the outer world?

If choosing (b), how could we easily communicate changes back to the main repository?

Due to the already widespread distribution, we'd really like to not do a git rebase or a git filter-branch on each and every current repository.

+2  A: 

Sorry, but you're stuck with running git filter-branch if you want to delete the credentials from the main repository. See Removing sensitive data, written by the folks at GitHub.

Due to git's design, there's no way to force existing clones to delete the file from their respective histories.

You could sanitize a single branch and make it the basis for future development:

$ git checkout -b old-master master
$ git filter-branch ... master
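For readers who want to see what such a rewrite does before trying it on real work, here is a minimal sketch in a throwaway repository. The filter is modeled on the one in GitHub's guide; the file name `db-credentials.txt` and the demo commits are assumptions made up for illustration, not the asker's actual setup.

```shell
#!/bin/sh
set -e

# Throwaway repository, so nothing real is touched.
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

# Commit 1: the credentials file (hypothetical name) next to real code.
echo "password=hunter2" > db-credentials.txt
echo "print('hello')" > app.py
git add .
git commit -q -m "Initial commit, including credentials"

# Commit 2: ordinary work on top.
echo "print('world')" >> app.py
git commit -q -am "More work"

# Rewrite all refs, dropping the file from every commit's index.
# --ignore-unmatch keeps the filter from failing on commits without the file.
FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch --force \
  --index-filter 'git rm --cached --ignore-unmatch db-credentials.txt' \
  --prune-empty --tag-name-filter cat -- --all

# Prints nothing if no branch contains the file any more.
# Note: filter-branch keeps a backup of the old refs under refs/original/,
# so the old commits still exist in this repository until that is removed.
git log --oneline --branches -- db-credentials.txt
```

Every rewritten commit gets a new ID, which is why existing clones cannot simply pull the result.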

Now you'd need to push the sanitized master to a new repo that contains only the clean master:

$ git push new-central master

Existing repos can add the new remote and git cherry-pick changes from their old branches over to the new clean master if necessary.
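The cherry-pick step in an existing clone could look roughly like the sketch below. It builds two stand-in repositories first so that it can be run as-is; the directory names, the branch name `clean-master`, and the file names are all assumptions for illustration.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)
cd "$work"

# Stand-in for the sanitized central repository (hypothetical contents).
git init -q clean-src
cd clean-src
git config user.email "demo@example.com"
git config user.name "Demo"
echo "print('hello')" > app.py
git add app.py
git commit -q -m "Clean history"
git branch -M master
cd ..
git clone -q --bare clean-src new-central.git

# Stand-in for a developer's existing clone with the old history.
git init -q dev
cd dev
git config user.email "demo@example.com"
git config user.name "Demo"
echo "print('hello')" > app.py
echo "password=hunter2" > db-credentials.txt
git add .
git commit -q -m "Old history with credentials"
git branch -M master
echo "a local fix" > fix.txt
git add fix.txt
git commit -q -m "Local fix not yet in the clean repository"
fix=$(git rev-parse HEAD)

# The actual steps: add the new remote, branch off the clean master,
# and carry the local commit over.
git remote add new-central ../new-central.git
git fetch -q new-central
git checkout -q -b clean-master new-central/master
git cherry-pick "$fix"
```

Only the changes introduced by the picked commit are applied, so the credentials file stays out of `clean-master` here. Cherry-picking a commit that itself adds or edits the sensitive file would bring it back, so such commits need to be checked by hand.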

For the new repository, put some sort of barrier in place to prevent someone pushing sensitive data to it so you don't have the same problem all over again. This barrier might be a human being who controls the new central repository and reviews all patches to decide what goes in.
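A human reviewer is what this answer suggests. As a possible automated complement (not something the answer itself proposes), a server-side `pre-receive` hook can refuse pushes that touch a known file. The sketch below is deliberately crude: it only matches one path, `db-credentials.txt`, which is a made-up name, and it will not notice credentials pasted into other files.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)
cd "$work"
git init -q --bare guarded.git
mkdir -p guarded.git/hooks

# Install the hook. It runs once per push, before any ref is updated.
cat > guarded.git/hooks/pre-receive <<'EOF'
#!/bin/sh
forbidden='db-credentials.txt'   # hypothetical file name
while read old new ref; do
  case "$new" in *[!0]*) ;; *) continue ;; esac   # all zeros: ref deletion
  case "$old" in *[!0]*) range="$old..$new" ;; *) range="$new" ;; esac
  if [ -n "$(git rev-list "$range" -- "$forbidden")" ]; then
    echo "rejected: $ref contains commits touching $forbidden" >&2
    exit 1
  fi
done
EOF
chmod +x guarded.git/hooks/pre-receive

# Try it out from a clone.
git clone -q guarded.git dev 2>/dev/null
cd dev
git config user.email "demo@example.com"
git config user.name "Demo"

echo "print('hello')" > app.py
git add app.py
git commit -q -m "Clean commit"
git push -q origin HEAD            # accepted

echo "password=hunter2" > db-credentials.txt
git add db-credentials.txt
git commit -q -m "Oops"
if git push -q origin HEAD; then
  echo "push unexpectedly accepted"
else
  echo "push rejected by hook"
fi
```

Because the hook lives on the server, contributors cannot skip it the way they could skip a local pre-commit hook.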

Greg Bacon
+2  A: 

There's no way you can do a) without using rebase or filter-branch. But I'd say that it is probably BETTER to do it this way now, than to have to struggle with hiding history forever. I guess b) could be done by splitting the history after a commit that removes the credentials. The result would pretty much be two histories, placed in two different repos; one before the clean-up, and one that "restarts" just after. The history of these two repos could be connected through graft-points in the repos of those who need to reach the old history.
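To make the split-and-graft idea more concrete, here is a small sketch. It uses `git replace --graft`, which newer Git versions provide for this purpose in place of the older grafts file. For brevity both histories live in one throwaway repository rather than two; the branch name `public` and the file names are assumptions.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)
cd "$work"
git init -q repo
cd repo
git config user.email "demo@example.com"
git config user.name "Demo"

# Old history: one commit with credentials, one that removes them.
echo "password=hunter2" > db-credentials.txt
echo "print('hello')" > app.py
git add .
git commit -q -m "Old: with credentials"
git rm -q db-credentials.txt
git commit -q -m "Old: remove credentials"
last_old=$(git rev-parse HEAD)

# New history: a parentless commit with the same tree as the clean-up commit.
git checkout -q --orphan public
git commit -q -m "New: public history starts here"
first_new=$(git rev-parse HEAD)

# On its own, the public branch has exactly one commit.
git --no-replace-objects rev-list --count public

# Locally pretend that the restart commit has the old tip as its parent.
git replace --graft "$first_new" "$last_old"

# With the graft in place, the same branch shows the full history.
git rev-list --count public
```

The replacement is stored under `refs/replace/` in the local repository and is not pushed by default, so the public repository stays short while people with access to the old history can still browse it.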

Either way, you're going to have to deal with a whole load of sh*t, and I'd recommend going for a) and filter-branch even if it is a lot of work.

kusma
Sorry to unaccept your answer, but the link to GitHub in gbacon's answer was worth its weight in gold.
Boldewyn
+1  A: 

Just change the password for your internal database and for any other service that uses the same password. (The same goes for any other password that ever appeared in your history.)

hasen j
A: 

So, we're done, and I'd like to share how we finally did it.

We were in the lucky position that, at a certain moment, no one had a custom branch. So what we basically did was have everyone push their work to the central repository one last time.

Then we used filter-branch on it, as described by the GitHub crew. That gave us a clean central repository.

Finally (and this worked only because no one had a local branch, as mentioned), we deleted our local repositories and cloned new ones from the now-clean central one.
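As a rough sketch of that sequence (everyone has pushed, one person rewrites and force-pushes, everyone re-clones), the following runs against a throwaway central repository. The repository names and the file name `db-credentials.txt` are placeholders, and the filter is the one from GitHub's guide rather than the exact command we typed.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)
cd "$work"

# Stand-in for the central repository.
git init -q --bare central.git
git -C central.git symbolic-ref HEAD refs/heads/master

# One developer's clone; everyone is assumed to have pushed already.
git clone -q central.git dev 2>/dev/null
cd dev
git config user.email "demo@example.com"
git config user.name "Demo"
echo "password=hunter2" > db-credentials.txt
echo "print('hello')" > app.py
git add .
git commit -q -m "Initial commit, including credentials"
git branch -M master
git push -q origin master

# Rewrite master so that no commit contains the file.
FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch --force \
  --index-filter 'git rm --cached --ignore-unmatch db-credentials.txt' \
  --prune-empty master

# Overwrite the central history with the rewritten branch.
git push -q --force origin master

# Throw the old clone away and start from a fresh one.
cd ..
rm -rf dev
git clone -q central.git dev
cd dev
git log --oneline --all -- db-credentials.txt   # prints nothing
```

A fresh clone only receives commits reachable from the central refs, which is why deleting and re-cloning was enough for us.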

In a nutshell: this way it was quite a fast and painless procedure. Not elegant, but it worked.

Boldewyn