I've got some code under version control (using Mercurial), and would like to share some of it, whilst hiding other parts which I cannot release publicly (at least at this stage).

I'd ideally like to keep the revision history of the public code intact and, more importantly, be able to push/pull changes between the public repository and the repository containing both public and private code. It should not, however, be possible to recover any of the private info from the public repository history.

From what I've gleaned so far, it should be possible to extract the public stuff using `hg convert` with a filemap and excludes, although this would change all the revision IDs and preclude any interaction between the two repositories.

For completeness I guess I should add that the repository was originally converted from CVS.

Would be grateful for any ideas,

+3  A: 

It is not always practical, but if the public part of your repo can be limited (or moved) to a subdirectory of your current repo, then you could:

  • extract that subdirectory into a repo of its own (for instance with hg convert, as you mentioned)
  • reference that new repo as a subrepo of your main repo.

You would then manage two repos:

  • one public (with only the public files in it)
  • one private (with a reference to the public repo as a subrepo)
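
The two steps above can be sketched roughly as follows. This is only an illustration: the names `public`, `main-repo`, `public-repo` and `filemap.txt` are made up, and the code just builds the filemap text, the `hg convert` command line and the `.hgsub` entry without running Mercurial.

```python
# Sketch only: builds the text and commands, but does not run Mercurial.
# "public", "main-repo", "public-repo" and "filemap.txt" are hypothetical names.


def public_filemap(public_dir):
    """Filemap for `hg convert`: keep only public_dir and move it to the root."""
    return f"include {public_dir}\nrename {public_dir} .\n"


def convert_command(source_repo, dest_repo, filemap_path):
    """Argument list for `hg convert`, enabling the bundled extension inline."""
    return [
        "hg", "--config", "extensions.convert=",
        "convert", "--filemap", filemap_path, source_repo, dest_repo,
    ]


def hgsub_entry(mount_path, source):
    """One line of the private repo's .hgsub file, in `path = source` form."""
    return f"{mount_path} = {source}\n"


if __name__ == "__main__":
    print(public_filemap("public"), end="")
    print(" ".join(convert_command("main-repo", "public-repo", "filemap.txt")))
    print(hgsub_entry("public", "../public-repo"), end="")
```

In the private repo, the `public` directory would presumably have to be replaced by a clone of the new public repo before `.hgsub` is added and committed.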
VonC
+1 separate repo
sylvanaar
@user488551: are the private "bits" code? or config files with sensitive or specific data?
VonC
They are code, and generally represent functionality which we don't want to expose because it involves new algorithms which we either a) want to publish in a journal first or b) which are being exclusively licensed to a company.
@user488551: that would likely involve a redesign of those parts, in order to cleanly separate the public functions from the private ones. So there is no "easy" solution provided by a version control tool...
VonC
A: 

If you can use subrepos, that's probably the best way to go, but using convert need not preclude interaction between the pieces. If the public and private stuff is completely disjoint, use convert to split the original repo into two completely disjoint subsets (regenerating all changeset IDs), then recreate your "superset" repo by cloning one and pulling the other (using --force to overcome hg's objection to unrelated repositories). You'll end up with a slightly unconventional repo which has two parent-less changesets and two heads. Merge the heads and you have a unified view of public and private again, with the public repo's ancestry effectively on a branch of its own.
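
A rough sketch of that split-and-recombine sequence, assuming the public code lives in a known set of paths. `PUBLIC_PATHS` and the repo names are invented for illustration, and the code only builds the two filemaps and the list of commands; it does not execute them.

```python
# Sketch only: builds filemaps and the command sequence; nothing is executed.
# PUBLIC_PATHS and the repo names are hypothetical.

PUBLIC_PATHS = ["lib/core", "README"]


def public_filemap(paths):
    """Filemap that keeps only the public paths."""
    return "".join(f"include {p}\n" for p in paths)


def private_filemap(paths):
    """Filemap that keeps everything except the public paths."""
    return "".join(f"exclude {p}\n" for p in paths)


def recombine_commands(private_repo, public_repo, combined_repo):
    """Clone the private half, pull the unrelated public half, merge the heads."""
    return [
        ["hg", "clone", private_repo, combined_repo],
        ["hg", "-R", combined_repo, "pull", "--force", public_repo],
        ["hg", "-R", combined_repo, "merge"],
        ["hg", "-R", combined_repo, "commit", "-m", "Merge public history"],
    ]


if __name__ == "__main__":
    print(public_filemap(PUBLIC_PATHS), end="")
    print(private_filemap(PUBLIC_PATHS), end="")
    for command in recombine_commands("private", "public", "combined"):
        print(" ".join(command))
```

Each filemap would be passed to its own `hg convert --filemap` run against the original repo. One thing to watch afterwards: a plain `hg push` from the combined repo sends every changeset the target lacks, private ones included, so changes meant for the public repo are probably best committed there and pulled into the combined one.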

shambulator
What do you mean by completely disjoint? My current plan is to scrub the history from the public repository (i.e. make a fresh repository with the public subset of sources). If I could then pull (with --force) from there into the private repository, which would have identical copies of all the public files, along with their histories, I might be in business. That way I'd only be keeping the (pre-split) history in the private repo, but that should be enough to be able to go back and find where any bugs crept in.
Sorry, "disjoint" was a bit vague. I meant, if your public code is all in files different from all of the private code. If that's the case, then you can use a `filemap` to split the two pieces completely away from each other without mixing their ancestry together again (i.e. each changeset touches only one of public and private, never both). The private repo needn't have its own copies (in the sense of different changesets) of public files; you can `convert` the private stuff into its own repo first, then pull in the public one and merge. Will add an example to my answer.
shambulator
Ah, just saw your comment about having public and private code mixed in the same files. I think @VonC is right; deliberate refactoring is a better idea than trying to split the repo, then having to refactor to un-break things anyway.
shambulator