I'm contributing to a fairly small open source project hosted on GitHub. So that other people can take advantage of my work, I've created my own fork on GitHub. Despite GitHub's choice of terminology, I don't wish to totally diverge from the main project. However, I don't expect or desire that all of my work be accepted into the main repository. Some of it, however, has already been merged into the main repository, and I expect this to continue. The problem I am running into is how best to keep our two trees in a state where code can be shared between them easily.

Some situations I have encountered or will encounter include:

  • I commit code that is later accepted into the main repository. When I pull from this repository in the future, my commit is duplicated in my repository.
  • I commit code that is never accepted into the main repository. When I pull from this repository in the future, the two trees have diverged and fixing it is hard.
  • Another person comes along and bases their work on my repository. Thus, I should if at all possible avoid changing commits that I have pushed, for example by using git rebase.
  • I wish to submit code to the master repository. Ideally, my changes should be easy to turn into patches (preferably using git format-patch) that apply directly and cleanly to the master repository.

As far as I can tell there are two, or possibly three, ways to handle this, none of which works particularly well:

  • Frequently run git rebase to keep my changes based off the head of the upstream repository. In this way I can eliminate duplicated commits but often have to rewrite history, causing problems for people wanting to derive their work from mine.
  • Frequently merge the upstream repository changes into mine. This works ok on my end but does not seem to make it easy to submit my code to the upstream repository.
  • Use some combination of these and possibly git cherry-pick to keep things in order.

What have other people done in this situation? I know my situation is analogous to the relationship between various kernel contributors and Linus's main repository, so hopefully there are good ways to handle this. I'm fairly new to git though, so I haven't mastered all its nuances. Finally, especially due to GitHub, my terminology may not be entirely consistent or correct. Feel free to correct me.

+11  A: 

Some tips I've learned from a similar situation:

  • Have a remote tracking branch for the upstream author's work.
  • Pull changes from this tracking branch into your master branch every so often.
  • Create a new branch for each of the topics you're working on. These branches should generally be local only. When you get changes from upstream into master, rebase your topic branches to reflect these changes.
  • When you're done with some topic work, merge it into master. This way, people who are deriving work from yours will not see too much rewritten history, since the rebasing occurred in your local topic branches.
  • Submitting changes: Your master branch will basically be a series of commits, some of which are the same as upstream, the rest are yours. The latter can be sent as patches if you want to.
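
A minimal sketch of the tips above, run in a throwaway sandbox. The directory names, the remote name `upstream`, the branch name `topic/feature`, and the file names are all assumptions made for illustration, and a local repository stands in for the main project on GitHub.

```shell
#!/bin/sh
set -eu

# Throwaway sandbox; every path and name below is hypothetical.
work=$(mktemp -d)
cd "$work"

# Stand-in for the main project.
git init -q upstream-repo
cd upstream-repo
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is "master"
git config user.name "Upstream Author"
git config user.email "upstream@example.com"
echo "v1" > core.txt
git add core.txt
git commit -q -m "Initial upstream commit"
cd ..

# My fork, with the main project registered as the remote "upstream".
git clone -q upstream-repo my-fork
cd my-fork
git remote rename origin upstream
git config user.name "Fork Author"
git config user.email "fork@example.com"

# One local-only topic branch per piece of work.
git checkout -q -b topic/feature master
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Add feature"

# Meanwhile, the main project moves on.
(
  cd ../upstream-repo
  echo "v2" >> core.txt
  git commit -q -a -m "Upstream change"
)

# Bring upstream's work into master via the remote-tracking branch.
git checkout -q master
git fetch -q upstream
git merge -q upstream/master

# Rebase the still-private topic branch onto the updated master.
git checkout -q topic/feature
git rebase -q master

# Topic finished: merge it into master (a fast-forward here).
git checkout -q master
git merge -q topic/feature

# Commits that upstream lacks can be exported as patches.
git format-patch -o "$work/patches" upstream/master..master
```

Because the rebase only ever touches `topic/feature`, which was never pushed, anyone following `master` sees history that only moves forward.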

Of course, your choice of branch names and remotes is your own. I'm not sure these tips cover every scenario, but they cover most of my hurdles.

sykora
In the time since I asked this question I've gained a lot more experience with Git and have approximately settled on this as the best answer. Keeping changes under development only as local commits is the key. Definitely a very good tip. Thanks.
spectre256
Have you ever worked on a topic branch on more than one computer? If so, what did you do?
Andrew Grimm
No, I haven't worked on topic branches across computers, but it's not the "across computers" part that is difficult. It's the "across people" that is. You can just as easily push your topic branches to the server so _you_ can work with it.
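
A rough sketch of pushing a topic branch so it can be picked up on a second computer. The bare repository `server.git`, the `desktop` and `laptop` directories, and the branch name `topic/feature` are assumptions for illustration only; on a real setup the server would be the GitHub fork.

```shell
#!/bin/sh
set -eu

# Throwaway sandbox; every path and name below is hypothetical.
work=$(mktemp -d)
cd "$work"

# Stand-in for the server.
git init -q --bare server.git
git -C server.git symbolic-ref HEAD refs/heads/master

# First computer: start the topic branch and publish it.
git init -q desktop
cd desktop
git symbolic-ref HEAD refs/heads/master
git config user.name "Me"
git config user.email "me@example.com"
git remote add origin "$work/server.git"
echo "base" > notes.txt
git add notes.txt
git commit -q -m "Initial commit"
git push -q origin master
git checkout -q -b topic/feature
echo "started on desktop" >> notes.txt
git commit -q -a -m "Start feature"
git push -q -u origin topic/feature   # -u records origin/topic/feature as upstream
cd ..

# Second computer: pick the branch up and continue.
git clone -q "$work/server.git" laptop
cd laptop
git config user.name "Me"
git config user.email "me@example.com"
git checkout -q -b topic/feature origin/topic/feature
echo "continued on laptop" >> notes.txt
git commit -q -a -m "Continue feature"
git push -q origin topic/feature
cd ..

# Back on the first computer: fast-forward to the laptop's work.
cd desktop
git pull -q --ff-only
```

This stays simple only while the pushed topic branch is yours alone; once other people build on it, rebasing it rewrites history they depend on.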
sykora