When working with an SCM system, when should you branch?
When you need to make significant and/or experimental changes to your codebase, particularly if you want to commit intermediate changes, without affecting trunk.
There are several uses for branching. One of the most common is separating projects that once had a common code base. Branching is also very useful for experimenting with your code without affecting the main trunk.
In general, you would see two branch types:
Feature Branch: If a particular feature is disruptive enough that you don't want the entire development team to be affected in its early stages, you can create a branch on which to do this work.
Fixes Branch: While development continues on the main trunk, a fixes branch can be created to hold the fixes to the latest released version of the software.
You may be interested in checking out the following article, which explains the principles of branching, and when to use them:
When you need to make changes based on your current branch that are not destined for the next release from that branch (and not before then).
For example, we work on trunk usually. Around the time of release, someone's going to need to make a change that we don't want in the current release (it may be before release, at the moment it's usually after release). This is when we branch, to put the release on its own branch and continue development for the next release on trunk.
There are various purposes for branching:
- Feature/bug branches. Dynamic and active branches that get moved back into the trunk when the feature/bugfix is complete.
- Static branches (tags in Subversion, though in essence just a 'normal branch'). They provide a static snapshot of, say, a release. Even though they could be worked on, they remain untouched.
It also depends on the SCM tool you are using. Modern SCMs (git, mercurial, etc.) make it increasingly easy to create and destroy branches whenever needed. This allows you to, for example, make one branch per bug that you are working on. Once you merge your results into the trunk, you discard the branch.
Other SCMs, for example Subversion and CVS, have a much "heavier" branching paradigm. That means a branch is considered appropriate only for something bigger than a twenty-something-line patch. There, branches are classically used to track entire development tracks, like a previous or future product version.
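To make the "one branch per bug, then discard it" flow concrete, here is a toy in-memory model in Python. It is only a sketch under my own assumptions: the `Repo` class, its method names, and the bug number are made up for illustration, and this is not how git or mercurial actually store anything. The point it tries to show is that in a DAG-based tool a branch is little more than a name pointing at a commit, so creating and deleting one costs almost nothing.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Repo:
    """Toy model of a DAG-based repository (illustrative only)."""
    parents: dict = field(default_factory=dict)   # commit id -> tuple of parent ids
    branches: dict = field(default_factory=dict)  # branch name -> tip commit id
    _ids: count = field(default_factory=lambda: count(1))

    def commit(self, branch, *extra_parents):
        """Add a commit on `branch` and move the branch name to it."""
        new_id = next(self._ids)
        tip = self.branches.get(branch)
        first = (tip,) if tip is not None else ()
        self.parents[new_id] = first + extra_parents
        self.branches[branch] = new_id
        return new_id

    def create_branch(self, name, start):
        """Creating a branch only records a new name for an existing commit."""
        self.branches[name] = self.branches[start]

    def merge(self, source, target):
        """Record a merge commit on `target` whose second parent is the tip of `source`."""
        return self.commit(target, self.branches[source])

    def delete_branch(self, name):
        """Drop the name; its commits stay reachable through the merge commit."""
        del self.branches[name]


repo = Repo()
repo.commit("trunk")                        # commit 1
repo.create_branch("bug-101", "trunk")      # hypothetical bug number
repo.commit("bug-101")                      # commit 2: the fix
repo.commit("trunk")                        # commit 3: unrelated work on trunk
merge_id = repo.merge("bug-101", "trunk")   # commit 4: merge the fix back
repo.delete_branch("bug-101")               # the branch name is no longer needed
```

After the merge, the fix is still part of trunk's history even though the `bug-101` name is gone, which is why throwing the branch away is safe in this model.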
It depends on what type of SCM you're using.
In the newer distributed versions (like git and mercurial), you're creating branches all the time and remerging anyway. I'll often work on a separate branch for a while just because someone's broken the build on the mainline, or because the network's down, and then merge changes back in later when it's fixed, and it's so easy to do that it's not even annoying.
The document (short and readable) that most helped me understand what was going on in the distributed systems is: http://mercurial.selenic.com/wiki/UnderstandingMercurial.
In the older systems with a central repository (like CVS, SVN and ClearCase), it's a much more serious issue which needs to be decided at a team level, and the answer should be more like 'to maintain an old release whilst allowing development to continue on the main line', or 'as part of a major experiment'.
The distributed model is much better, I think, and lacking only nice graphical tools to become the dominant paradigm. However it's not as widely understood, and the concepts are different, so it can be confusing for new users.
Leaving all the technicalities aside.....
Branch when you know it's easier to merge back!
Keep in mind that merging will always be affected by how the work is carried out in a project.
Once this is achieved, all the other tertiary issues will come into play.
See Eric Sink's article on branching from his Source Control How-to.
Whenever you feel like it.
You probably won't very frequently if you work with a centralized SCM since the branches are part of the official repository, and that doesn't really invite much experimentation, not to mention that merges really hurt.
OTOH, there's no technical difference between a branch and a checkout in distributed SCMs, and merges are a lot easier. You'll feel like branching a whole lot more often.
In general terms, the main purpose of branching (a VCS - Version Control System - feature) is to achieve code isolation.
You have at least one branch, which can be enough for sequential development, and is used for many tasks being recorded (committed) on that same unique branch.
But that model quickly shows its limits:
When you have a development effort (refactoring, evolution, bug-fixes, ...) and you realize you cannot safely make those changes in the same branch as your current development branch (because you would break the API, or introduce code that would break everything), then you need another branch.
(To isolate that new code from the legacy one, even though the two code sets will be merged later on)
So that is your answer right there:
You should branch whenever you cannot pursue and record two development efforts in one branch.
(without having a horribly complicated history to maintain).
A branch can be useful even if you are the only one working on the source code, or if you are many.
But you should not make "one branch per developer":
the "isolation" purpose is made to isolate a development effort (a task which can be as general as "let's develop the next version of our software" or as specific as "let's fix bug 23"),
not to isolate a "resource".
(a branch called "VonC" means nothing to another developer: What if "VonC" leaves the project? What are you supposed to do with it?
a branch called "bugfix_212" can be interpreted in the context of a bug tracking system for instance, and any developer can use it with at least some idea about what he is supposed to do with it)
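As a small illustration of naming branches after tasks rather than people, here is a Python sketch that maps a branch name back to a ticket. The `<kind>_<number>` pattern, the list of kinds, and the `describe_branch` helper are my own assumptions, modelled on the "bugfix_212" example above; a real team would pick its own convention.

```python
import re

# Hypothetical convention: "<kind>_<ticket number>", e.g. "bugfix_212".
TASK_BRANCH = re.compile(r"^(bugfix|feature|release)_(\d+)$")


def describe_branch(name):
    """Return the task a branch name refers to, or None if it names no task."""
    match = TASK_BRANCH.match(name)
    if match is None:
        return None
    kind, ticket = match.groups()
    return {"kind": kind, "ticket": int(ticket)}


task = describe_branch("bugfix_212")   # can be looked up in a bug tracker
person = describe_branch("VonC")       # tells another developer nothing
```

A name that resolves to a ticket gives any developer somewhere to look; a name that resolves to nothing is the "resource" branch being warned against.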
A branch is not a tag (SVN is a Revision System which tries to propose versioning features like branching and tagging through directories with cheap file copy: that does not mean a tag is a branch)
Defining a branch also means defining a merge workflow: you need to know where to merge your branch when you are done with it.
For that, chapter 7 of Practical Perforce (Laura Wingerd, O'Reilly) is a good, VCS-agnostic introduction to merge workflows between different kinds of branches: "How Software Evolves" (pdf)
It defines the term codeline (a branch which records significant evolution steps of the code, either through tags at certain points, or through important merges back to the branch)
It introduces the mainline model (a central codeline to record releases), and describes various purposes for branching:
- Active development streams: a persistent codeline where various sequential developments take place
- Task branches: short-lived branches for a more specific task (a bug-fix is a classic one, but you can also define a branch for a merge effort you know to be complex to complete: you can merge, commit and test in that task branch without introducing problems for the main current development branch)
- Staging branch: for preparing a release, with some pre-production specific data or config files.
- Private branches, ad hoc branches, and sparse branches: for very small tasks, just to be able to commit some work in progress without waiting for formal completion or test review.
That allows you to "commit early, commit often".
Other interesting concepts around VCS: Basics concepts
(about ClearCase originally, but also valid for any VCS)
I find the advice from Laura Wingerd & Christopher Seiwald at Perforce is really concise and useful:
* Branch only when necessary.
* Don't copy when you mean to branch.
* Branch on incompatible policy.
* Branch late.
* Branch, instead of freeze.
See http://www.perforce.com/perforce/papers/bestpractices.html for a detailed explanation of each of them and other best practices.
Visual Studio TFS branch guidance. Concepts apply to any source control system.
All the 21st-century SCMs are telling you:
Branch for every task you have to work on, no matter whether it is a new feature, a bugfix, a test, whatever. This is called a topic branch, and it changes the way you work with your SCM.
You get:
- Better isolation
- Better traceability -> you associate tasks with branches, not individual changesets, which makes you free to commit as many times as you want and doesn't impose a limit like "one checkin per task".
- Tasks are independent (normally starting from a stable baseline, so you only focus on your code, not on fixing bugs from your folks), and you can choose whether you want to integrate them at some point or later, but they're always under version control
- You can review code easily (from the version control, not pre-commit bullshit) before hitting the main line
Tools that can do it:
Tools that CAN'T do it:
- SVN
- CVS
- VSS
- TFS
- Perforce