Hi All,

I'm doing some formal training in Agile at the moment, and one question I have is about the value of continuous builds versus the value of committing to the version control system often.

My understanding with version control is that it's better to commit often, because then you have history and the ability to go back to previous changes in a fine-grained way.

My understanding with Agile and continuous build is that it's there to put pressure on the developers to always have working code. Breaking the source tree is a taboo thing to do.

Now I agree with both of these sentiments, but it occurs to me that sometimes they might work against each other. You may be in the middle of a largish code change and want to commit code to make sure you have history, but this will break the source tree.

Anybody got any thoughts on this?

Cheers

Mark.

+10  A: 

Branches/tags resolve this, in most source control systems.

They let you mark or just 'branch' (pun intended) a segment/revision of code and have that as the 'stable release'. You can then commit changes to the main trunk, or a 'patch' branch, or other approaches.

The two concepts work together.

Noon Silk
+2  A: 

Use Git for easy branching, merging, and rebasing.

SquareCog
+2  A: 

Silky is spot on: branching/tagging resolves this (svn plug for this functionality).

I am a big fan of committing often, and I personally find it makes it easier to avoid breaking the build, because I am unit testing a smaller amount of code each time.

Russell
A: 

For a significant or large change that is likely to break dependent pieces of code, a branch would be appropriate. By the time you integrate the change and check it into the trunk (or whatever integration branch you are going to promote it to), it is essential to have addressed the breakages and to have all the tests passing.
I don't think the two things should be working against each other. Use of branches or distributed source control would make this easier to manage.

Hamish Smith
+3  A: 

Actually, a common Agile philosophy (that I've been pretty happy with) is along the lines of "If you can't commit before going home, revert."

At first this sounds brutal, so I usually copy off my source tree locally first or branch it, and then I revert back to where I started. The next day I start the work over. It usually goes VERY FAST, I improve over what I did the previous day, and I rarely if ever look at the copy. (Well, sometimes I'll pull back some classes that I had "completed" and felt sure of and re-integrate them.)

I hardly ever have the need to go more than a few hours without checking in. I try using refactors (they are ALWAYS very short and harmless or they aren't refactors) and I add code in such a way that things don't break. This might involve adding tested code (a new method or object) and checking it in before linking in the rest of the code.
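A minimal sketch of that last idea, in Python. The names `legacy_total`, `discounted_total`, `checkout`, and the test class are hypothetical and only illustrate the shape: the new function is checked in together with its tests, while the existing call path is left untouched until a later, equally small commit wires it in.

```python
import unittest


def legacy_total(prices):
    """Existing behaviour that callers already depend on."""
    return sum(prices)


def discounted_total(prices, percent_off):
    """New code: checked in with its tests, but not called by anything yet."""
    if not 0 <= percent_off <= 100:
        raise ValueError("percent_off must be between 0 and 100")
    return sum(prices) * (100 - percent_off) / 100


def checkout(prices):
    """Production entry point; still uses the old path in this commit."""
    return legacy_total(prices)


class DiscountedTotalTest(unittest.TestCase):
    def test_applies_percentage(self):
        self.assertEqual(discounted_total([10, 30], 25), 30.0)

    def test_rejects_invalid_percentage(self):
        with self.assertRaises(ValueError):
            discounted_total([10], 150)

    def test_checkout_is_unchanged(self):
        # The existing behaviour must not move in this commit.
        self.assertEqual(checkout([10, 30]), 40)
```

Because nothing in production calls `discounted_total` yet, this commit cannot break existing behaviour, and the follow-up commit that switches `checkout` over is small enough to review and revert easily.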

Overall, your unit tests should ALWAYS run. I tend to run tests as often as a few times a minute and rarely less often than once every ten minutes.

Taking the small steps may take a little longer, but you will avoid those 3-4 day code rewriting sessions where you can't run any tests or check in. Those can be brutal and a HUGE waste of time!

Bill K
+6  A: 

What could be less Agile than a taboo about something ever going wrong? I would argue that the taboo is to leave the build broken rather than to break the build at all. The occasional build breakage is ok. This is exactly why you are running continuous builds and tests. The CI build/test identifies when the build is broken, and ideally who broke it. This ensures that it's fixed quickly. If this happens occasionally, you're ok. If it happens twenty times a day, the team is probably in trouble.
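To make "identifies when the build is broken, and ideally who broke it" concrete, here is a rough Python sketch. It is not how any particular CI server works; `BuildResult` and `find_breaking_build` are made-up names, and it assumes one build per commit, ordered oldest first.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class BuildResult:
    commit: str   # hypothetical commit identifier
    author: str
    passed: bool


def find_breaking_build(history: List[BuildResult]) -> Optional[BuildResult]:
    """Return the build that started the current run of failures.

    Returns None if there is no history or the latest build is green.
    """
    if not history or history[-1].passed:
        return None
    culprit = history[-1]
    # Walk backwards through the current streak of red builds.
    for build in reversed(history):
        if build.passed:
            break
        culprit = build
    return culprit
```

In practice a build often covers several commits, so the most a tool like this can do is narrow the search; the point is to get the fix started quickly, not to assign blame.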

The taboo is to interfere with other people getting their work done. When you break the build, they get an email saying, "our branch of source is busted". They won't be able to integrate other people's changes, or their changes with the mainline until they get the all clear email.

The real challenges in working in this sort of continuously integrating environment are:

1) Keeping the teams pretty small. Generally we start seeing trouble once about 25 developers are on the team. Things start getting fragile. Using team-level branches, components, or multi-stage CI with streams can help larger teams break into smaller teams.

2) Choosing small units of work. There generally shouldn't be a conflict between checking in improvements regularly and not breaking everything. Commits should be done when small, working changes are made. The new feature might not be exposed to the user yet, but if a coherent API change is made that doesn't break tests, check in.

3) Fast, accurate Builds. There are a lot of race conditions that the team tends to win more often when the build gets faster. Plus reproducible builds will ensure that the build the developer does on her own machine (which she had time to do because it was fast) reasonably accurately predicts success on commit.

EricMinick
Thanks for those insights, Eric. Although I think Silky answered my question directly, your answer addressed other questions around this that I hadn't asked, hence the upvote.
Mark Underwood
A: 

You'll have to examine what kind of history makes sense during a merge. Let's say you have a program that uses lots of loadable modules: a kernel, a web server, whatever.

In writing one module, you make 200 commits. When you merge with the main project, you probably want only one (albeit big) patch, perhaps two:

  • Introduce foo module
  • Update Makefiles to build foo

This is one of the reasons why Git has become such a dominant presence in the world of DVCS.

Your choice of commit frequency really has no bearing on what method of software development you want to employ. You can commit 200 well-tested revisions, or one, as long as people pulling what you push don't ingest toxic revisions or regressions in their code caused by your own (unless, of course, your code exposes a problem in theirs).

I (personally) like to make many small commits, for the same reasons that you gave. In fact, it's usually ideal if everyone is working on one central branch. However, if you're working on some subsystem for 6 months, I really would prefer you send me a few large patches rather than have me inherit your whole history. You always have your history in your own working repo, and it's probably only interesting to you :)

Tim Post
+2  A: 

I'll add yet another answer, because to me it seems some of the most important points haven't been mentioned.

My understanding with version control is that it's better to commit often, because then you have history and the ability to go back to previous changes in a fine-grained way.

I absolutely agree about this.

My understanding with Agile and continuous build is that it's there to put pressure on the developers to always have working code.

It is not there to put pressure on developers; I'd rather describe continuous integration as a friendly safety net that helps you catch problems as soon as you commit them, when fixing them is usually easy. (Check Martin Fowler's seminal article for more CI benefits.) It is important to always have working code, and that's where version control branches come in, as silky pointed out. But unlike the traditional scenario he describes (and what Fowler talks about: "Everyone Commits To the Mainline Every Day"), I'd recommend the opposite: keep your main trunk stable, preferably always in releasable shape, and do all major development in temporary working branches.

I've plugged the stable trunk approach on SO here and here; see those posts for some justification and experiences of this model. Also, I warmly recommend this article which influenced my thinking a lot: Version Control for Multiple Agile Teams by Henrik Kniberg.

Breaking the build in a dev branch is far from a taboo, although you should still try to keep everything compiling and all tests passing. Breaking the trunk build, in this scenario, is somewhat more serious, but still I wouldn't call it a taboo — these things happen from time to time, and instead of finding someone to blame, it is infinitely more important for the team to just fix it (and be happy that the problem was found now, instead of much later, perhaps by a customer).

Jonik
+1 - Great post. Good link on the VC for Multiple Agile teams too.
Mat Nadrofsky
A: 

We do both here. We use Continuous Integration and check in multiple times per day.

  • If I'm writing something large I need to divide my work into manageable chunks to make sure the build does not fail. This leads to me thinking about how to set up my code which in turn leads to better factored code.
  • If I'm solving issues I do a checkin after each resolved issue, even if this means there will be a new build every hour (If you have a huge build, which takes more than a couple of minutes to run, this might not be the best solution).
  • When refactoring I also try to check in multiple times; if I'm doing it right, the code should only be broken for a couple of minutes on my machine.

In short, I don't see these things as opposites: dividing your work into smaller chunks so you can check in often leads to cleaner code, I would say.

Ruben Steins
A: 

If the largish code change is in a separate branch, it doesn't necessarily break the build. By having a branch unto itself, the changes are kept out of the main code until the change is done, and then the whole thing can be merged back into a trunk or main code line. The key is that while there is a continuous build happening, it isn't necessarily going to include things that aren't "done-done."

JB King
A: 

It seems to me that Git solves this problem. Keep a local repository and commit early, commit often, and then when the code reaches a non-broken milestone, push out to the main shared repository. If the whole team uses Git, then the entire repository history can be maintained in everybody else's repository when they pull changes. And all without necessarily breaking the build.

And with rebasing you don't even have to expose your entire local commit history when you push out milestones.

Gordon Potter
+1  A: 

TDD Lets You Have Both

One resolution to this apparent paradox is the agile software development practice of Test-Driven Development (TDD). Practiced well, it makes it easy to commit code often and have continuous builds that are rarely broken by non-working code.

First get the latest code from the repository and run all the tests. (If they don't all pass, bust the last person to commit.) Write a test for a small piece of functionality (before you've implemented the functionality), then implement the functionality, and run all the tests again. Update your code from the version control system in case anything has changed while you were working, run all the tests again, and if they all pass, you could commit right then. That much is "Red-Green" of the agile concept Red-Green-Refactor. After that, do any needed refactoring and run all the tests again. If you're still green, you could commit again at that point.
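As a small illustration of one Red-Green cycle, here is a sketch using Python's standard `unittest` module. The function `word_count` and the helper `all_tests_pass` are hypothetical examples, not part of any real project; the update-from-version-control steps are left out because they depend on the tool your team uses.

```python
import unittest


def word_count(text):
    """Step 2 (green): the simplest implementation that passes the tests."""
    return len(text.split())


class WordCountTest(unittest.TestCase):
    """Step 1 (red): written first, before word_count existed."""

    def test_counts_words_separated_by_whitespace(self):
        self.assertEqual(word_count("commit early  commit often"), 4)

    def test_empty_string_has_no_words(self):
        self.assertEqual(word_count(""), 0)


def all_tests_pass():
    """Run the suite the way you would right before each commit."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(WordCountTest)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.wasSuccessful()


if __name__ == "__main__":
    # Only commit when the suite reports success.
    print(all_tests_pass())
```

The refactor step would then change the body of `word_count` (or the code around it) without touching the tests, followed by another run of `all_tests_pass()` before the second commit.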

Most agile teams have a Continuous Integration server that runs on a regular schedule (often hourly or more frequently) and has a large visible indicator (like a traffic light) that shows whether the most recent build has passed, failed, or is in progress.

Have a Local Version Control Database if You Have To

If you absolutely can't get away from having "largish code changes", then use your own local version control repository, using something like Git as Gordon Potter suggests, and commit to the shared repository when you're done with your change. You can do this even if your team uses some other version control product.

JeffH