views:

315

answers:

12

I was just checking in my code and realized that I'm have no idea what criteria my code should ideally meet to perform a check in.

Yesterday I checked in after having it my project checked out for a week. Today I checked in again after making little updates. Sometimes I like to keep the code I'm working on checked out until it can get through a little bit of testing.

My boss usually gripes about people keeping code checked out over the weekend or vacations, so I try to avoid that (if for no other reason than because I forget what I'm working on).

Should I be concerned about hitting a critical mass of code or not being able to find bugs because there's too much code to search though?

A: 

You should only check in code which is working and doesn't break a build, you shouldn't try and check all code in together for a given feature creating a nice manageable change set (assuming your version control system supports change set concepts).

I'm not sure what you mean by diskspace or search-ability.

JoshBerke
I'd say keep the commit together, even when the SCM system doesn't support changesets. CVS doesn't support changesets but cvs2svn can extract changesets from a CVS by correlating commits that occured at the same time and have the same comments.
Joachim Sauer
Disk space, as in physical disk space from saving thousands of changes every day. I tried not to sound like an idiot writing the question so I didn't mention we use VSS with an upper limit of like 2gbThe ability to search is important when debugging, if you check in too much it'll be harder...
Peter Turner
It's ok I'm still stuck on VSS right now but not for to much longer...If your worried about the size of the DB you can do archiving to trim down the size. How large is your current data folder?
JoshBerke
1 gig even. I just checked, I guess we do archive after a few releases.
Peter Turner
+2  A: 

If your source code management system allows it, I prefer to make a private branch or activity, and do checkins to that at least daily, or more frequently if I'm going to start something that's risky. I only merge or deliver to make that accessible to others after the bit I'm working on actually works.

Paul Tomblin
+2  A: 

As much as is practical, I try to keep my check-ins as granular as possible, with the condition that I won't/don't check in anything that a) doesn't compile, b) breaks unit tests, or c) break backward compatibility (unless it's requested/designed that way).

If I'm concerned about leaving new or updated code on my machine for any length of time I typically create (or ask to have created) a temporary branch for my own use, but I only use that for as little time as I have to before I can rejoin the branch I was working in.

JMD
+9  A: 

I can recommend the following guidelines:

  • Each check in should be "atomic," meaning that you only change one part of the system. One large check in that changes a whole lot of things is not recommended, because then it becomes difficult to revert just one change.
  • Don't check in code until it has been thoroughly tested.
  • Do check in code as soon as it's ready. It's good for backup purposes and because your working copy gets out of sync with the latest changes in source control the longer it sits there.
  • Whether you check in frequently or infrequently, disk space should be roughly the same.
  • For searchability, just make sure your check in has a comment attached to it.
davogones
+7  A: 

Check in early and often, just like voting.

One caveat on that is it makes a little bit of a difference how your integration process works. If you have continuous integration, you don't want to check things back into the trunk unless you've run unit tests and feel confident it'll work.

The way to reconcile these two things is to create a branch for your workspace, check in often on the branch, and merge back when you've made and tested changes. The annoyances of making big merges over many changes will encourage you to break them down into small steps. Which is a Good Thing.

Charlie Martin
+2  A: 

If disk space is a concern, you really need bigger hard drives. Even in a redundant, secure environment, they're cheap enough that it's ridiculous to ever have to think about them.

As for how often you check in, it's often going to be a matter of personal policy. Within my own seven-coder team, I tend to check in once per job ticket while one coworker averages about six check-ins a day.

It's really a matter of what you're comfortable with. The only official policy is: Don't break the build and don't break working features. If you may have to hand over your work mid-stream, check in more often. This works for us.

Jekke
I edited that, I didn't mean physical disk space, so much as the maximum database size for the source control system.
Peter Turner
As the guy who also does admin on our devel server, I have to keep the size down to fit into one backup set. I don't want the pain of needing more. As a result disk space is limited to 100 GB even though the arrays have 400 GB. Tough! Clean out your home dirs more often. :-)
Zan Lynx
+2  A: 

I check in often.

Two things source control gives you that you lose from not checking in often are:

  • simplifying collaboration
  • tracking history of changes over time

Other developers need to be aware of changes that are happening in the codebase. Checking in your changes early notifies them of what you are working on. It can help against duplication of effort (either multiple people implementing the same thing of fixing the same bug). Infrequent checkins may lead to significant and unexpected changes to the code, which is much more likely to result in difficult debugging situations (less so because of the code you just checked in, but because of the complexities of it interacting with the rest of the system).

Your source control system can't help you revert changes if it doesn't know what changes have been made. Additionally, if you were to lose the local changes you've been working on for the past week, the damage would be much greater.

You shouldn't limit yourself because of disk space. SCM systems are pretty intelligent about their disk usage, but in any case storage is cheap and plentiful.

That said, I suggest you divide the work you have to do into small pieces, and check in after you've completed one such piece. It'll be easier on you and the other developers.

PCheese
+10  A: 

Check ins should be made often.

A check in should be

  • Atomic. Contain all needed changes but no more. Which also means that
  • whitespace changes go on their own commit.
  • change only one functionality. This might change several functions, but it shouldn't change functionality in several ways and / or places.
  • commented. A check in comment should enable you to quickly find a commit, even if it's 2 years old and you aren't working on that particular source (or even project) anymore.
  • working. Only commit if the code actually builds and does what it claims to do.

If you need your personal playground for committing consider a working branch or mirroring the repository with something like git svn. Remember to merge changes done in an atomic way as much as possible. This might be more work but it immediately pays off if you need to figure a commit that changed or possibly broke something.

bluebrother
+1  A: 

Question: what sort of VCS are you using? Your comment that your boss complained when you left code checked out over a vacation seems odd. I always have code checked out, using Subversion, and nobody cares.

If you use a VCS that requires reserved checkouts, or you are required to fake reserved checkouts, you're incapable of using best practices, and your organization should upgrade to Subversion or better.

This may well not be up to you, but if you are using an antique VCS, that could affect the answers being given.

David Thornley
Right, we're using Visual Source Safe 6.0 - that probably tainted the question quite a bit.
Peter Turner
A: 

This is the procedure we use:

  • A developer gets a task with a Design Document describing the new features or bug fixes.
  • That developer makes those changes and tests them.
  • The new code is peer reviewed by at least 2 other developers making sure the features and bugs described in the Design Doc have been implemented or fixed.
  • Any changes suggested by the code reviewers are made. If necessary, another review is held.
  • When the reviewers are satisfied, and the head developer gives the OK, the developer takes "control" of the repository (procedurally - no one else can check anything in until control is released), checks in the changes, checks out a new copy of the application and compiles and tests that it still builds and runs successfully.

That means that it may be weeks, sometimes months between commits for a developer, but this gives us a lot of protection against breaking the build and allows us to closely track the tasks and the changes associated with each one.

Scottie T
What version control system are you using? Doesn't it support branches?
Dave Sherohman
We use good ol' CVS.
Scottie T
+1  A: 

I agree with most of what's been said here, but I'd add the obvious point that it's less important to do frequent check-ins to the central repo if you're using a distributed vcs like bzr or git. In this case you can also tilt a little more towards the untested commits as long as you're just updating your local repository. It's convenient if you're breaking for the day to just go ahead and commit even if you haven't tested yet because you don't have to worry about any problems affecting anyone else.

Brandon Thomson
+1  A: 

How frequently you check source code in is really a function of few things, most notably, culture and process. You would want to observe good "hygiene" by checking in frequently so as to minimise risk of data loss, but there are other benefits to this practice which may apply more or less depending on your methodology:

  • In an agile environment, regular check-ins lower risk of the build breaking or of encountering merge conflicts with other programmers;

  • More frequent check-ins imply more granular check-ins which means you're gathering alot more information---this is useful for getting stats on project health and for hunting down bugs;

  • Granular check-ins mean that check-in comments are more focused, and there's less chance of under-describing changes that were made;

  • Granular check-ins ensure that inter-version diffs become easier to navigate.

Your specific question though is about check-in criteria---this of course is a matter of style. As a rule of thumb, I aim for breaking a task down into 1 to 3 hour sub-tasks, each of which has a specific goal---check in once you've finished a sub-task. "Finished" is subjective, but in my book it means working i.e., all unit tests pass, and your code coverage is at an acceptable level (acceptable to you, that is).

Eric Smith