We use ClearCase at my workplace. Part of our standard process when code is merged to the main (trunk) branch is to completely eradicate all of the versions on development and integration branches. Because this wipes out all of the check-in comments that went along with these versions, our source files have to have a lengthy prologue comment that identifies each change.

On a few occasions I have pointed out that this negates one of the fundamental reasons for using a version control system, and stated that by removing versions it becomes impossible to see who originally worked on something, when problems got introduced, etc. People checking in new versions have learned not to bother entering a check-in comment because it's just going to be deleted anyway.

The justification I have heard for removing the old versions has usually just come down to "feel-good" reasons. My more experienced coworkers feel that removing these old branches makes the version trees for files "cleaner". They claim that there is no reason to keep these old versions around once they've been merged to our trunk. They're also concerned that other developers will accidentally keep these outdated branches in their view config specs. Finally, they argue that removing these branches saves disk space on the CM server.

Am I right to have a bad feeling about this, or are there other development shops out there who operate successfully in this way? If you also think this is a bad idea, what other arguments in favor of keeping old versions would you supply? If you have operated successfully with this kind of process, what sort of benefits have you observed?


Edited to clarify: Previous versions of the trunk are always preserved. It's the branches where the stuff originally got created or modified that are removed.

+1  A: 

Removing versions is incomprehensible to me. It sounds like your coworkers are trying to idiot-proof the use of ClearCase rather than provide proper documentation, support, and training on its capabilities.

Unfortunately though, I too have encountered very similar situations. When starting future projects, you should try to make clean and clear arguments for how you believe version control should be done. Maybe if the process the project starts with is established properly, they will all see the advantages and apply them to future projects.

Tim Bender
+3  A: 

It doesn't make sense to me to use version control if you aren't going to keep the versions.

The main benefit of version control, in my opinion, is the ability to go back in time. I find myself constantly checking previous versions of files to figure out why or how something was changed.

This has come in especially handy as requirements evolve and you find you really did need that code you wrote three months ago.

Kit Menke
+1  A: 

There are no benefits to removing revision info. Even revision info with no checkin comments is 1000x better than no revision info.

The most compelling case for keeping revision info (beyond file recovery) is that when you find a bug and track it back to the checkin where it was introduced, it is usually a good idea to look around for checkins by the same user around the same time. The bug might be more extensive than it first appears. Can't do that without revision info.

Change jobs. You'll live longer. Working with programmers who don't understand the benefits of version control cannot be good for your ongoing health.

jmucchiello
+2  A: 

You've already noticed one big problem: with the removal of commit comments, there's no automatic record of how the code got to be the way it is. There's also no way to examine the history of how the software got to be the way it was: it's one big lump, with large prologue comments that may or may not be accurate.

Frequently when I come across code that looks odd, I want to know how it came to be. Since we use Subversion, I can use "svn blame" to find what revision the line appeared in, and check it from there. This will usually lead to understanding the purpose of the code, and give me a clue what I might break by changing it. It's also often useful to find when a feature was added or removed.

While this may save some space, a good VCS will store deltas and therefore won't take up all that much extra space (note: I don't know if ClearCase is good in this way). In the meantime, the files you're using are swollen by the prologue comments and likely by code that's commented out or conditionally compiled in case it's going to be useful later on.

As somebody who used to administer a VCS, I'd say there are only two reasons to delete something from the system. One is if something got committed that shouldn't have been, and is causing problems (it may be very large, for example - some people have had problems when somebody committed not only the source files but all the binaries), and the other is if it's inappropriate (such as confidential information).

David Thornley
+1  A: 

No, I don't think the benefits outweigh the costs. The costs are obvious (and covered well by existing answers) so I'll address the so-called benefits.

  1. If you want a "cleaner" source tree, there are plenty of other features that don't involve destroying information. Most source control systems can place items in a deleted state that's hidden from standard UIs (unless you turn on special options); enforce Read permissions on a per-item basis; move the items to a predefined area of the tree (eg someplace called "Old Stuff"); or all of the above.

  2. To my knowledge, every SCC system that's powerful enough to support custom view specs also allows administrators to audit and overwrite these settings.

  3. Source code simply isn't that big. Especially after you consider compression + delta storage. Only in a few niche applications where large binary files are involved might I consider permanent removal. (E.g., game development studios go thru a TON of high-resolution artwork and video.) Even so, my retention policy would not be so brash as the one you describe. I'd keep snapshots at predefined intervals instead. Say: all revisions for 2 weeks, then daily for the previous 2 weeks, weekly for a few months prior to that, then monthly back to the beginning.
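To make the tiered retention idea in point 3 concrete, here is a minimal sketch. It is not a ClearCase feature or any tool's actual API: the function names are hypothetical, "a few months" is assumed to mean 90 days, and "keep one per day/week/month" is assumed to mean keeping the newest revision in each bucket.

```python
from datetime import datetime, timedelta


def retention_bucket(ts, now, few_months_days=90):
    """Return None if the revision is always kept, else a bucket key.

    Tiers (ages are assumptions based on the policy described above):
      - up to 14 days old: keep every revision
      - 14 to 28 days old: one per calendar day
      - next `few_months_days` days: one per ISO week
      - older than that: one per calendar month
    """
    age = now - ts
    if age <= timedelta(days=14):
        return None
    if age <= timedelta(days=28):
        return ("day", ts.date())
    if age <= timedelta(days=28 + few_months_days):
        iso = ts.isocalendar()
        return ("week", iso[0], iso[1])
    return ("month", ts.year, ts.month)


def revisions_to_keep(timestamps, now):
    """Return the timestamps to retain, oldest first.

    Within each bucket only the newest revision survives.
    """
    kept = []
    seen = set()
    for ts in sorted(timestamps, reverse=True):  # newest first
        bucket = retention_bucket(ts, now)
        if bucket is None:
            kept.append(ts)
        elif bucket not in seen:
            seen.add(bucket)
            kept.append(ts)
    return sorted(kept)


if __name__ == "__main__":
    now = datetime(2024, 6, 30, 12, 0)
    revisions = [
        datetime(2024, 1, 5, 10, 0),
        datetime(2024, 1, 20, 10, 0),
        datetime(2024, 6, 10, 9, 0),
        datetime(2024, 6, 10, 17, 0),
        datetime(2024, 6, 29, 9, 0),
        datetime(2024, 6, 29, 15, 0),
    ]
    for ts in revisions_to_keep(revisions, now):
        print(ts.isoformat())
```

With the made-up dates in the example, both revisions from the last two weeks survive, while the two same-day revisions from June 10 and the two January revisions each collapse to the newer one.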

Bottom line, developer time is really frackin' expensive. A few minutes of CC administration + a few extra hard drives in the SAN don't even compare.

Richard Berg
+4  A: 

If there is one thing you do not do with ClearCase, it is removing versions.
Never. (Almost never anyway).

ClearCase is heavily based on deltas; its baselines can be set in incremental mode and will need previous versions to get their content right; and, more generally, the history of files is what it is all about. (That includes the history of merges: rmver will remove all xhlinks, i.e. all hyperlinks such as merge information.)

If what you really want is to clean up the history, cleartool lock -obsolete is the right answer.
Just obsolete the old branches from the trunk you do not want to see anymore. It is usually more than enough.

The only cases where removing versions could be justified would be:

  • new versions with "wrong content" that you did not mean to version at all: if it is the latest one, with no label or hyperlinks, you could remove it rather than change it...
  • large binary files that change really often (then rmver for old versions could actually save space). For all other kinds of files, removing old versions will not save significant space.
VonC
Fortunately, because of the way ClearCase merges work, removing the "source" version of a merge doesn't mangle the versions following the destination version of a merge.
Mike Daniels
@Mike: In Base CC, you are correct (but you will no longer be able to tell *why* those particular lines suddenly appear in your main branch). With UCM... that is another story (but the OP does not use UCM)
VonC
+1  A: 

Being able to blame someone for a specific change is often very useful. With your process, once all the changes are in the trunk, all of the information on who wrote what and why is lost. Being able to show someone "Here, you did this, there" is often very useful in and of itself, even without the commit comments.

shoosh
A: 

Without a version control history, a very useful debugging tool becomes impossible: searching the history by bisection to find the first revision that introduced a bug. Some VCSs provide an "scm bisect" command to help with this, even up to complete automation if you have a scripted test. Once you find the commit that introduced the bug, and provided you followed good version control practices of small, single-issue, well-described commits, it is very easy to find where the bug is.
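For illustration, the bisection idea can be sketched in a few lines. This is a generic binary search, not the implementation of any particular VCS's bisect command. The function name, the revision numbers and the `is_bad` test are all hypothetical. It assumes the revisions are ordered oldest to newest and that once the bug appears it stays in every later revision.

```python
def first_bad_revision(revisions, is_bad):
    """Return the earliest revision for which is_bad(revision) is True.

    Assumptions: `revisions` is ordered oldest to newest, the newest
    revision is bad, and every revision after the first bad one is bad too.
    """
    if not revisions or not is_bad(revisions[-1]):
        raise ValueError("the newest revision must exhibit the bug")
    lo, hi = 0, len(revisions) - 1  # invariant: revisions[hi] is bad
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(revisions[mid]):
            hi = mid        # the bug is already present at mid
        else:
            lo = mid + 1    # the bug was introduced after mid
    return revisions[lo]


if __name__ == "__main__":
    # Hypothetical history: revisions 100..119, bug introduced in 113.
    history = list(range(100, 120))
    tested = []

    def scripted_test(rev):
        tested.append(rev)  # record which revisions had to be built
        return rev >= 113

    print(first_bad_revision(history, scripted_test))  # prints 113
    print(len(tested), "of", len(history), "revisions tested")
```

The point is that each test halves the remaining range, so only a handful of revisions need to be built and tested. None of this works if the intermediate versions have been deleted.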

Jakub Narębski