Hi everyone! This is one of the headers I found in wxWidgets, and I like it. Is there a way to insert a header like this into all my source files and keep it updated automatically? It includes two SVN keywords ($Date$ and $Id$) that I am aware of.

/////////////////////////////////////////////////////////////////////////////
// Name:        <filename>.cpp
// Purpose:     
// Author:      <AuthorName>
// Modified by:
// Created:     $Date$
// RCS-ID:      $Id$
// Copyright:   (c) <Year> <AuthorName>
// Licence:     <licensetype>
/////////////////////////////////////////////////////////////////////////////
+2  A: 

One option would be to put a pre-commit hook in your Subversion server that checks that the header exists. There's nothing you can really do to make sure it's kept up to date though, beyond self and team discipline; you could check some of them automatically (e.g. Created, Modified by, etc), but you could use Subversion properties for those in the first place, and the rest are judgment calls. How could you automatically update the file's purpose?
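
As a rough illustration of the header check such a hook could run, here is a minimal Python sketch. The function name `missing_header_fields`, the `REQUIRED_FIELDS` list and the 15-line window are my own assumptions, modelled on the header in the question; none of this is a standard Subversion facility.

```python
import re

# Hypothetical list of labels the hook insists on; adjust it to your own template.
REQUIRED_FIELDS = ("Name:", "Purpose:", "Author:", "Created:",
                   "RCS-ID:", "Copyright:", "Licence:")


def missing_header_fields(source, max_lines=15):
    """Return the required labels that do not start a // comment near the top."""
    head = source.splitlines()[:max_lines]
    missing = []
    for field in REQUIRED_FIELDS:
        # A label counts as present if some line looks like "// <label> ...".
        pattern = re.compile(r"\s*//\s*" + re.escape(field))
        if not any(pattern.match(line) for line in head):
            missing.append(field)
    return missing
```

In an actual pre-commit hook the file text would presumably come from `svnlook cat -t TXN REPOS PATH` for each path reported by `svnlook changed`, and the hook would exit non-zero when the returned list is not empty. Note that this only checks that the labels exist, not that their values are sensible.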

Generally speaking though, I'm not very fond of this sort of thing. You already know the file's name (duh), author, modifier, creation date, etc: just ask Subversion. Putting all this at the top of the file wastes space at best, and can be wrong at worst. The file's purpose tends to be useful, but you have to be sure to keep it updated, which is more a question of coding style than anything.

C Pirate
I hold to the same philosophy; however, there is a good argument for such comments in commercial code (i.e., where the version control system isn't necessarily used by everyone, regardless of what best practices might be). When code is shared by email, it can also be useful to have the comments in the file.
Arafangion
Another reason I dislike fields updated by version control is that they can create spurious differences, i.e. the only difference is the date field. That plays havoc with trying to find real differences in large file sets. It's not worth it for things with limited value to begin with.
Steve Fallows
+1  A: 

It depends in part on which VCS you use. For my own work, I used SCCS up until 1999, but switched to RCS to avoid problems with Y2K (the SCCS date format uses 2 digits for the year, which I find unacceptable). As a result, I have a strong view on how to make reasonably sane use of those systems. Somewhere in SO I have already discussed what goes in my file headers, but it is simpler to find an illustration than that other answer...

/*
@(#)File:           $RCSfile: stderr.c,v $
@(#)Version:        $Revision: 9.14 $
@(#)Last changed:   $Date: 2009/07/17 19:00:58 $
@(#)Purpose:        Error reporting routines
@(#)Author:         J Leffler
@(#)Copyright:      (C) JLSS 1988-91,1996-99,2001,2003,2005-09
@(#)Product:        :PRODUCT:
*/

This is one of my oldest source files - migrated from SCCS to RCS. The VCS (that is, RCS) automatically maintains the values of the $RCSfile$, $Revision$, and $Date$ values. I have a shell script that drives a Perl script to maintain the Copyright dates; I need to remember to use it the first time I edit the file in any given year. I have not bothered, yet, to make a filter script that just hacks the copyright line (which is moderately unusual for me - I make lots of scripts). That file is in its "undistributed" format; when I distribute it with a product, the ':PRODUCT:' meta-keyword is expanded to name the relevant product (by my release building software). Clearly, neither my name nor the purpose of the file needs much maintenance. (As an aside, I still prefer the SCCS way of managing keywords - the SCCS equivalents of $RCSfile$ etc.)
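
To make the copyright-date chore concrete, here is a small Python sketch of the year-list update. It is not the shell/Perl pair described above; the function names are hypothetical, and it assumes the year list follows the format shown in the header (comma-separated years and ranges, with a two-digit range end meaning the same century as the start).

```python
def expand_years(spec):
    """Turn '1988-91,2001' into the set {1988, 1989, 1990, 1991, 2001}."""
    years = set()
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if len(end_text) == 2:
                # Assumption: a two-digit end is in the same century as the start.
                end += start - start % 100
            years.update(range(start, end + 1))
        else:
            years.add(int(part))
    return years


def compress_years(years):
    """Turn a set of years back into the compact comma/range notation."""
    runs = []
    for year in sorted(years):
        if runs and year == runs[-1][1] + 1:
            runs[-1][1] = year          # extend the current run
        else:
            runs.append([year, year])   # start a new run
    parts = []
    for start, end in runs:
        if start == end:
            parts.append(str(start))
        elif start // 100 == end // 100:
            parts.append(f"{start}-{end % 100:02d}")
        else:
            parts.append(f"{start}-{end}")
    return ",".join(parts)


def add_copyright_year(spec, year):
    """Return the year list with `year` added, merged into a range if adjacent."""
    return compress_years(expand_years(spec) | {year})
```

With the list from the header above, `add_copyright_year("1988-91,1996-99,2001,2003,2005-09", 2010)` gives `"1988-91,1996-99,2001,2003,2005-10"`; adding a year that is already covered leaves the list unchanged.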

Where the version control system does not intrinsically support the keywords, it is much harder to decide how to handle such information. The first rule is "don't fight against your VCS". War story - we tried fighting the VCS and it didn't work. Once upon a long time ago (a decade and a half ago), the company switched from SCCS to Atria ClearCase (now IBM Rational ClearCase). ClearCase does not support the embedding of version information into source files. It does support checkin triggers. We wrote and deployed a trigger to ensure that the ClearCase version numbers were embedded in the files like the SCCS version numbers had been before. The checkin trigger worked fine; we could look at the file, inside or outside the view, and see which version it belonged to. But the changes in the version numbers broke the merging code - all merges became manual merges, even if the only conflict was in the version number. This was because we were fighting the VCS and it wasn't willing to let us win. So, we ended up abandoning the checkin trigger.
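
To illustrate why stamped version numbers trip up a merge, here is a Python sketch that collapses expanded RCS-style keywords before comparing two versions of a file. This is only an illustration of the problem, not what our trigger did and not a ClearCase feature; the names `collapse_keywords` and `differs_only_in_keywords` are made up for the example.

```python
import re

# Matches an expanded keyword such as "$Revision: 9.14 $" on a single line.
EXPANDED_KEYWORD = re.compile(r"\$([A-Za-z]+):[^$\n]*\$")


def collapse_keywords(text):
    """Replace '$Keyword: value $' with the bare '$Keyword$' form."""
    return EXPANDED_KEYWORD.sub(r"$\1$", text)


def differs_only_in_keywords(old, new):
    """True if the two texts differ, but only inside expanded keywords."""
    return old != new and collapse_keywords(old) == collapse_keywords(new)
```

A merge tool that compares the raw text sees two versions like this as conflicting on every stamped line, even though nothing of substance has changed.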

I am still trying to work out how to handle version stamping source files with a modern DVCS such as git. It looks like I'm going to have to rework my entire release system - probably as a hybrid that covers both SCCS and RCS (as now, though the SCCS part hasn't been used for most of a decade) as well as git.

One theory, used by many people, is that you should avoid building metadata into source files. I have yet to be wholly convinced that this is good - I think it is helpful to see the origin of a source file even when it is divorced from its original context (has been renamed, removed from its original package, modified, and included in some new product). I may yet have to live with this viewpoint when using a DVCS.

My theory, used by me but not necessarily by anyone else, is that metadata should be in files because they are not always used in their original context and the metadata can survive and help identify its origins, even decades later. So, when I'm building a source code release, I use my release software to automatically edit in the product information, using the :PRODUCT: etc notations to mark what should be edited. You can see this at work if you download any of the packages I've contributed to the IIUG (International Informix Users Group) web site. I'd recommend SQLCMD as probably the biggest and most recent package - though it has been available there since the mid-90s and version 23 or so (currently on version 86.00).
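
The release-time editing of notations like :PRODUCT: can be sketched in a few lines of Python. This is not my release software, just a minimal stand-in: the function name `expand_meta_keywords`, the regular expression, and the example product string are all assumptions made for illustration.

```python
import re

# A meta-keyword is assumed to be an upper-case word wrapped in colons, e.g. :PRODUCT:
META_KEYWORD = re.compile(r":([A-Z]+):")


def expand_meta_keywords(text, values):
    """Replace each :NAME: that has an entry in `values`; leave the rest alone."""
    def substitute(match):
        return values.get(match.group(1), match.group(0))
    return META_KEYWORD.sub(substitute, text)
```

For example, `expand_meta_keywords("@(#)Product:        :PRODUCT:", {"PRODUCT": "Example Product 1.0"})` rewrites only the placeholder and keeps the rest of the line intact; unknown notations are left as they are so that a later pass can still find them.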

One of the biggest problems I face with git is that almost all my programs use the code in stderr.c and stderr.h. However, it is not yet clear to me how I go about incorporating the same code in each of the many products that use it, without going in for multiple maintenance of it. This is far from being the only pair of source files which I use unchanged in many different products. But I don't want to build the entire library with each product - the library would be bigger than many of the products, and any given product only uses a few of the files from the library. ...Ah well, one day, enlightenment will come...

I disagree with the comments that the name of the file is not metadata worth keeping in the file. I think it is worth keeping - because the name can change when the contents don't, and it is easier to see where it came from if the metadata is there. Of course, the malicious can tamper with (or remove) the metadata - but they often don't.

Jonathan Leffler
+1  A: 

I believe nearly all of these fields are completely redundant, and add no value whatsoever to the descriptiveness of the file:

  1. The filename is possibly useful only during recovery from a disk crash? The preprocessor automatically includes it when it's useful (if you're trying to find a nasty header bug, for example).

  2. The purpose of a .cpp file will almost always be to contain the implementation of one or more .h files containing one or more class declarations. A comment describing the purpose of each class is better located in the header file just prior to the class declaration, where it is also more likely to be updated should the class's interface change.

  3. The author field will never be accurate, as soon as a single line change by a third party is made; use a revision control system's log / annotate / blame commands to track this in a reliable manner instead.

  4. "Modified by", "Created", and "RCS-ID" are similarly useless and prone to falling out of date. An RCS-ID can only identify the version of a file at the last commit; it cannot account for any unversioned changes that have been made since. If you care about a file's exact version, use something robust like an MD5 sum instead.
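
For the MD5 suggestion in point 4, a minimal Python sketch using the standard `hashlib` module could look like this (the function name `md5_of_file` and the chunk size are arbitrary choices of mine):

```python
import hashlib


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of the file's current contents on disk."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Unlike an embedded revision stamp, the digest changes as soon as the file does, committed or not.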

Which leaves the copyright notice and potential licence declaration, which in some jurisdictions are supposedly required:

/* Copyright 2009, Your Company Name. All rights reserved. */

Inserting this can easily be achieved with an editor macro.

David Wilson
I doubt that information like the initial author is worthless. I have even seen source files with a well-maintained history of changes and authors at the beginning, and that is very worthwhile. In my experience, developers hardly ever inspect the history in a VCS. And there are still quite a few projects without any VCS.
RED SOFT ADAIR
A: 

Thanks everyone for the responses. They were very helpful! They made me think about what kind of information I want to include in my source files, and how.

Andrew