We are currently planning a quality improvement exercise, and I would like to target the most commonly edited files in our ClearCase VOBs. Since we have just been through a bug-fixing phase, the most commonly edited files should give a good indication of where the most bug-prone code is, and therefore where quality improvement is most needed.

Does anyone know of a way to obtain a top-100 list of the most edited files? Preferably this would cover edits happening on multiple branches.

+1  A: 

(See the other answer for the more complicated case: multiple branches.)

First, use a dynamic view: it is easier and quicker to update its content and to adjust its config spec rules.

If your bug fixing was done in a branch, starting from a given label, set up a dynamic view with the following config spec:

element * .../MY_BRANCH/LATEST
element * MY_STARTING_LABEL
element * /main/LATEST

Then list all files with their current version number (closely related to the number of edits):

ct find . -type f -exec "cleartool desc -fmt """%Ln\t\t%En\n""" """%CLEARCASE_PN%""""|sort /R|head -100

This is the Windows syntax (note the triple double-quotes around %CLEARCASE_PN%, which accommodate spaces within file names).

The 'head' command comes from the GnuWin32 library.
The most edited files are at the top of the list. Be aware that the built-in Windows sort /R compares lines as text rather than as numbers, so a version 9 sorts above a version 10.

A Unix version would be:

$ ct find . -type f -exec 'cleartool desc -fmt "%Ln\t\t%En\n" "$CLEARCASE_PN"' | sort -rn | head -100

The most edited files would be at the top.

Do not forget that, for metrics, the raw numbers are not enough; trends are important too.
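
To look at trends rather than a single snapshot, one option is to save the listing at two different dates and compare them. The sketch below assumes two such files whose lines start with a number followed by whitespace and the file name (the format printed by the commands above). The function name trend and the sample paths are made up for illustration.

```shell
#!/bin/sh
# trend OLD NEW: for each file listed in NEW, print "<increase><TAB><file>",
# largest increase first. Files missing from OLD count as starting at 0.
trend() {
    awk '
        # Remember the leading number, then strip it so $0 is the file name
        # (which may contain spaces).
        { n = $1; sub(/^[ \t]*[0-9]+[ \t]+/, "") }
        NR == FNR { old[$0] = n; next }          # first file: earlier snapshot
        { printf "%d\t%s\n", n - old[$0], $0 }   # second file: later snapshot
    ' "$1" "$2" | sort -rn
}

# Hypothetical snapshots, for illustration only.
old=$(mktemp); new=$(mktemp)
printf '3\t\t/vobs/app/src/a.c\n2\t\t/vobs/app/src/b.c\n' > "$old"
printf '4\t\t/vobs/app/src/a.c\n7\t\t/vobs/app/src/b.c\n1\t\t/vobs/app/src/new file.c\n' > "$new"
trend "$old" "$new"
rm -f "$old" "$new"
```

Files that grow quickly between two snapshots are arguably better candidates than files with a high but stable number. Files present only in the earlier snapshot are not listed by this sketch.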

VonC
I really like your answer but as with most projects dev has not all happened on the one branch so the version numbers don't necessarily mean most edited. Is there a way to get number of check-ins across all branches?
mR_fr0g
+1  A: 

(The previous answer was for a simpler case: single branch)

Since "most projects dev has not all happened on the one branch so the version numbers don't necessarily mean most edited", a "way to get number of check-ins across all branches" would be:

  • search all versions created since the date of the last bug fixing phase,
  • sort them by file,
  • then by occurrence.

Something along the lines of:

C:\Prog\cc\test\test>ct find -all -type f -ver "created_since(16-Oct-2009)" -exec "cleartool descr -fmt """%En~%Sn\n""" """%CLEARCASE_XPN%"""" | grep -v "\\0" | awk -F ~ "{print $1}" | sort | uniq -c | sort /R | head -100

Or, for Unix syntax:

$ ct find -all -type f -ver 'created_since(16-Oct-2009)' -exec 'cleartool descr -fmt "%En~%Sn\n" "$CLEARCASE_XPN"' | grep -v '/0$' | awk -F '~' '{print $1}' | sort | uniq -c | sort -rn | head -100

  • Replace the date with that of the label marking the start of your bug-fixing phase.
  • Again, note the double quotes around %CLEARCASE_XPN% ($CLEARCASE_XPN on Unix) to accommodate spaces within file names.
  • Here, %CLEARCASE_XPN% is used rather than %CLEARCASE_PN% because we need every version.
  • grep -v "\\0" (grep -v '/0$' on Unix, anchored so that paths merely containing "/0" are kept) excludes version 0 of each branch (/main/0, /main/myBranch/0, ...).
  • awk -F ~ "{print $1}" (with the ~ quoted on Unix, where the shell would otherwise expand it to the home directory) prints only the first part of each line:
    C:\Prog\cc\test\test\a.txt~\main\mybranch\2 becomes C:\Prog\cc\test\test\a.txt
  • From there, the counting and sorting can begin:
    • sort to make sure every identical line is grouped
    • uniq -c to remove duplicate lines and precede each remaining line with a count of said duplicates
    • sort -rn (or sort /R for Windows) for having the most edited files at the top
    • head -100 for keeping only the 100 most edited files.

Again, GnuWin32 will come in handy for the Windows version of the one-liner.
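
The counting stage can be tried without ClearCase. Here is a minimal sketch, assuming a few made-up element~version lines in place of real cleartool output. The paths and the function name count_edits are hypothetical.

```shell
#!/bin/sh
# Hypothetical sample of "element~version" lines, standing in for real
# cleartool output so the counting stage can be tried without a VOB.
sample='/vobs/app/src/a.c~/main/1
/vobs/app/src/a.c~/main/bugfix/0
/vobs/app/src/a.c~/main/bugfix/1
/vobs/app/src/a.c~/main/bugfix/2
/vobs/app/src/b.c~/main/0
/vobs/app/src/b.c~/main/1
/vobs/app/src/my file.c~/main/bugfix/1
/vobs/app/src/my file.c~/main/bugfix/2'

count_edits() {
    # Drop version 0 of every branch, keep only the element name,
    # group identical names, count them, and list the highest counts first.
    grep -v '/0$' | awk -F '~' '{print $1}' | sort | uniq -c | sort -rn | head -100
}

printf '%s\n' "$sample" | count_edits
```

With this sample, a.c comes first with 3 versions, then "my file.c" with 2 and b.c with 1. The version 0 entries are not counted.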

VonC