Alright, this is what I came up with:
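#!/bin/bash
set -e
# Read the listing line by line so paths containing spaces stay intact
svn ls -R | while IFS= read -r file; do
    # Skip directories (and anything missing from the working copy)
    if [ -f "$file" ]; then
        # Most frequent author in the blame output; </dev/null keeps svn off the loop's stdin
        owner=$(svn blame "$file" </dev/null | tr -s " " " " | cut -d" " -f3 | sort | uniq -c | sort -nr | head -1 | tr -s " " " " | cut -d" " -f3)
        if [ -n "$owner" ]; then
            echo "$file" "$owner"
        fi
    fi
done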
It uses svn ls -R to list every file in the repository; then, for each file, the svn blame output is examined:
tr -s " " " "
squeezes multiple spaces into one space
cut -d" " -f3
gets the third space-delimited field, which is the username
sort
sorts the output so all lines last edited by one user are together
uniq -c
collapses each run of identical lines into one and prefixes it with the number of times it appeared
sort -nr
sorts numerically, in reverse order (so that the username that appeared most is sorted first)
head -1
returns the first line
tr -s " " " " | cut -d" " -f3
same as before: squeezes spaces and returns the third field, which is now the username (the second field is the count added by uniq -c).
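To see the pipeline on its own, here is a minimal sketch that runs it over a few made-up lines in the usual svn blame layout (right-aligned revision, then author, then the line text). The names top_author and sample_blame, and the sample authors, are hypothetical and only there for illustration.

```shell
#!/bin/bash
# top_author: read svn-blame-style lines on stdin, print the most frequent author.
# Hypothetical helper; the pipeline itself is the one described above.
top_author() {
    tr -s " " " " | cut -d" " -f3 | sort | uniq -c | sort -nr | head -1 \
        | tr -s " " " " | cut -d" " -f3
}

# Made-up sample input: three lines by sally, one by harry.
sample_blame() {
    printf '%s\n' \
        '     3      sally first line' \
        '    12      harry second line' \
        '     3      sally third line' \
        '   104      sally fourth line'
}

sample_blame | top_author   # prints: sally
```

If two authors have the same number of lines, which one comes out on top depends on how sort orders the tied rows, so treat the result as approximate in that case.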
It'll take a while to run, but at the end you'll have a list of <filename> <most prevalent author> pairs.
Caveats:
- Error checking is not done to make sure the script is called from within an SVN working copy
- If called from deeper than the root of a WC, only files at that level and deeper will be considered
- As mentioned in the comments, you might want to take revision date into account (if the majority of checkins happened 10 years ago, you might want to discount them when determining the owner)
- Any working copy changes that aren't checked in won't be taken into account
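Two of these caveats could be addressed roughly as follows. This is only a sketch: require_working_copy and top_recent_author are hypothetical helper names, and the date filter assumes the verbose layout printed by svn blame -v, where the third whitespace-separated field is the commit date in YYYY-MM-DD form.

```shell
#!/bin/bash
# require_working_copy: succeeds only when the current directory is inside
# an SVN working copy (svn info exits non-zero otherwise).
require_working_copy() {
    svn info >/dev/null 2>&1
}

# top_recent_author CUTOFF: read `svn blame -v` lines on stdin and print the
# author with the most lines dated on or after CUTOFF (YYYY-MM-DD).
# Assumed input layout: "rev author YYYY-MM-DD HH:MM:SS +ZZZZ (...) text".
top_recent_author() {
    awk -v cutoff="$1" '
        # Appending "" forces a string comparison; ISO dates sort correctly as strings
        ($3 "") >= (cutoff "") { count[$2]++ }
        END { for (a in count) print count[a], a }
    ' | sort -nr | head -1 | cut -d" " -f2
}

# Possible usage inside the script (the cutoff date is an arbitrary example):
#   require_working_copy || { echo "not inside an SVN working copy" >&2; exit 1; }
#   owner=$(svn blame -v "$file" </dev/null | top_recent_author 2015-01-01)
```

Lines older than the cutoff are simply ignored here; a weighting scheme that discounts old lines instead of dropping them would need a different count in the awk step.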