Hi,

We have a fairly large software project (280 binaries) under Linux, and its code structure is currently very dispersed: one can't work out which code in the source tree is valid (builds into deployable binaries) and which is deprecated. The Makefiles are good, though. We need to calculate the C/C++ SLOC for the entire project.

Here's my question: can I find out how many SLOC GCC has compiled? Or can I get this information from a binary (probably from its debug info)? Or can I find out which source files a binary was compiled from and use that to calculate SLOC?

Thanks Bogdan

+1  A: 

It depends on what you mean by the SLOC that GCC has compiled. If you mean tracking the source files from your project that GCC used, then you'd probably use the dependency-tracking options, which list the source files and headers for each compilation. That's -M and its relatives (for example -MM, which leaves out headers found in system directories, and -MD, which writes the dependency file as a side effect of a normal compile). Beware of including system-provided headers. A technique I sometimes use is to replace the standard C compiler with an appropriate variation: for example, to ensure a 64-bit compilation, I use 'CC="gcc -m64"' to guarantee that when the C compiler is used, it compiles in 64-bit mode. Obviously, with a list of files, you can use wc to count the lines, after using 'sort -u' to eliminate duplicated headers.

One obvious gotcha is if you find that everything is included with relative path names - then you have to work out more carefully where each file is.
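As a rough illustration of turning dependency output into a file list, here is a minimal Python sketch. It assumes the make-style rules printed by the -M family have been captured as text, that file names contain no spaces, and that system headers live under /usr; the function names and the base_dir parameter are made up for this example.

```python
import os


def parse_dependency_output(text, base_dir="."):
    """Return the sorted, de-duplicated prerequisites named in make-style rules."""
    # The -M options wrap long rules with a backslash at the end of the line.
    joined = text.replace("\\\n", " ")
    files = set()
    for line in joined.splitlines():
        if ":" not in line:
            continue
        _target, _, prerequisites = line.partition(":")
        for name in prerequisites.split():
            # Relative names are resolved against the directory the compiler
            # ran in; absolute names pass through os.path.join unchanged.
            files.add(os.path.normpath(os.path.join(base_dir, name)))
    return sorted(files)


def drop_system_headers(paths, system_prefixes=("/usr/",)):
    """Filter out anything under the assumed system prefixes."""
    return [p for p in paths if not p.startswith(tuple(system_prefixes))]
```

Using a set plays the role of 'sort -u', and passing the directory each compile ran in as base_dir is one way to deal with relative path names.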

If you have some other definition of SLOC, then you will need to specify what you have in mind. Sometimes, people are looking for non-blank, non-comment SLOC, for example - but you still need the list of source files, which I think the -M options will help you determine.
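For the non-blank, non-comment definition, a small counter could look like the sketch below. It is only an approximation under stated assumptions: it understands // and /* */ comments and ordinary string and character literals, but not raw string literals, line continuations or trigraphs. The name count_sloc is invented for the example.

```python
def count_sloc(source):
    """Count lines that hold at least one non-whitespace character outside comments."""
    count = 0
    in_block_comment = False
    for line in source.splitlines():
        has_code = False
        quote = None  # delimiter of the string/char literal we are inside, if any
        i = 0
        while i < len(line):
            ch = line[i]
            pair = line[i:i + 2]
            if in_block_comment:
                if pair == "*/":
                    in_block_comment = False
                    i += 2
                else:
                    i += 1
            elif quote:
                if ch == "\\":
                    i += 2  # skip the escaped character
                else:
                    if ch == quote:
                        quote = None
                    i += 1
            elif pair == "//":
                break  # the rest of the line is a comment
            elif pair == "/*":
                in_block_comment = True
                i += 2
            else:
                if ch in "\"'":
                    quote = ch
                    has_code = True
                elif not ch.isspace():
                    has_code = True
                i += 1
        count += has_code
    return count
```

Feeding it the contents of each file from the dependency list and summing the results gives a project-wide figure.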

Jonathan Leffler
Hi, I tried this approach on a small "hello world" and it seems to work. It pulls in a lot of headers via iostream, but they can all be filtered out by their /usr prefix. It's hard to apply to the whole project, though, since I would have to patch a whole bunch of makefiles. Still, this is good advice. Thanks
Bogdan
@Bogdan - maybe one of the changes needed is to allow for centralized changes to the makefiles, for example, by using 'include ${TOPDIR}/config/configuration.mk', a makefile that defines project-wide options, etc. My makefiles usually reserve UFLAGS for user options on the command line for CFLAGS.
Jonathan Leffler
+1  A: 

What you can do is run a preprocessor-only compilation using gcc's -E flag: the output is the actual code being compiled. Do a simple line count (wc -l) on it, or something more advanced.

It might include extra code from macros, etc., but it makes a good basis for comparison, especially against a previous version of your code.

Roalt
I tried this approach on a small "hello world" .cpp. After executing gcc -E a.cpp | wc -l I got 29494 lines. Most of them are iostream internals, and it would be a piece of work to get rid of that code. Thank you for your answer. I'm sure it would be useful together with comparison (diff) software.
Bogdan
+1  A: 

The first thing you want is an accurate list of what you actually compiled. You can achieve this by using a wrapper script instead of gcc.

The second thing you want is the list of files that were used to build those. For this, consult the dependency list (you said the Makefiles were correct). It seems you'd need make --print-data-base for that.

Then, sort and deduplicate the list of files, and throw out system headers. For each remaining file, determine the SLOC count using your preferred tool.
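A wrapper of that kind might be sketched in Python as follows. Everything specific here is an assumption made for illustration: the SLOC_COMPILE_LOG environment variable, the default log path, the list of source suffixes, and /usr/bin/gcc as the location of the real compiler (it should be an absolute path, so that a wrapper named gcc does not end up calling itself).

```python
import os

# Suffixes treated as C/C++ sources; an assumption, adjust to the project.
SOURCE_SUFFIXES = (".c", ".cc", ".cpp", ".cxx", ".C")


def source_arguments(argv):
    """Pick out the arguments that look like C/C++ source files."""
    return [a for a in argv
            if not a.startswith("-") and a.endswith(SOURCE_SUFFIXES)]


def log_sources(argv, log_path, cwd=None):
    """Append the absolute path of every source argument to the log file."""
    cwd = cwd or os.getcwd()
    sources = [os.path.normpath(os.path.join(cwd, a))
               for a in source_arguments(argv)]
    with open(log_path, "a", encoding="utf-8") as log:
        for path in sources:
            log.write(path + "\n")
    return sources


def wrapper_main(argv, real_compiler="/usr/bin/gcc"):
    """Log the sources on this command line, then hand over to the real compiler."""
    log_path = os.environ.get("SLOC_COMPILE_LOG", "/tmp/compiled-sources.log")
    log_sources(argv[1:], log_path)
    # Replace this process with the real compiler, keeping all arguments.
    os.execv(real_compiler, [real_compiler] + argv[1:])
```

To use it as a stand-in, a script placed ahead of the real gcc (or named in CC) would call wrapper_main(sys.argv); after a full build, the log can be sorted and deduplicated to get the list of compiled sources.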

MSalters
Hi, the approach of substituting a fake gcc for the original one to find out which files are compiled seems to work best. Even though I have to handle the included .h files on my own, it takes the least time to implement and is the most efficient approach. I'm going to use it to get a precise metric. Thanks
Bogdan
A: 

Hi, I've used the following approach to get a quick-and-dirty metric value in 2 hours. Even though the precision was far from ideal, it was enough to make the decision.

We took around 40 KB of code and calculated the SLOC for it using gcov. Then we calculated a "source lines per byte" ratio and used it, together with the size of the C source code for the whole project, to get an approximate SLOC number.

It worked out just fine for our needs.
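The arithmetic behind that estimate can be sketched in a few lines of Python. The function names and the suffix list are invented for the example, and the numbers in the usage note are hypothetical, not the figures from our project.

```python
import os


def total_source_bytes(root, suffixes=(".c", ".cc", ".cpp", ".h", ".hpp")):
    """Sum the sizes of all files under root whose names end in a source suffix."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(tuple(suffixes)):
                total += os.path.getsize(os.path.join(dirpath, name))
    return total


def estimate_sloc(sample_sloc, sample_bytes, total_bytes):
    """Scale a measured sample up to the whole tree via source lines per byte."""
    if sample_bytes <= 0:
        raise ValueError("sample_bytes must be positive")
    lines_per_byte = sample_sloc / sample_bytes
    return round(lines_per_byte * total_bytes)
```

For instance, with a hypothetical sample of 1200 SLOC in 40 KB and a tree of 10 MB, estimate_sloc(1200, 40 * 1024, 10 * 1024 * 1024) scales the sample ratio to the full size. The estimate is only as good as the assumption that the sample is representative of the rest of the code.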

Thanks

Bogdan
A: 

You may want to try Resource Standard Metrics, as it calculates effective lines of code, which excludes standalone braces and the like; these are a matter of programmer style and artificially inflate SLOC counts by 10 to 33%. Ask them for a free timed license to give it a try.

Their web page is http://msquaredtechnologies.com
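To get a feel for how much standalone braces can inflate a count, here is a small Python sketch. It is not Resource Standard Metrics' algorithm, just an illustration of the idea: it assumes a line is "standalone" when it consists only of braces and parentheses, and it does not strip comments. The function name and the structural_chars parameter are made up.

```python
def count_effective_lines(source, structural_chars="{}()"):
    """Return (non_blank_lines, effective_lines) for a piece of source text."""
    non_blank = 0
    effective = 0
    for line in source.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        non_blank += 1
        # A line is effective if it has anything besides structural characters.
        if any(ch not in structural_chars and not ch.isspace() for ch in stripped):
            effective += 1
    return non_blank, effective
```

Comparing the two numbers for the same file shows how much of the raw count comes from brace placement alone.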