views:

292

answers:

2

Hi,

I'm working on a small academic research about extremely long and complicated functions in the Linux kernel. I'm trying to figure out if there is a good reason to write 600 or 800 lines-long functions.

For that purpose, I would like to find a tool that can extract a function from a .c file, so I can run some automated tests on the function.

For example, If I have the function cifs_parse_mount_options() within the file connect.c, I'm seeking a solution that would roughly work like:

extract /fs/cifs/connect.c cifs_parse_mount_options

and return the 523 lines of code(!) of the function, from the opening braces to the closing braces.

Of course, any way of manipulating existing software packages like gcc to do that, would be most helpful too.

Thanks,

Udi

EDIT : The answers to Regex to pull out C function prototype declarations? convinced me that matching function declaration by regex is far from trivial.

+2  A: 

Why don't you write a small PERL/PHP/Python script or even a small C++,Java or C# program that does that?

I don't know of any already-made tools to do that but writing the code to parse out the text file and extract a function body from a C++ code file should not take more than 20 lines of code.. The only difficult part will be locating the beginning of the function and that should be a relatively simple task using RegEx. After that, all you need is to iterate through the rest of the file keeping track of opening and closing curly braces and when you reach the function body closing brace you're done.

Miky Dinescu
That was my first idea, but comments make things a little more complicated.However, If I am to write such a script. I guess I'll publish it - it might be useful to others.
Adam Matan
I agree.. you probably need to take comments into consideration just in case they might contain non-matching curly braces inside. Still, it shouldn't be too difficult to code!
Miky Dinescu
I have to strip them out, get the location of the end of the function, and then extract the code with the comments. Another issue is the function signature on top - it can be more than one line long.As I mentioned before, it's not a big deal - but I'd rather focus on my research and use a ready-made tool. If that won't be possible, I'll Python the problem!
Adam Matan
Seriously, writing this script should take less than an hour. You have probably already wasted more time looking for a tool to do it than doing it yourself.
muusbolla
You should run the preprocessor (e.g. by `gcc -E`) before you parse the source. After preprocessing it should be easy.
frunsi
A: 

indent -kr code -o code.out

awk -f split.awk code.out

you have to adapt a little bit split.awk wich is somewhat specific to my code and refactoring needs (for example y have so struct who are not typedefs

And I'm sure you can make a nicer script :-)

--
BEGIN   { line=0; FS="";
    out=ARGV[ARGC-1]  ".out";
    var=ARGV[ARGC-1]  ".var";
    ext=ARGV[ARGC-1]  ".ext";
    def=ARGV[ARGC-1]  ".def";
    inc=ARGV[ARGC-1]  ".inc";
    typ=ARGV[ARGC-1]  ".typ";
    system ( rm " " -f " " out " " var " " ext " " def " " inc " " typ );
    }
/^[     ]*\/\/.*/   { print "comment :" $0 "\n"; print $0 >> out ; next ;}
/^#define.*/        { print "define :" $0 ; print $0 >>def ; next;}
/^#include.*/       { print "define :" $0 ; print $0 >>inc ; next;}
/^typedef.*{$/      { print "typedef var :" $0 "\n"; decl="typedef";print $0 >> typ;infile="typ";next;}
/^extern.*$/        { print "extern :" $0 "\n"; print $0 >> ext;infile="ext";next;}
/^[^    }].*{$/     { print "init var :" $0 "\n";decl="var";print $0 >> var; infile="vars";
                print $0;
                fout=gensub("^([^    \\*])*[    ]*([a-zA-A0-9_]*)\\[.*","\\2","g") ".vars";
                     print "var decl : " $0 "in file " fout;
                     print $0 >fout;
                next;
                        }
/^[^    }].*)$/     { print "func  :" $0 "\n";decl="func"; infile="func";
                print $0;
                fout=gensub("^.*[    \\*]([a-zA-A0-9_]*)[   ]*\\(.*","\\1","g") ".func";
                     print "function : " $0 "in file " fout;
                     print $0 >fout;
                next;
            }
/^}[    ]*$/        { print "end of " decl ":" $0 "\n"; 
                if(infile=="typ") {
                    print $0 >> typ;
                }else if (infile=="ext"){
                    print $0 >> ext;
                }else if (infile=="var") {
                    print $0 >> var;
                }else if ((infile=="func")||(infile=="vars")) {
                    print $0 >> fout; 
                    fflush (fout);
                    close (fout);
                }else if (infile=="def") {
                    print $0 >> def;
                }else if (infile=="inc"){
                    print $0 >> inc;
                }else print $0 >> out;
                next;
            }
/^[a-zA-Z_]/        { print "extern :" $0 "\n"; print $0 >> var;infile="var";next;}
            { print "other :" $0 "\n" ; 
                if(infile=="typ") {
                    print $0 >> typ;
                }else if (infile=="ext"){
                    print $0 >> ext;
                }else if (infile=="var") {
                    print $0 >> var;
                }else if ((infile=="func")||(infile=="vars")){
                    print $0 >> fout;
                }else if (infile=="def") {
                    print $0 >> def;
                }else if (infile=="inc"){
                    print $0 >> inc;
                }else print $0 >> out;
               next;
               }
Patrick Sinz