I want a Python script that prints a list of all the functions defined in a C/C++ file.

For example, abc.c defines two functions:

void func1() { }
int func2(int i) { printf("%d", i); return 1; }

I just want to search the file (abc.c) and print the names of all the functions defined in it (names only). For the example above, the Python script should print func1 and func2.

+2  A: 

ANTLR is your tool.

Tzury Bar Yochay
+3  A: 

I would suggest using the PLY lex/yacc tool. There's a prebuilt C parser, and the parser itself is quite fast. Once you have the file parsed, it shouldn't be too hard to find all of the functions.

http://www.dabeaz.com/ply/

pavpanchekha
+1  A: 

To do this reliably, you'd need to parse the C or C++ code, and then grab the function definitions from the AST the parser produces.

C is fairly easy to parse. As pavpanchekha mentions, PLY comes with a C parser, and it has been used to build the following relevant projects:

Parsing C++ code is more complicated. The answers to "Is there a good Python library that can parse C++" should be of help:

C++ is notoriously hard to parse. Most people who try to do this properly end up taking apart a compiler. In fact this is (in part) why LLVM started: Apple needed a way they could parse C++ for use in XCode that matched the way the compiler parsed it.

That's why there are projects like GCC-XML, whose output you could process with a Python XML library.

Finally, if your code doesn't need to be robust at all, you could run the code through a code reformatter, like indent (for C code), to even things out, then use regular expressions to match the function definitions. Yes, this is a bad, hacky, error-prone idea, and you'll probably find function definitions in multiline comments and such, but it might work well enough.
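A rough sketch of that regex approach, using only the standard library. Instead of running indent first, this version strips comments and string literals in Python before matching, which is a substitution on my part. The names function_names, NOISE, CANDIDATE and KEYWORDS are made up for illustration. It assumes plain C with simple parameter lists; macros, function-pointer parameters and most C++ constructs will fool it.

```python
import re

# Keywords that can be followed by "(...) {" but are not function names.
KEYWORDS = {"if", "for", "while", "switch", "catch"}

# Comments and literals, blanked out so their contents are never matched.
NOISE = re.compile(
    "|".join([
        r"//[^\n]*",              # line comment
        r"/\*.*?\*/",             # block comment
        r'"(?:\\.|[^"\\])*"',     # string literal
        r"'(?:\\.|[^'\\])*'",     # character literal
    ]),
    re.DOTALL,
)

# An identifier, a parameter list with no nested parentheses, then "{".
CANDIDATE = re.compile(r"\b([A-Za-z_]\w*)\s*\(([^()]*)\)\s*\{")


def function_names(source):
    """Return the names of things that look like function definitions."""
    cleaned = NOISE.sub(" ", source)
    names = []
    for match in CANDIDATE.finditer(cleaned):
        name = match.group(1)
        if name not in KEYWORDS:
            names.append(name)
    return names


sample = '''
/* void hidden() { } */
void func1() { }
int func2(int i) { if (i > 0) { printf("%d", i); } return 1; }
'''

print(", ".join(function_names(sample)))  # prints: func1, func2
```

To run it on a real file, read the file into a string and pass that to function_names. Treat the result as a best guess, not a guarantee.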

dbr
+1  A: 

This page, Parsing C++, mentions a couple of ANTLR grammars for C++. Since ANTLR has a Python API, this seems like a reasonable way to proceed.

Even though parsing may seem a lot more complex than regular expressions, this is a case where someone else has done almost all the work for you, and you just need to interface to it from Python.

Another alternative, where someone else has done the work of parsing C++ for you, is pygccxml, which leverages GCC-XML, an output extension for GCC that produces XML from the compiler's internal representation. Since Python has great XML support, you just need to extract the information of interest to you.
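To give an idea of the "extract the information" step, here is a minimal sketch using xml.etree.ElementTree from the standard library rather than pygccxml. SAMPLE_XML is a hand-written stand-in, not real tool output, and it assumes the GCC-XML layout as I remember it: Function elements carrying name and file attributes, and File elements mapping an id to a path. Check both against the XML your version actually produces. The helper name functions_in_file is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hand-written stand-in for GCC-XML output (assumed layout, heavily trimmed).
SAMPLE_XML = """<GCC_XML>
  <Function id="_3" name="func1" returns="_5" file="f0" line="1"/>
  <Function id="_4" name="func2" returns="_6" file="f0" line="2"/>
  <Function id="_7" name="printf" returns="_6" file="f1" line="361"/>
  <File id="f0" name="abc.c"/>
  <File id="f1" name="/usr/include/stdio.h"/>
</GCC_XML>
"""


def functions_in_file(xml_text, filename):
    """List names of Function elements whose file attribute points at filename."""
    root = ET.fromstring(xml_text)
    # Map the requested path to the ids the XML uses for it.
    file_ids = {f.get("id") for f in root.iter("File") if f.get("name") == filename}
    return [fn.get("name") for fn in root.iter("Function") if fn.get("file") in file_ids]


print(", ".join(functions_in_file(SAMPLE_XML, "abc.c")))  # prints: func1, func2
```

Filtering on the file keeps functions pulled in from headers (printf here) out of the list. Note that a prototype declared in abc.c would still show up, so this lists what the file declares, not strictly what it defines.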

Michael Dillon