views:

940

answers:

7
+5  Q: 

Parse C files

I am looking for a Windows based library which can be used for parsing a bunch of C files to list global and local variables. The global and local variables may be declared using typedef. The output (i.e. list of global and local variables) can then be used for post processing (e.g. replacing the variable names with a new name).

Is such a library available?

+7  A: 

Some of the methods available:

Alternatively, you could write your own parser with lex and yacc (or their kin, flex and bison), starting from a public lex specification and a yacc grammar.

luke
See also: http://code.google.com/p/pycparser/
Jouni K. Seppänen
Thanks, I'll add that to the list.
luke
+2  A: 

Possibly overkill, but there's a complete ANSI C parser written with Boost.Spirit: http://spirit.sourceforge.net/repository/applications/c.zip

Maybe you'll be able to model it to suit your needs.

Martin Cote
+1  A: 

I don't know if it offers a library, but have a look at CTAGS.
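As a rough sketch of how ctags output could be post-processed: the snippet below reads text in the standard `tags` file format and keeps only variable entries. It assumes Exuberant or Universal Ctags was run with local-variable tagging enabled (for example `ctags --c-kinds=+l`), so that kind `v` marks variable definitions and kind `l` marks locals. The function name `parse_tags` and the sample input are made up for illustration.

```python
# Kind letters as used by Exuberant/Universal Ctags for C (assumed here):
# "v" = variable definition (global or file scope), "l" = local variable.
KINDS = {"v": "global", "variable": "global", "l": "local", "local": "local"}


def parse_tags(text):
    """Return (name, file, scope) tuples for variable entries in tags-file text."""
    result = []
    for line in text.splitlines():
        # Skip blank lines and the "!_TAG_..." pseudo-tag header lines.
        if not line or line.startswith("!_TAG_"):
            continue
        # Extension fields follow the ';"' marker that ends the ex command.
        head, sep, ext = line.partition(';"\t')
        if not sep:
            continue
        parts = head.split("\t", 2)
        if len(parts) < 3:
            continue
        name, path = parts[0], parts[1]
        kind = ext.split("\t")[0]
        if kind.startswith("kind:"):
            kind = kind[len("kind:"):]
        if kind in KINDS:
            result.append((name, path, KINDS[kind]))
    return result


# Hypothetical sample input, hand-written to match the tags format.
SAMPLE = (
    '!_TAG_FILE_FORMAT\t2\t/extended format/\n'
    'counter\tdemo.c\t/^int counter;$/;"\tv\n'
    'main\tdemo.c\t/^int main(void)$/;"\tf\n'
    'i\tdemo.c\t/^    int i;$/;"\tl\n'
)

variables = parse_tags(SAMPLE)
```

Note that ctags works on unpreprocessed source with heuristics, so variables declared through macros may be missed.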

kgiannakakis
A: 

If it is plain C, lex and yacc are your friends, but you need to take the C preprocessor into account: source files with unexpanded macros typically do not comply with C syntax, so a parser written with the K&R grammar in mind will most likely fail.

If you decide to parse the output of the preprocessor, be prepared for your parser to fail on the "extensions" of your particular compiler, because the standard library headers very likely use them. At least this is the case with GCC.

I ran into this with GCC and finally decided to achieve my goal using a different approach. If you just need to change variable names, regular expressions will do fine, and there is no need to build a full parser, IMHO. If your goal is just to collect data, the ultimate source of data is debug information. There are ways to get debug information out of a binary: for ELF executables with DWARF there is libdwarf, and for Windows-land (COFF?) there should be something as well. You can probably use some existing tools to get debug information about a binary; again, I know nothing about Windows, so you will need to investigate.
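A minimal sketch of the regular-expression renaming idea, using Python's standard `re` module. The helper name `rename_identifier` is hypothetical. Word boundaries keep it from touching longer identifiers that merely contain the old name, but it knows nothing about C scoping, so it will also rewrite matches inside string literals, comments, and unrelated variables that share the name.

```python
import re


def rename_identifier(source, old, new):
    """Replace whole-word occurrences of `old` with `new` in C source text."""
    # \b anchors the match at identifier boundaries, so renaming "count"
    # leaves "counter" and "recount" untouched.
    pattern = re.compile(r"\b" + re.escape(old) + r"\b")
    # A function replacement avoids backslash processing in `new`.
    return pattern.sub(lambda match: new, source)


code = "int count; int counter = count + 1;"
renamed = rename_identifier(code, "count", "total")
```

For anything beyond a quick one-off rename, review the result by hand or with a diff, since the pattern cannot tell a global from a local of the same name.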

qrdl
A: 

I recently read about a Win32-based system that looked at the debugging information in COFF DLLs: http://www.drizzle.com/~scottb/gdc/fubi-paper.htm

Paul Harrington
+1  A: 

Parsing C is a lot harder than it looks when you take into account different dialects, preprocessor directives, the need for type information while parsing, etc. People who tell you "just use lex and yacc" have clearly not built a production C parser.

A tool that can do this is the Semantic Designs C front end: http://www.semdesigns.com/Products/FrontEnds/CFrontEnd.html It addresses all of the above issues.

When parsing completes, it has a complete, navigable symbol table with all identifiers and corresponding type information. Listing global and local variables would be trivial with this.

I'm the architect behind Semantic Designs.

Ira Baxter
A: 

Maybe GNU cflow? http://www.gnu.org/software/cflow/

Grzegorz Wierzowiecki