I need to know if it's possible to use a tool like ctags or cscope to find all the usages of a function but filter the results depending on the value of one of its parameters.

For example, let's assume we have a function void foo(int a, int b) that is used a thousand times throughout the source tree, and I need to check whether it's being called with a literal 0 as the second argument. I can almost do it with KScope, where it's possible to search references and then filter the text with a regexp, but that is problematic if the function call is split across several lines (as only the first one is listed) or if it contains comments.

Any idea?

A: 

Why can't you just use a regex that skips the first argument and any spaces/newlines around the arguments?

Something like foo[ \n\t]*\([^,]*,[ \n\t]*0[ \n\t]*\)

Just search the source, not a KScope output.

EDIT: Omitting comments is trickier, but still possible with a regex.
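As a rough illustration of searching the source itself rather than line-by-line tool output, here is a minimal Python sketch that applies a similar pattern to the whole text of a file, so calls split across lines are still found. The function name find_calls and the sample input are made up for the example; it does not handle comments, and a first argument that contains a comma (such as a nested call) defeats the pattern.

```python
import re

# Same idea as the regex above: "foo", "(", a first argument without a comma,
# a comma, then a literal 0 and ")". \s also matches newlines, so a call
# split over several lines matches when the whole file is searched at once.
CALL_WITH_ZERO = re.compile(r"\bfoo\s*\([^,]*,\s*0\s*\)")


def find_calls(source):
    """Return (line_number, matched_text) for each hit in one file's text."""
    hits = []
    for m in CALL_WITH_ZERO.finditer(source):
        line = source.count("\n", 0, m.start()) + 1  # 1-based line of "foo"
        hits.append((line, m.group(0)))
    return hits


if __name__ == "__main__":
    sample = "foo(a, 0);\nfoo(b,\n    0);\nfoo(c, 1);\n"
    for line, text in find_calls(sample):
        print(line, repr(text))
```

Reading each file fully into memory and calling find_calls on it sidesteps the line-by-line limitation of plain grep.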

qrdl
Oops: ``foo(bar(42, 0), 4)`` -- but it's a good idea
pmg
I was trying it with grep, but then I realised that grep matches line by line :-(
fortran
@pmg Right, with nested calls it gets more complicated :(
qrdl
+4  A: 

This absolutely sounds like a job for the coccinelle program:

What is Coccinelle? Coccinelle is a program matching and transformation engine which provides the language SmPL (Semantic Patch Language) for specifying desired matches and transformations in C code. Coccinelle was initially targeted towards performing collateral evolutions in Linux. Such evolutions comprise the changes that are needed in client code in response to evolutions in library APIs, and may include modifications such as renaming a function, adding a function argument whose value is somehow context-dependent, and reorganizing a data structure. Beyond collateral evolutions, Coccinelle is successfully used (by us and others) for finding and fixing bugs in systems code.

It uses semantic patches, i.e. patches in which the identifiers and operators are recognized. I have barely started learning it, but I think what you are looking for is approximately this:

@@
@@

*foo(..., 0)

to identify all calls to foo with 0 as a second argument. The extremely cool feature of coccinelle is that you can get the program to modify the source based on a semantic patch, for instance replacing 0 as a second argument to the function foo with SOME_CONSTANT.

@@
@@

foo(...,
-0
+SOME_CONSTANT
)
hlovdal
Does this work with macro calls, as OP requested? I'd expect Coccinelle to have to expand all the preprocessor directives thus all macro calls are gone.
Ira Baxter
It does work on macros, and does parse the code in its own way, not first expanding the preprocessor directives.
hlovdal
A: 

I don't know of a tool that will do this directly for you. While there are several source browsers, few of them are designed specifically to find calls to foo with particular argument values. And, again, tools like these will not cover the cases already discussed: multi-line calls, comments, etc.

The best thing to do, if this is something you need to do often, is to write a little program that reads N files and when it encounters the string foo captures all arguments from the initial '(' to the matching ')' including any nested '()' pairs and whitespace. Then you can check for the specific argument there or just print it out and grep the output for what you are after.
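A minimal Python sketch of such a program might look like the following. The names split_args and calls_with_arg are hypothetical, and the sketch assumes that comments and string literals never contain brackets or commas, which real code will violate now and then. It also reports declarations of foo as if they were calls.

```python
import re


def split_args(text, open_idx):
    """Split the argument list that starts at text[open_idx] == '('.

    Commas inside nested (), [] or {} do not split. Returns None when the
    matching ')' is missing. Comments and string literals are NOT handled.
    """
    args, depth, start = [], 0, open_idx + 1
    for i in range(open_idx, len(text)):
        ch = text[i]
        if ch in "([{":
            depth += 1
        elif ch in ")]}":
            depth -= 1
            if depth == 0:  # reached the ')' that closes the call
                args.append(text[start:i].strip())
                return args
        elif ch == "," and depth == 1:  # top-level comma ends an argument
            args.append(text[start:i].strip())
            start = i + 1
    return None


def calls_with_arg(text, name, index, value):
    """Yield (line, args) for calls to `name` whose argument `index` equals `value`."""
    for m in re.finditer(r"\b%s\s*\(" % re.escape(name), text):
        args = split_args(text, m.end() - 1)
        if args is not None and len(args) > index and args[index] == value:
            yield text.count("\n", 0, m.start()) + 1, args


if __name__ == "__main__":
    src = "foo(bar(42, 0), 4);\nfoo(x,\n    0);\nfoo(bar(1, 2), 0);\n"
    for line, args in calls_with_arg(src, "foo", 1, "0"):
        print(line, args)
```

Because the arguments are split only at top-level commas, the nested call foo(bar(42, 0), 4) is not reported, while the multi-line call is.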

ezpz
+3  A: 

hlovdal is right, but if you want to ensure that 0 is the second argument, you could use the following semantic match.

@@
expression e;
@@

*foo(e, 0)

To build a hyperlink list for Emacs Org mode, use the following:

@ r @
expression e;
position p;
@@

foo(e, 0@p)

@script:python@
p << r.p;
@@

cocci.print_main("",p)

Hope it helps.

Nico
I had already worked that out when I accepted the answer, but thanks anyway ;-)
fortran
A: 

The SD Source Code Search Engine (SCSE) can probably answer this for many languages. SCSE breaks each file apart into lexemes according to the lexical syntax of its programming language (identifiers, literal numbers, keywords, operators, string literals, comments) and indexes the file's tokens by lexical classification. Line breaks and whitespace are ignored. After indexing, you can run searches across a very large source base very quickly, see the set of potential hits, and single-click to view any particular one.

To do the search suggested, the following SCSE command will do:

I=foo '(' ... N=0

which means an identifier ("I") named "foo" (whether macro or function call) followed by the operator '(' within several tokens ("...") of a number ("N") token with value "0". (You can control what "several tokens" means; the default is 5.)

This will find calls to foo with whitespace/linebreaks and comments between it and the literal 0. Yes, it will find some false positives too, but in practice not very many.
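To make the token-window idea concrete, here is a rough Python sketch. It is not SCSE's implementation or query language; the tiny lexer, the function names tokens and near_zero_calls, and the sample input are all assumptions made for illustration. It drops comments and whitespace, then reports each identifier foo directly followed by '(' that has a number token 0 among the next few tokens.

```python
import re

# Very small C-like lexer. Comments are matched first so they can be dropped.
TOKEN = re.compile(r"""
    (?P<comment>/\*.*?\*/|//[^\n]*)
  | (?P<ident>[A-Za-z_]\w*)
  | (?P<number>\d[\w.]*)
  | (?P<string>"(?:\\.|[^"\\])*")
  | (?P<op>\S)
""", re.VERBOSE | re.DOTALL)


def tokens(source):
    """Return (kind, text, line) triples, skipping whitespace and comments."""
    out = []
    for m in TOKEN.finditer(source):
        if m.lastgroup != "comment":
            line = source.count("\n", 0, m.start()) + 1
            out.append((m.lastgroup, m.group(0), line))
    return out


def near_zero_calls(source, name="foo", window=5):
    """Lines where `name` then '(' is followed within `window` tokens by number 0."""
    toks = tokens(source)
    hits = []
    for i in range(len(toks) - 1):
        if toks[i][:2] == ("ident", name) and toks[i + 1][:2] == ("op", "("):
            following = toks[i + 2:i + 2 + window]
            if any(t[:2] == ("number", "0") for t in following):
                hits.append(toks[i][2])
    return hits


if __name__ == "__main__":
    src = ("foo(a, /* zero */ 0);\nfoo(b,\n 0);\n"
           "foo(c, 1);\nfoo(bar(42, 0), 4);\n")
    print(near_zero_calls(src))
```

In this sample the call foo(bar(42, 0), 4) is also reported, which is the kind of false positive a proximity search accepts in exchange for ignoring layout and comments.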

Setting up SCSE merely requires you to make lists of interesting files according to language category ("ls -r *.ext" does most of this), and then run the indexer. SCSE can be obtained with C, C#, C++, COBOL, Java, JavaScript, PHP, XML, and many other language lexeme extractors.

Ira Baxter
+2  A: 

@ Ira Baxter

Does this work with macro calls, as OP requested? I'd expect Coccinelle to have to expand all the preprocessor directives thus all macro calls are gone. – Ira Baxter

Coccinelle does not expand the preprocessor directives. Indeed, it is a source-to-source transformation tool that aims to preserve the coding style and conventions.

To properly handle your macros, you will thus have to give their definitions in a separate file with the "-macro_file" option.

Nico
How does it handle macros that span language syntax categories other than function calls, and still manage to understand the underlying program semantics? I've checked the Coccinelle web site, and I can't find any more discussion about this than what you said. Somehow it must be expanding the supplied macros?
Ira Baxter