Does anyone know of a tool that I can use to find explicit C-style casts in code? I am refactoring some C++ code and want to replace C-style casts wherever possible.

An example C-style cast would be:

Foo foo = (Foo) bar;

In contrast, examples of C++-style casts would be:

Foo foo = static_cast<Foo>(bar);
Foo foo = reinterpret_cast<Foo>(bar);
Foo foo = const_cast<Foo>(bar);
+8  A: 

The fact that such casts are so hard to search for is one of the reasons new-style casts were introduced in the first place. And if your code is working, this seems like a rather pointless bit of refactoring - I'd simply change them to new-style casts whenever I modified the surrounding code.

Having said that, the fact that you have C-style casts at all in C++ code would indicate problems with the code which should be fixed - I wouldn't just do a global substitution, even if that were possible.

anon
I have to conform to coding guidelines :)
waffleman
Mostly those guidelines are for newly written code. Old code will only be refactored when changes are needed, for example a fix. Refactoring just for the sake of refactoring costs time (and money) with no benefit.
PoweRoy
+1 fix them when you're working on/refactoring the surrounding code (and presumably retesting).
Mark B
@waffleman: As long as no one *sees* the casts, no one's going to object to them not conforming with your guidelines. So fix them when you (or someone else) find them. It seems a waste of time to actively search for them. Just don't write *more* of them, and fix the old ones as they come to your attention.
jalf
+3  A: 

Searching for the regular expression \)\w gives surprisingly good results.
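To make the heuristic concrete, here is a minimal sketch of the same pattern applied line by line with std::regex. The function name looks_like_cast is hypothetical and only for illustration; any grep-like tool would do the same job.

```cpp
#include <regex>
#include <string>

// Returns true if the line contains ')' immediately followed by a word
// character, i.e. the heuristic pattern \)\w.
// Note: this is a text heuristic, not a parser. It misses casts written
// with a space after the ')' and flags things like "if (x)return;".
bool looks_like_cast(const std::string& line) {
    static const std::regex pattern(R"re(\)\w)re");
    return std::regex_search(line, pattern);
}
```

With this sketch, "Foo foo = (Foo)bar;" is flagged, while "Foo foo = (Foo) bar;" and an ordinary call such as "call(x);" are not.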

sth
Exactly what I was thinking! http://xkcd.com/208/
ewall
That only works if you put the cast next to the variable... which you SHOULD, always, but not everyone does it.
Platinum Azure
A: 

One issue with C-style casts is that, since they rely on parentheses, which are heavily overloaded, they're not trivial to spot. Still, a regex such as this one (in Python syntax):

r'\(\s*\w+\s*\)'

is a start -- it matches a single identifier in parentheses with optional whitespace inside the parentheses. But of course that won't catch, e.g., (void*) casts. To match trailing asterisks as well, use:

r'\(\s*\w+[\s*]*\)'

You could also start with an optional const to broaden the net still further, etc, etc.

Once you have a good RE, many tools (from grep to vim, from awk to sed, plus perl, python, ruby, etc) let you apply it to identify all of its matches in your source.
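As a rough illustration, the second pattern can also be tried from C++ itself via std::regex, whose default ECMAScript grammar accepts the same expression. The helper name find_cast_candidates is an assumption made for this sketch, not part of any existing tool.

```cpp
#include <regex>
#include <string>
#include <vector>

// Collects every substring of `line` that matches \(\s*\w+[\s*]*\):
// one identifier in parentheses, optionally followed by whitespace
// and/or asterisks. Hits are candidates only; "(x)" in "f(x)" matches too.
std::vector<std::string> find_cast_candidates(const std::string& line) {
    static const std::regex pattern(R"re(\(\s*\w+[\s*]*\))re");
    std::vector<std::string> hits;
    for (std::sregex_iterator it(line.begin(), line.end(), pattern), end;
         it != end; ++it) {
        hits.push_back(it->str());
    }
    return hits;
}
```

On "void* p = (void *) q;" this returns the single hit "(void *)". It also reports "(x)" for "int n = f(x);", so every hit still needs a human look.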

Alex Martelli
A: 

If you use some kind of Hungarian-style notation (e.g. iInteger, pPointer etc.) then you can search for e.g. )p and ) p and so on.

It should be possible to find all those places in reasonable time even for a large code base.

frunsi
+1  A: 

A tool that can analyze C++ source code accurately and carry out automated custom changes (e.g., your cast replacement) is the DMS Software Reengineering Toolkit.

DMS has a full C++ parser, builds ASTs and symbol tables, and can thus navigate your code to reliably find C style casts. By using pattern-directed matches and rewrites, you can provide a set of rules that would convert all such C-style casts into your desired C++ equivalents.

DMS has been used to carry out massive automated C++ reengineering tasks for Boeing and General Dynamics, each involving thousands of files.

Ira Baxter
You should make clear upfront when you're plugging your own product.
Matthew Flaschen
+5  A: 

The Offload C++ compiler supports options to report as a compile time error all such casts, and to restrict the semantics of such casts to a safer equivalence with static_cast.

The relevant options are:

-cp_nocstylecasts   

The compiler will issue an error on all C-style casts. C-style casts in C++ code can potentially be unsafe and lead to undesired or undefined behaviour (for example casting pointers to unrelated struct/class types). This option is useful for refactoring to find all those casts and replace them with safer C++ casts such as static_cast.

-cp_c2staticcasts   

The compiler applies the more restricted semantics of C++ static_cast to C-style casts. Compiling code with this option switched on ensures that C-style casts are at least as safe as C++ static_casts.

This option is useful if existing code has a large number of C-style casts and refactoring each cast into C++ casts would be too much effort.
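To show what "more restricted semantics" means in practice, here is a small sketch in standard C++ (it does not use or assume anything about the Offload options themselves). Apple, Orange and the function names are made up for the example.

```cpp
struct Apple  { int weight; };
struct Orange { double juice; };

// Compiles: for unrelated pointer types a C-style cast silently falls
// back to the behaviour of reinterpret_cast.
Orange* as_orange_c_style(Apple* a) {
    return (Orange*) a;
}

// Does not compile if uncommented: static_cast rejects a conversion
// between pointers to unrelated class types.
// Orange* as_orange_static(Apple* a) {
//     return static_cast<Orange*>(a);
// }

// With new-style casts the dangerous intent has to be spelled out.
Orange* as_orange_explicit(Apple* a) {
    return reinterpret_cast<Orange*>(a);
}
```

The C-style version and the reinterpret_cast version produce the same pointer. The difference is that only the second one is easy to search for and obviously suspicious in review.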

grrussel
Oh nice! A real parser for once :)
Matthieu M.
+11  A: 

If you're using gcc/g++, just enable a warning for C-style casts:

g++ -Wold-style-cast ...
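A minimal sketch of what the warning catches. The file name casts.cpp and both function names are hypothetical.

```cpp
// casts.cpp (hypothetical file name used only for this sketch)
// Compile with: g++ -Wold-style-cast -c casts.cpp

int truncate_c_style(double value) {
    return (int) value;               // reported by -Wold-style-cast
}

int truncate_cpp_style(double value) {
    return static_cast<int>(value);   // not reported
}
```

Both functions behave identically; only the first is reported. If you would rather have the build fail than scroll through warnings, GCC's general -Werror=<warning> form should apply here too, i.e. -Werror=old-style-cast.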
jonner
+1 Use the best tool and the simplest approach for the job ... if that works.
Hamish Grubijan
A: 

I already answered once with a description of a tool that will find and change all the casts if you want it to.

If all you want to do is find such casts, there's another tool that will do this easily, and in fact is the extreme generalization of all the "regular expression" suggestions made here. That is the SD Source Code Search Engine. This tool enables one to search large code bases in terms of the language elements that make up each language. It provides a GUI allowing you to enter queries, see individual hits, and show the file text at the hit point with one mouse click. One more click and you can be in your editor [for many editors] on a file. The tool will also record a list of hits in context so you can revisit them later.

In your case, the following search engine query is likely to get most of the casts:

'(' I ')'  | '(' I ... '*' ')'

which means: find a sequence of tokens, the first being '(', the second being any identifier, and the third being ')', or a similar sequence involving something that ends in '*'.

You don't specify any whitespace management, as the tool understands the language whitespace rules; it will even ignore a comment in the middle of a cast and still match the above.

[I'm the CTO at the company that supplies this.]

Ira Baxter