I have just been given a new assignment which looks like it's going to be an interesting challenge.

The customer wants a code style checking tool developed for their internal (soon to be open sourced) programming language, which runs on the JVM. The language syntax is very Java-like.

The customer basically wants me to produce something like Checkstyle.

So my question is this: how would you approach this problem? Given a clean slate, what recommendations would you make to the customer?

I think I have 3 options:

  1. Write something from scratch. I'd prefer not to do this, as this sort of code analysis problem has been solved so many times that there must be a more "framework" or "platform" oriented approach.

  2. Fork an existing code style checking tool and modify the parsing to fit this new language.

  3. Extend or plug into an existing static code analysis tool (maybe write a plugin for Yasca?).

Maybe you would like to share your experiences in this area?

Thanks for reading

A: 

Take a look at FindBugs

webdestroya
Yep, FindBugs, PMD, Checkstyle, etc. The FindBugs docs state that it's extensible, but it looks like all of the magic is done at the bytecode level. So out of the box it could detect issues in the generated bytecode, but it might then be quite difficult to map those errors back to the source code of this new language.
tinny
+4  A: 

Such tools basically have to implement a compiler front-end for at least a subset of the language. The easiest starting point is often to adapt an existing compiler front-end, so you should definitely start by looking at your customer's compiler. If you are lucky, it will have a clean separation between the front-end and back-end, and you will be able to use it as-is and run your additional analysis on the AST or whatever IR the front-end produces.
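To make the "analyse the AST" idea concrete, here is a minimal sketch of a style check that walks a tree and reports naming violations. Everything in it is an assumption for illustration: the `Node` and `Violation` classes, the node kinds ("class", "method") and the two naming rules are hypothetical stand-ins, since the real node types would come from whatever the customer's compiler front-end exposes.

```java
import java.util.ArrayList;
import java.util.List;

public class AstStyleCheck {

    // Hypothetical AST node; a real front-end would supply its own node types.
    static class Node {
        final String kind;   // e.g. "class" or "method" (assumed node kinds)
        final String name;
        final int line;      // source line, kept so violations map back to source
        final List<Node> children = new ArrayList<>();

        Node(String kind, String name, int line) {
            this.kind = kind;
            this.name = name;
            this.line = line;
        }

        Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    // A single reported style problem.
    static class Violation {
        final int line;
        final String message;

        Violation(int line, String message) {
            this.line = line;
            this.message = message;
        }
    }

    // Entry point: collect violations for the whole tree.
    static List<Violation> check(Node root) {
        List<Violation> out = new ArrayList<>();
        walk(root, out);
        return out;
    }

    // Depth-first traversal applying two illustrative naming rules.
    private static void walk(Node n, List<Violation> out) {
        if (n.kind.equals("class") && !n.name.matches("[A-Z][a-zA-Z0-9]*")) {
            out.add(new Violation(n.line, "class name '" + n.name + "' should be UpperCamelCase"));
        }
        if (n.kind.equals("method") && !n.name.matches("[a-z][a-zA-Z0-9]*")) {
            out.add(new Violation(n.line, "method name '" + n.name + "' should be lowerCamelCase"));
        }
        for (Node child : n.children) {
            walk(child, out);
        }
    }

    public static void main(String[] args) {
        // Hand-built tree standing in for the front-end's output.
        Node root = new Node("class", "order_service", 1)
                .add(new Node("method", "Process", 3))
                .add(new Node("method", "validate", 9));
        for (Violation v : check(root)) {
            System.out.println(v.line + ": " + v.message);
        }
    }
}
```

Because each node carries its source line, violations point at the original source rather than at generated bytecode, which is the main advantage of working from the front-end.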

Christopher Barber
Yeah, or use a parser generator if this is not possible.
Longpoke
+1  A: 

You don't want to write all this stuff from scratch.

See the DMS Software Reengineering Toolkit. This has generalized compiler machinery for parsing, building ASTs, constructing symbol tables, constructing/traversing control flow and data flow graphs, and call trees.

DMS can be obtained with a full Java front end that builds ASTs, symbol tables and the flow analyses above. DMS handles language dialects with aplomb, so it should be as straightforward as practical to modify this front end to match your customer's Java-variant language and yet acquire all this analysis machinery.

Ira Baxter
+1  A: 

What about PMD? I've used PMD for years but never really drilled down into its inner workings before.

PMD can be extended by writing a custom language parser, which is done by providing implementations of the following within a JAR on the classpath:

net.sourceforge.pmd.cpd.Language
net.sourceforge.pmd.cpd.Tokenizer

http://pmd.sourceforge.net/cpd-parser-howto.html
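For a feel of what a tokenizer for a Java-like language has to do, here is a rough stand-alone sketch. It deliberately does not use PMD's classes, so it runs without the PMD JAR: `SimpleTokenizer`, `Token` and the token rules are hypothetical, and a real implementation would wrap logic like this behind the `net.sourceforge.pmd.cpd.Tokenizer` interface as described in the how-to linked above.

```java
import java.util.ArrayList;
import java.util.List;

public class SimpleTokenizer {

    // Hypothetical token type: the token text plus the line it came from.
    static class Token {
        final String text;
        final int line;

        Token(String text, int line) {
            this.text = text;
            this.line = line;
        }
    }

    // Splits source into identifiers/keywords, integer literals and
    // single-character symbols, skipping whitespace and tracking line numbers.
    static List<Token> tokenize(String source) {
        List<Token> tokens = new ArrayList<>();
        int line = 1;
        int i = 0;
        while (i < source.length()) {
            char c = source.charAt(i);
            if (c == '\n') {
                line++;
                i++;
            } else if (Character.isWhitespace(c)) {
                i++;
            } else if (Character.isJavaIdentifierStart(c)) {
                int start = i;
                while (i < source.length() && Character.isJavaIdentifierPart(source.charAt(i))) {
                    i++;
                }
                tokens.add(new Token(source.substring(start, i), line));
            } else if (Character.isDigit(c)) {
                int start = i;
                while (i < source.length() && Character.isDigit(source.charAt(i))) {
                    i++;
                }
                tokens.add(new Token(source.substring(start, i), line));
            } else {
                // Anything else is treated as a one-character symbol token.
                tokens.add(new Token(String.valueOf(c), line));
                i++;
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        for (Token t : tokenize("int x = 42;\nfoo(x);")) {
            System.out.println(t.line + ": " + t.text);
        }
    }
}
```

This ignores comments, string literals and multi-character operators, all of which a real tokenizer for the language would need to handle.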

Then by using the PMD rule designer I can define rules from the resulting AST.

The thing I like about PMD is that it's a broadly recognised code analysis tool in the Java space, so it has lots of third-party support, e.g. an Eclipse plugin, a Hudson CI plugin, etc.

tinny