views:

8156

answers:

14

What tools do you use to find unused/dead code in large Java projects? Our product has been in development for some years, and it is getting very hard to manually detect code that is no longer in use. We do, however, try to delete as much unused code as possible.

Suggestions for general strategies/techniques (other than specific tools) are also appreciated.

Edit: Note that we already use code coverage tools (Clover, IntelliJ), but these are of little help. Dead code still has unit tests, and shows up as covered. I guess an ideal tool would identify clusters of code which have very little other code depending on them, allowing for focused manual inspection.

+1  A: 

Eclipse can show/highlight code that can't be reached. A coverage tool run alongside your JUnit tests can show you code coverage, but you'd need some tests, and you'd have to decide whether the relevant test is missing or the code is really unused.

MattW.
Eclipse will only tell you if the scope of the method is local (i.e. private); and even then you can't be 100% sure... with reflection, a private method could be called from the outside.
p3t0r
+2  A: 

There are tools which profile code and provide code coverage data. This lets you see (as code is run) how much of it is being called. You can get any of these tools to find out how much orphan code you have.

Vaibhav
+3  A: 

In theory, you can't deterministically find unused code. There's a mathematical proof of this (well, this is a special case of a more general theorem). If you're curious, look up the Halting Problem.

This can manifest itself in Java code in many ways:

  • Loading classes based on user input, config files, database entries, etc;
  • Loading external code;
  • Passing object trees to third party libraries;
  • etc.
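
The first bullet can be illustrated with a minimal sketch. The class and method names here (`ReflectiveLoadDemo`, `LegacyReportPlugin`, `runPlugin`) are hypothetical. Nothing calls the plugin's constructor directly, so a purely static "unused class" check could flag it, yet it runs whenever its name arrives as data:

```java
import java.util.function.Supplier;

final class ReflectiveLoadDemo {
    // Hypothetical plugin: no code calls its constructor directly.
    public static class LegacyReportPlugin implements Supplier<String> {
        @Override
        public String get() {
            return "legacy report";
        }
    }

    // The class name arrives as data (user input, config file, database entry).
    static String runPlugin(String className) {
        try {
            Class<?> type = Class.forName(className);
            Object instance = type.getDeclaredConstructor().newInstance();
            @SuppressWarnings("unchecked")
            Supplier<String> plugin = (Supplier<String>) instance;
            return plugin.get();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("cannot load " + className, e);
        }
    }

    public static void main(String[] args) {
        // In a real system this string would come from outside the source tree;
        // it is assembled here only so the demo runs in any package.
        String configured = ReflectiveLoadDemo.class.getName() + "$LegacyReportPlugin";
        System.out.println(runPlugin(configured)); // prints "legacy report"
    }
}
```

A tool that sees only the source has no way of knowing which strings will be passed to `runPlugin` at runtime.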

That being said, I use IntelliJ IDEA as my IDE of choice, and it has extensive analysis tools for finding dependencies between modules, unused methods, unused members, unused classes, etc. It's quite intelligent too: a private method that isn't called is tagged unused, whereas a public method requires more extensive analysis.

cletus
Thank you for your input. We are using IntelliJ, and are getting some help there. As for the Halting Problem and undecidability, I am familiar with the theory, but we do not necessarily need a deterministic solution.
knatten
Opening sentence is too strong. As with the Halting Problem (also often misquoted/abused), there is no complete general solution, but there are plenty of special cases that ARE feasible to detect.
joel.neely
While there isn't a general solution for languages with eval and/or reflection, there are lots of cases where code is provably unreachable.
pjc50
+11  A: 

We've started to use FindBugs to help identify some of the funk in our codebase's target-rich environment for refactorings. I would also consider Structure101 to identify spots in your codebase's architecture that are too complicated, so you know where the real swamps are.

Alan
+5  A: 

I would instrument the running system to keep logs of code usage, and then start inspecting code that is not used for months or years.

For example, if you are interested in unused classes, all classes could be instrumented to log when instances are created. A small script could then compare these logs against the complete list of classes to find unused classes.
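
The comparison step is a plain set difference. A minimal sketch, assuming the complete class list and the logged class names have already been collected as strings (all names below are made up for illustration):

```java
import java.util.Collection;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

final class UnusedClassReport {
    // Returns the classes that exist in the codebase but never appear in the usage log.
    static Set<String> neverInstantiated(Collection<String> allClasses,
                                         Collection<String> loggedClasses) {
        Set<String> unused = new TreeSet<>(allClasses);
        unused.removeAll(loggedClasses);
        return unused;
    }

    public static void main(String[] args) {
        // Hypothetical data; in practice these would be read from the build output
        // and from the logs gathered on the running system.
        List<String> all = List.of(
                "com.example.Invoice", "com.example.LegacyExporter", "com.example.User");
        List<String> logged = List.of(
                "com.example.User", "com.example.Invoice", "com.example.User");
        System.out.println(neverInstantiated(all, logged)); // prints [com.example.LegacyExporter]
    }
}
```

The result is only a list of candidates for inspection: a class missing from the log may simply not have been needed during the logging period.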

Of course, if you go down to the method level you should keep performance in mind. For example, the methods could log only their first use. I don't know how this is best done in Java. We have done this in Smalltalk, which is a dynamic language and thus allows for code modification at runtime. We instrument all methods with a logging call and uninstall the logging code after a method has been logged for the first time, so after some time there is no more performance penalty. Maybe a similar thing can be done in Java with static boolean flags...
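
One way this might look in Java is sketched below, using a concurrent set in place of one static boolean flag per method. `FirstUseLog` and the method id strings are hypothetical, and `System.out` stands in for a real logger. Unlike the Smalltalk approach, the call is never uninstalled, so every later invocation still pays for one set lookup:

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

final class FirstUseLog {
    // Ids of all methods that have been invoked at least once.
    private static final Set<String> SEEN = ConcurrentHashMap.newKeySet();

    // Call at the top of each instrumented method.
    // Logs and returns true only the first time a given id is seen.
    static boolean markUsed(String methodId) {
        boolean first = SEEN.add(methodId);
        if (first) {
            System.out.println("first use: " + methodId);
        }
        return first;
    }

    // Immutable snapshot of everything seen so far.
    static Set<String> seen() {
        return Set.copyOf(SEEN);
    }

    public static void main(String[] args) {
        markUsed("OrderService.cancel"); // logged
        markUsed("OrderService.cancel"); // silent: already recorded
        System.out.println(seen());
    }
}
```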

Adrian
I like this answer but does anyone have an idea how to do this in Java without explicitly adding the logging in every class? Maybe some 'Proxy' magic?
Outlaw Programmer
@Outlaw AOP seems to be a perfect use case for this.
Pascal Thivent
If you understand the application's classloading structure, you could use AOP on the classloader to track classload events. This would be less invasive on a production system than advice before all constructors.
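
A related sketch, not AOP: it swaps in the JDK's `java.lang.instrument` agent hook, which sees each class as it is being defined. `ClassLoadRecorder` is a hypothetical name, and printing on shutdown is just one possible way to get the data out. The transformer returns `null`, so no bytecode is modified:

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public final class ClassLoadRecorder implements ClassFileTransformer {
    private final Set<String> loaded = ConcurrentHashMap.newKeySet();

    @Override
    public byte[] transform(ClassLoader loader, String className,
                            Class<?> classBeingRedefined,
                            ProtectionDomain protectionDomain,
                            byte[] classfileBuffer) {
        if (className != null) {
            // Names arrive in internal form, e.g. "com/example/Foo".
            loaded.add(className.replace('/', '.'));
        }
        return null; // null tells the JVM to leave the bytecode unchanged
    }

    public Set<String> loadedClasses() {
        return Set.copyOf(loaded);
    }

    // Invoked by the JVM when started with -javaagent:<jar>; the jar's manifest
    // must name this class in its Premain-Class attribute.
    public static void premain(String agentArgs, Instrumentation instrumentation) {
        ClassLoadRecorder recorder = new ClassLoadRecorder();
        instrumentation.addTransformer(recorder);
        // Dump the recorded names when the JVM exits.
        Runtime.getRuntime().addShutdownHook(new Thread(() ->
                recorder.loadedClasses().stream().sorted().forEach(System.out::println)));
    }
}
```

Classes loaded before the agent is installed are not reported, and a class that was loaded is not necessarily a class whose code was executed, so the output is a starting point rather than proof of use.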
ShabbyDoo
+19  A: 

An Eclipse plugin that works reasonably well is Unused Code Detector.

It processes an entire project or a specific file and reports unused/dead code, as well as suggesting visibility changes (e.g. a public method that could be protected or private).

Mikezx6r
+1  A: 

Use coverage tools, such as EMMA. But it's not a static tool (i.e. it requires actually running the application through regression testing, and through all possible error cases, which is, well, impossible :) )

Still, EMMA is very useful.

Vladimir Dyuzhev
+7  A: 

Use a test coverage tool to instrument your codebase, then run the application itself, not the tests.

EMMA and EclEmma will give you nice reports showing what percentage of each class was executed in any given run of the code.

jamesh
+2  A: 

Code coverage tools, such as Emma, Cobertura, and Clover, will instrument your code and record which parts of it get invoked by running a suite of tests. This is very useful, and should be an integral part of your development process. It will help you identify how well your test suite covers your code.

However, this is not the same as identifying real dead code. It only identifies code that is covered (or not covered) by tests. This can give you false positives (if your tests do not cover all scenarios) as well as false negatives (if your tests access code that is actually never used in a real world scenario).

I imagine the best way to really identify dead code would be to instrument your code with a coverage tool in a live running environment and to analyse code coverage over an extended period of time.

If you are running in a load-balanced, redundant environment (and if not, why not?) then I suppose it would make sense to instrument only one instance of your application and to configure your load balancer so that a random, but small, portion of your users run on your instrumented instance. If you do this over an extended period of time (to make sure that you have covered all real-world usage scenarios, such as seasonal variations), you should be able to see exactly which areas of your code are accessed under real-world usage and which parts are never accessed and hence dead code.

I have never personally seen this done, and do not know how the aforementioned tools can be used to instrument and analyse code that is not being invoked through a test suite - but I am sure they can be.

Vihung
A: 
  • FindBugs is excellent for this sort of thing.
  • PMD (Project Mess Detector) is another tool that can be used.

However, neither can find public static methods that are unused in a workspace. If anyone knows of such a tool then please let me know.

graveca
+2  A: 

One thing I've been known to do in Eclipse, on a single class, is change all of its methods to private and then see what complaints I get. For methods that are used, this will provoke errors, and I return them to the lowest access level I can. For methods that are unused, this will provoke warnings about unused methods, and those can then be deleted. And as a bonus, you often find some public methods that can and should be made private.

But it's very manual.

skiphoppy
+3  A: 

"UCDetector (Unnecessary Code Detector) is a eclipse PlugIn tool to find unnecessary (dead) public java code."

http://www.ucdetector.org/

+1  A: 

IntelliJ has code analysis tools for detecting code which is unused. You should try making as many fields/methods/classes non-public as possible; that will reveal more unused methods/fields/classes.

I would also try to locate duplicate code as a way of reducing code volume.

My last suggestion is try to find open source code which if used would make your code simpler.

Peter Lawrey
+1  A: 

The Structure101 slice perspective will give a list (and dependency graph) of any "orphans" or "orphan groups" of classes or packages that have no dependencies to or from the "main" cluster.
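
The underlying idea can be sketched independently of Structure101 (this is not its implementation). Treat dependencies as undirected edges, walk outward from a class known to be in the "main" cluster, and report everything that was never reached. The `OrphanFinder` name and the sample graph are hypothetical, and the sketch assumes the class-level dependency graph has already been extracted by some other means:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class OrphanFinder {
    // dependsOn maps each class to the classes it references.
    // Returns every class with no dependency path, in either direction, to mainClass.
    static Set<String> orphans(Map<String, Set<String>> dependsOn, String mainClass) {
        // Build an undirected view: an edge in either direction links two classes.
        Map<String, Set<String>> neighbours = new HashMap<>();
        for (Map.Entry<String, Set<String>> entry : dependsOn.entrySet()) {
            neighbours.computeIfAbsent(entry.getKey(), k -> new HashSet<>());
            for (String target : entry.getValue()) {
                neighbours.get(entry.getKey()).add(target);
                neighbours.computeIfAbsent(target, k -> new HashSet<>()).add(entry.getKey());
            }
        }
        // Breadth-first walk over the cluster that contains mainClass.
        Set<String> reached = new HashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        if (neighbours.containsKey(mainClass)) {
            reached.add(mainClass);
            queue.add(mainClass);
        }
        while (!queue.isEmpty()) {
            for (String next : neighbours.get(queue.remove())) {
                if (reached.add(next)) {
                    queue.add(next);
                }
            }
        }
        Set<String> result = new TreeSet<>(neighbours.keySet());
        result.removeAll(reached);
        return result;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> graph = Map.of(
                "Main", Set.of("Service"),
                "Service", Set.of("Dao"),
                "OldA", Set.of("OldB"),
                "Lonely", Set.<String>of());
        System.out.println(orphans(graph, "Main")); // prints [Lonely, OldA, OldB]
    }
}
```

Here `OldA` and `OldB` form an orphan group and `Lonely` is a single orphan. Classes reached only through reflection would also show up as orphans, so the output still needs manual review.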

Chris Chedgey