views:

191

answers:

9

I've been assigned to do some work on a huge Java project, and the influence of several iterations of developers is obvious. There is no standard coding style, formatting, naming conventions or class structure. It's a good day when I come across a class with Javadoc, and unit testing is a happy daydream.

So far those of us on the project have been "blending in", adapting to the existing convention of whatever class we're working on, but the time is coming to impose some order and consistency.

It's a daunting challenge, and I'm looking for any advice people might have on such a task. Are there any strategies that have been particularly effective, or pitfalls to look out for? Is it even a good idea to try?

Edit to add: I don't want to give the impression that the project is bad - it is in fact solidly designed and largely well-written. It's just feeling it's age and the inevitability of maintenance...

+4  A: 

You can use a tool to impose a common format on the source code in the project. Aside from that, see Michael Feathers' Working Effectively with Legacy Code (where "legacy code" is defined to be "code without unit tests"), which describes how to gradually turn legacy code into fully-tested and -testable code.

Jeremy W. Sherman
+8  A: 

I find Eclipse to be an incredibly powerful tool for operations such as this.

A lot of people swear by command-line tools and modal-based text editors for programming but there are strong advantages of using a full IDE for major refactoring:

  • Automatic, real-time compilation shows you errors as they happen and everywhere they happen. Just because you make a change and nothing in the class or immediate package breaks does not mean that you haven't created issues elsewhere. Red flags will go up the package tree in eclipse leading you directly to them.
  • Graphical based renaming and moving. Renaming elements of your code can have an impact much larger than what you know. Eclipse will show you details of every instance of the element in question and how it will be changed by the rename.
  • Automatic import management allows you to take the work out of dealing with ensuring all your imports are in order. Eclipse will automatically add imports as they are used and mark unused ones with action lightbulbs for one-click removal.
  • Use code styles to ensure all of the source files use the same format for everything. Spaces, indents, new lines, parenthesis can all be formatted for you. This works as you create new code as well as for updating existing files.

In addition to Eclipse's toolset you might look in to utilizing other modern Java tools for ensuring your code is always functioning.

  • Test suites allow you to constantly ensure any changes you make don't have a negative effect on the function of the project. If you are going to be refactoring a feature, write two or three test cases that demonstrate the ways it works. Make sure they run before and after any changes. This is the easiest way to spot problems before they become an issue.
  • Use a tool such as Maven to assist with dependencies, testing, compiling, and deployments. Don't waste time doing any of the aforementioned tasks again. Focus on writing the code that does the work.

edit:

I also personally prefer Eclipse because I am the one doing the refactoring, not some automated tool that knows next to nothing about my code.

Jake Wharton
Great answer, also worth throwing in the FindBugs Eclipse plugin if you want to fix lots of bugs / inconsistencies while you refactor....
mikera
+1  A: 

My suggestion would be add something like Checkstyle to your build system. It's hard to get management to buy into the idea of doing a complete overhaul all at once. Design what you think are a good set of style guidelines and implement them in Checkstyle and add it to your build.

Then, require that all new check-in of code doesn't break Checkstyle. That means whenever you work on a class you'll bring it up to standards. It won't seem you're doing any extra work at all if it's just a little something you have to do before committing for a while.

Also, checkstyle plugins exist for Eclipse.

Mark Peters
A: 

It's a rather common task, not very joyful but neither a nightmare... It could be worse, if coded in other languages (Perl, PHP, C++, -gasp- VB... ); actually, Java is one the best for your scenario.

Get a decent IDE (Eclipse) and spend a good time understanding dependences and call cycles. It will take a long time to get familiar with everything, so try to make only small changes first.

When documentation is lacking, the IDE (and the static compiling) helps a lot to know who is using which class or method, and you can do refactoring quite confidently. But first try to identify in which layers/packages/classes is reflection used (explicitly by your code, or implicitly by your frameworks - eg. some getters and setters).

There are many books devoted to "Reengineering Legacy Software" and related issues.

leonbloy
+2  A: 

What I like to do in this situation is:

  1. Firstly convert the project to use a maven build, so that I know what version the dependencies are.
  2. This also gives me some decent code quality reports to use as a benchmark, including checkstyle, findbugs, pmd and code coverage.
  3. And I (and many others) are used to this structure, so we know where to find the source, unit tests, resources etc.
  4. If it is a large project then a maven multi-module project layout is probably the correct structure to use.
  5. If it is currently one big-ball-of-mud, then that becomes the core module which can later be refactored into separate modules.
  6. The standard maven directory structure provides place for, and therefore encourages unit tests.
  7. The unit tests are a critical pre-requisite before refactoring can begin.
  8. Establish a continuous integration build cycle using Hudson.
crowne
Pretty much the same I am doing with my assignment now, except steps 2 and 8.
Ither
+2  A: 

Start with the monolithic classes and break them up (greater than 500 statements excluding comments and lines with just braces). Introduce interfaces, then dependency injection.

Justin Dennahower
A: 

I had such experience. I agree with people that recommend maven build, eclipse, Checkstyle, refactoring of big classes etc. I understand that you cannot achieve full test coverage before your are starting to work. I'd recommend 1. reformat code in batch mode using checkstyle or similar tool 2. enable all reasonable warnings in Eclipse and refactor code that cause such warnings if this refactoring is trivial. In other cases put @SupressWarning and special TODO to come back to this code later. 3. use defect driven automatic testing, i.e. develop tests for module you are going to change.

Good luck!

AlexR
A: 

I also recoomend using the IDE's features to improve code quality. For eclipse this what i would do:

In the preferences java > code style > formatter - define your own format and add it. after that right click the project and source < clean up. Choose sustom profile and configure. There are plenty of things you can do here like code formatting cleaning up imports converting legacy for loops into enhanced ones cleaning up unused code and many more.

After that i would do the things other people here suggested as well like using checkstyle, pmd, findbugs and so on.

kukudas
A: 

I've been through this process a few times now, I've found that the solution requires knowing the following:

  • Will there be political unrest at the concept of fixing these things?
  • Is there now an accepted standard for how these things should look / be formatted?
  • Are there great test cases?

The political situation is the hardest to mitigate, fundamentally no one likes the idea of lateral movement, and going through the process of enforcing code formatting and naming conventions is very much a lateral movement. If you can come up with a solid set of metrics justifying your decision your lateral movement can be masqueraded as a forward movement. I've found that the best metrics here are along the lines of

"a consistent set of coding standards will result in: - 30% fewer bugs - 30% faster development - 80% lower maintenance cost - 100% of us coders will be far happier with this change"

Not just pulling these numbers out of the air is the trick. Be able to justify this.

Clearly there is no point starting this work unless you have buy in from the people currently adding to the project. Everyone must agree and begin retro fitting these ideals into the code that currently exists. Remember not everyone uses an IDE (e.g. I code all my java in VIM) and so you should make sure this format is dictated on a wiki for all to see (particularly new team members) and that the wiki page has downloads for the various editors in use.

Since it is very likely that we're not just talking about code formatting but also variable renaming and a change in patterns these affect the public api's of your classes so you really need to ensure you have a very stable set of test cases. If test cases are missing then you should always start from the outside in - model your tests such that they interact as your users do. Then you can go through and refactor with a degree of confidence. Once you have code that resembles your dreams you can go in and add tests closer to each object. Nothing is more painful that creating all your test cases, and then changing the API's and having to change all your test cases; every time I've seen that happen it results in a massive drop in test coverage.

Mike