views:

195

answers:

9

Possible Duplicates:
What should I keep in mind in order to refactor huge code base?
When is it good (if ever) to scrap production code and start over?

I am currently working with some legacy source code files. They have quite a few problems because they were written by a database expert who does not know much about Java. For instance,

  1. Fields in classes are public. No getters and setters.
  2. Use raw types, not parameterized types.
  3. Use static unnecessarily.
  4. Super long method names.
  5. Methods need too many parameters.
  6. Repeat Yourself frequently.

I want to modify them so that they are more object-oriented. What are some best practices and effective/efficient approaches?

+7  A: 

Read "Working Effectively with Legacy Code" by Michael Feathers. Great book - and obviously it'll be a lot more detailed than answers here. It's got lots of techniques for handling things sensibly.

It looks like you've identified a number of issues, which is a large part of the problem. Many of those sound like they can be fixed relatively easily - it's overall design and architecture which is harder to do, of course.

Are there already unit tests, or will you be adding those too?

Jon Skeet
No unit tests yet. Probably I will write some testing scripts in future.
zihaoyu
Try writing some tests before you make code changes, though you may have to refactor without a net a bit first, as described in this highly recommended book.
Don Roby
+1  A: 

Eclipse should be able to take care of #1 and help you work your way through many of the others.

As for converting poor OO code to good OO code it is amazingly difficult. Often it seems easier to rewrite it from scratch.

I tend to go from the bottom up. As I'm working on some small section I'll recognize a bunch of data that belongs together as a group and I'll make a good object that replaces that code without changing anything else--Very Small Changes with constant tests between each change.

This makes for a mediocre design at best, but I honestly don't know if you can go from not OO to good OO on a large project without dissecting the original program, understanding it and using it as a template for the rewrite and few projects allow this (even though it might be faster, you'll rarely if ever be able to convince management of that fact)

Bill K
A: 

What's the problem with it not having getters and setters? I'd suggest refactoring those only when you need to add non-trivial getters or setters (e.g., with validation).

The rest sounds like you need to identify groups of values and create new types holding them, so instead of passing a String name, String address, int yearOfBirth, String[] accountNames, int[] balances you would pass a Customer around, which would in turn have an Account[].

IDEA Ultimate Edition has a code duplication detector that's very good (it's only missing a 'suggested solution' button!), and there are CPD etc.

I'd suggest that in a large legacy codebase you might waste time refactoring code only to find out it wasn't used anyway. I outlined some steps for removing unused code: http://rickyclarkson.blogspot.com/2009/12/deleting-code-what-first.html

Ricky Clarkson
The problem with not having getters and setters is that internal implementation is exposed to the outside world. You will not be able to change implementation in the future without breaking compatibility. So "minor" change may require fixing few other classes...
Paweł Dyda
Changing other classes is fine, at least if this code is not part of a public API. Even if it is part of a public API, it's not necessarily a big problem.
Ricky Clarkson
+3  A: 

Before you start, create a system-level regression test suite for the application. You need this so that you can verify that your changes don't break things.

To do the refactoring, you want a use a combination of a good IDE, and text search tool (e.g. grep). Use the text search tool to find occurrences of the "syndromes" that you want to fix, then use the IDE (and its builtin refactoring capabilities) to fix the instances ... one at a time.

For example, Eclipse allows you to rename a method or class, or generate getters and setters. So you'd cure a 'public' attribute by:

  1. Change the attribute to private.
  2. Generate the getter and setter methods.
  3. Save the file.
  4. Go through all of the Java compilation errors resulting from the fact that the attribute is now private, and change to use the getter or setter as appropriate.

This approach will give you the low-hanging fruit. More fundamental design issues are more difficult, and may be impossible to fix without fundamental restructuring of the application. The refactoring capabilities will help you execute such changes, but deciding what to do is ultimately up to you.

Finally, my advice is to not be too ambitious. Go for incremental improvement, and be prepared to draw the line when the code is "good enough". You won't achieve perfection ... not even if you start from a clean slate ... so don't set your expectations high.

Stephen C
+1 for full regression test!
James Anderson
The 4 steps listed could be more simply performed with Eclipse's **Encapsulate Field** refactoring.
Carl Manaster
A: 

The point is risk I think.

The ugly code is just ugly, but it could work, it has been tested and bugfixed. If runnable code is changed, risk will follow. so test is critical.

  • You could refactor related code when you have to bugfix, as a conservatism solution.

Maybe the first challenge is to persuade your manager:)

卢声远 Shengyuan Lu
A: 

How many of those "issues" are real problems and not just matters of style? Of this list, the only 'real' issue I can see is "Repeat Yourself frequently", and that's more of an ongoing maintenance problem that should be resolved during normal code maintenance when someone's going to be changing the code anyway.

Tim Kuehn
A: 

I want to modify them so that they are more object-oriented.

Object-Orientation should not be your only goal when refactoring. The question you should ask yourself is what is the expected ROI (better quality ? easier enhancements ? better sharing of this code across a team ?) A ROI is not just words, you should be prepared to measure with numbers the return on investment (even the quality enhancements for example). You should take into accounts the life duration of your products in estimating the ROI.

You should also ask yourself what is the size of the code which is dependent on the code you want to refactor. Refactoring a library could be easy but could lead to a lot of changes in source codes dependent on this library, a work well larger than just refactoring the library.

Before touching any code, you should estimate the total work that needs to be done to finish refactoring the code and dependent code. You should estimate a total rewrite of the code, a partial rewrite, or just an internal rewrite without touching APIs.

With the costs and returns, you could decide if it's worth the effort to refactor your code.

Jérôme Radix
A: 

I attended a very good session at Javazone the other day where one of the developers of Jackbox recorder described what it does and how they used it when refactoring legacy code.

Jackbox recorder lets you record the input and output of some API to retain the expected functionality of the API, once you have a solid recording you can muck about with the code without fear of introducing some logic error.

http://github.com/oc/jackbox

BjornS
+1  A: 

Is it just the code that is bad, or does it also hurt the user experience? Refactoring continuously is a good idea, but it should not be a goal unto itself. It should improve the application in terms of user interaction, maintainability, stability, performance, etc.

That is why I am not extremely fond of huge refactoring just to improve the code quality. Instead, refactor the code that you work with.

While working with a legacy system for several years, I have personally found that:

  1. Create for yourself a vision of how you want the code after you're done. It should be attainable, contain a list of technology changes, general architecture changes. It may also be a good idea to make a rough priority of what classes are most critical to change. We lacked such a vision a few years ago, and while we refactored a lot, the code quality barely improved.
  2. Now, you should restrict your refactoring to those that make you reach your vision. Don't fall into the trap of doing what appears good at the moment.
  3. Focus on a particular component, and make it better. Then move on to the next. It's tempting to make huge changes that affect the entire system, but in truth you will introduce more problems than you solve.
  4. Write integration regression tests. I.e., a few big tests that test a lot of functionality. It's not optimal, but it's the best you can do. Writing unit tests for every single class in your old system may end up a waste of time because it's not designed to be tested anyway and you want to redesign half of the classes.
  5. Accept that it will take time.
waxwing