"Cosmetic" clean-up of old, unknown code. Which steps, which order? How invasive?

+6 Q:

When I receive unfamiliar code to refactor into some sane state, I normally fix "cosmetic" things (like converting StringTokenizers to String#split(), replacing pre-1.2 collections with newer collections, making fields final, converting C-style arrays to Java-style arrays, ...) while reading the source code I need to become familiar with.
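To make the kind of change concrete, here is a minimal sketch of such a cosmetic pass. The class `NameRegistry` and its members are hypothetical, invented purely for illustration; the "Before" comments show what the legacy code is assumed to have looked like. Note that even this is not fully behaviour-neutral: `Vector` is synchronized and `ArrayList` is not, so the swap is only safe if the list is not shared between threads.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical legacy class after a purely cosmetic pass.
final class NameRegistry {
    // Before (assumed): private Vector names = new Vector();
    // After: final field, typed against the List interface.
    private final List<String> names = new ArrayList<>();

    // Before (assumed): void addAll(String newNames[])  (C-style array declaration)
    void addAll(String[] newNames) {
        for (String name : newNames) {
            names.add(name);
        }
    }

    int size() {
        return names.size();
    }
}
```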

Are there many people using this strategy (maybe it is some kind of "best practice" I don't know?) or is this considered too dangerous, so that leaving old code untouched unless absolutely necessary is generally preferred? Or is it more common to combine the "cosmetic cleanup" step with the more invasive "general refactoring" step?

What are the common "low-hanging fruits" when doing "cosmetic clean-up" (vs. refactoring with more invasive changes)?

+7 A:

In my opinion, "cosmetic cleanup" is "general refactoring." You're just changing the code to make it more understandable without changing its behavior.

I always refactor by attacking the minor changes first. The more readable you can make the code quickly, the easier it will be to do the structural changes later - especially since it helps you look for repeated code, etc.

I typically start by looking at code that is used frequently and will need to be changed often, first. (This has the biggest impact in the least time...) Variable naming is probably the easiest and safest "low hanging fruit" to attack first, followed by framework updates (collection changes, updated methods, etc). Once those are done, breaking up large methods is usually my next step, followed by other typical refactorings.

Reed Copsey 2010-09-23 15:39:06

+2 A:

The first thing I do is try to hide as much as possible from the outside world. If the code is crappy, most of the time the person who implemented it did not know much about data hiding and the like.

So my advice, first thing to do:

Make as many members and methods private as you can without breaking compilation.

As a second step, I try to identify the interfaces. I replace the concrete classes with their interfaces in all methods of related classes. This way you decouple the classes a bit.

Further refactoring can then be done more safely and locally.
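A rough sketch of those two steps, assuming a hypothetical `OrderBook` class that originally exposed a public `ArrayList` field (none of these names come from real code):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical class, used only to illustrate the two steps.
final class OrderBook {
    // Step 1: was (assumed) "public ArrayList<String> orders"; now private and final.
    private final List<String> orders = new ArrayList<>();

    void add(String order) {
        orders.add(order);
    }

    // Step 2: expose the List interface rather than the concrete ArrayList,
    // and hand out a read-only copy so callers cannot mutate internal state.
    List<String> snapshot() {
        return Collections.unmodifiableList(new ArrayList<>(orders));
    }
}
```

If tightening the visibility breaks compilation somewhere, that tells you exactly which callers depended on the internals.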

jdehaan 2010-09-23 15:40:01

+2 A:

You can buy a copy of Refactoring: Improving the Design of Existing Code by Martin Fowler; you'll find a lot of things you can do during your refactoring operation.

Plus, you can use the tools provided by your IDE and other code analyzers such as FindBugs or PMD to detect problems in your code.

Colin Hebert 2010-09-23 15:40:50

Yes, much much better to *also* run your code through tools such as FindBugs, PMD, etc.

ssahmed555 2010-09-23 16:01:45

+1 A:

I don't normally bother going through old code looking for problems. However, if I'm reading it, as you appear to be doing, and it makes my brain glitch, I fix it.

Common low-hanging fruits for me tend to be more about renaming classes, methods, fields etc., and writing examples of behaviour (a.k.a. unit tests) when I can't be sure of what a class is doing by inspection - generally making the code more readable as I read it. None of these are what I'd call "invasive" but they're more than just cosmetic.
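As a sketch of what "examples of behaviour" can look like: the method `LegacyCodes.normalize` below is a made-up stand-in for some legacy method you cannot fully understand by inspection, and the checks simply pin down what it does today. Plain Java checks are used here to keep the sketch self-contained; in a real project these would usually be JUnit tests.

```java
import java.util.Locale;

// Hypothetical legacy method whose behaviour is unclear from reading it.
final class LegacyCodes {
    static String normalize(String code) {
        if (code == null) {
            return "";
        }
        String s = code.trim().toUpperCase(Locale.ROOT);
        return s.length() > 4 ? s.substring(0, 4) : s;
    }
}

// Examples that record the current behaviour before anything is renamed or changed.
final class LegacyCodesExamples {
    static void check(String expected, String actual) {
        if (!expected.equals(actual)) {
            throw new AssertionError("expected \"" + expected + "\" but got \"" + actual + "\"");
        }
    }

    public static void main(String[] args) {
        check("", LegacyCodes.normalize(null));          // null becomes an empty string
        check("AB", LegacyCodes.normalize(" ab "));      // trimmed and upper-cased
        check("ABCD", LegacyCodes.normalize("abcdef"));  // silently truncated to 4 characters
    }
}
```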

Lunivore 2010-09-23 15:43:35

+2 A:

You're on the right track. By doing the small fixes you'll be more familiar with the code and the bigger fixes will be easier to do with all the detritus out of the way.

Run a tool like JDepend, Checkstyle or PMD on the source. They can automatically flag loads of issues that are cosmetic but based on general refactoring rules.

Kelly French 2010-09-23 15:44:05

+4 A:

There is no right or wrong answer here, as this depends largely on circumstances.

If the code is live, working, undocumented, and contains no testing infrastructure, then I wouldn't touch it. If someone comes back in the future and wants new features, I will try to work them into the existing code while changing as little as possible.

If the code is buggy, problematic, missing features, and was written by a programmer that no longer works with the company, then I would probably redesign and rewrite the whole thing. I could always still reference that programmer's code for a specific solution to a specific problem, but it would help me reorganize everything in my mind and in source. In this situation, the whole thing is probably poorly designed and it could use a complete re-think.

For everything in between, I would take the approach you outlined. I would start by cleaning up everything cosmetically so that I can see what's going on. Then I'd start working on whatever code stood out as needing the most work. I would add documentation as I come to understand how it works, to help me remember what's going on.

Ultimately, remember that if you're going to be maintaining the code now, it should be up to your standards. Where it's not, you should take the time to bring it up to your standards - whatever that takes. This will save you a lot of time, effort, and frustration down the road.

Erick Robertson 2010-09-23 15:46:02

+1 for leaving code alone unless it's functionally broken or requires a new feature. I've seen a lot of time wasted on cosmetics when there's a backlog of real problems to solve.

Corbin March 2010-09-23 15:51:52

+1 for _If the code is live,..._

Tony Ennis 2010-09-23 16:03:54

+2 A:

By starting with "cosmetic cleanup" you get a good overview of how messy the code is and this combined with better readability is a good beginning.

I always (yeah, right... sometimes there's something called a deadline that messes with me) start with this approach and it has served me very well so far.

Fredrik Norlin 2010-09-23 15:49:36

+2 A:

I do not change old code except to reformat it using the IDE. There is too much risk of introducing a bug, of removing a bug that other code now depends upon, or of introducing a dependency that didn't exist before, such as using the heap instead of the stack.

Beyond the IDE reformat, I don't change code that the boss hasn't asked me to change. If something is egregious, I ask the boss if I can make changes and state a case of why this is good for the company.

If the boss asks me to fix a bug in the code, I make as few changes as possible. Say the bug is in a simple for loop. I'd refactor the loop into a new method. Then I'd write a test case for that method to demonstrate I have located the bug. Then I'd fix the new method. Then I'd make sure the test cases pass.
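A minimal sketch of that workflow, assuming the bug was an off-by-one in a summing loop. The class `InvoiceTotals` and the shape of the original loop are invented for illustration:

```java
// Hypothetical result of extracting the buggy loop into its own method.
final class InvoiceTotals {
    // Assumed original loop, inline in a much larger method:
    //     for (int i = 0; i < prices.length - 1; i++) total += prices[i];
    // which skipped the last element. Extracted and fixed:
    static int sum(int[] prices) {
        int total = 0;
        for (int i = 0; i < prices.length; i++) {
            total += prices[i];
        }
        return total;
    }
}
```

A test such as "the sum of 1, 2 and 3 is 6" fails against the extracted-but-unfixed method and passes after the fix, which demonstrates both that the bug was located and that it is gone. The rest of the old method is left untouched.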

Yeah, I'm a contractor. Contracting gives you a different point of view. I recommend it.

Tony Ennis 2010-09-23 16:01:07

+1 A:

From experience it depends on two things: time and risk.

If you have plenty of time then you can do a lot more, if not then the scope of whatever changes you make is reduced accordingly. As much as I hate doing it I have had to create some horrible shameful hacks because I simply didn't have enough time to do it right...

If the code you are working on has lots of dependencies or is critical to the application then make as few changes as possible - you never know what your fix might break... :)

It sounds like you have a solid idea of what things should look like so I am not going to say what specific changes to make in what order 'cause that will vary from person to person. Just make small localized changes first, test, expand the scope of your changes, test. Expand. Test. Expand. Test. Until you either run out of time or there is no more room for improvement!

BTW When testing you are likely to see where things break most often - create test cases for them (JUnit or whatever).

EXCEPTION: Two things that I always find myself doing are reformatting (Ctrl+Shift+F in Eclipse) and commenting code that is not obvious. After that I just hammer the most obvious nail first...

BigMac66 2010-09-23 18:01:30

+3 A:

The lowest-hanging cosmetic fruit is (in Eclipse, anyway) shift-control-F. Automatic formatting is your friend.

Carl Manaster 2010-09-23 19:17:55

+2 A:

There is one thing you should be aware of. The code you are starting with has been TESTED and approved, and your changes automatically mean that retesting must happen, as you may have inadvertently broken some behaviour elsewhere.

Besides, everybody makes errors. Every non-trivial change you make (changing StringTokenizer to split is not an automatic feature in e.g. Eclipse, so you write it yourself) is an opportunity for errors to creep in. Did you get the exact behaviour of a conditional right, or did you mistakenly forget a !?

Hence, your changes imply retesting. That work may be quite substantial and far outweigh the small changes you have made.
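The StringTokenizer example illustrates the risk well, because the two are not drop-in replacements. The sketch below (the class and method names are mine, chosen for illustration) puts both side by side: `StringTokenizer` skips empty tokens and treats its argument as a set of delimiter characters, while `String#split` keeps interior empty strings and treats its argument as a regular expression.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;

// Illustrative helper for comparing the two tokenizing approaches.
final class TokenizeComparison {
    static List<String> withTokenizer(String input, String delimiters) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(input, delimiters);
        while (tokenizer.hasMoreTokens()) {
            tokens.add(tokenizer.nextToken());
        }
        return tokens;
    }

    static List<String> withSplit(String input, String regex) {
        return Arrays.asList(input.split(regex));
    }

    public static void main(String[] args) {
        // Empty field between two commas: tokenizer drops it, split keeps it.
        System.out.println(withTokenizer("a,,b", ","));  // [a, b]
        System.out.println(withSplit("a,,b", ","));      // [a, , b]

        // "." is a literal delimiter for the tokenizer but "any character" as a regex.
        System.out.println(withTokenizer("a.b", "."));   // [a, b]
        System.out.println(withSplit("a.b", "."));       // []
    }
}
```

Whether the difference matters depends on the input the old code actually sees, which is exactly why the change needs a retest.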

Thorbjørn Ravn Andersen 2010-09-24 06:16:55
