views:

872

answers:

13

I've inherited a project where the class diagrams closely resemble a spider web on a plate of spaghetti. I've written about 300 unit tests in the past two months to give myself a safety net covering the main executable.

I have my library of agile development books within reach at any given moment:

  • Working Effectively with Legacy Code
  • Refactoring
  • Code Complete
  • Agile Principles Patterns and Practices in C#
  • etc.

The problem is everything I touch seems to break something else. The UI classes have business logic and database code mixed in. There are mutual dependencies between a number of classes. There's a couple of god classes that break every time I change any of the other classes. There's also a mutant singleton/utility class with about half instance methods and half static methods (though ironically the static methods rely on the instance and the instance methods don't).

My predecessors even thought it would be clever to use all the datasets backwards. Every database update is sent directly to the db server as parameters in a stored procedure, then the datasets are manually refreshed so the UI will display the most recent changes.

I'm sometimes tempted to think they used some form of weak obfuscation for either job security or as a last farewell before handing the code over.

Is there any good resources for detangling this mess? The books I have are helpful but only seem to cover half the scenarios I'm running into.

+14  A: 

It sounds like you're tackling it in the right way.

  • Test
  • Refactor
  • Test again

Unfortunately, this can be a slow and tedious process. There's really no substitute for digging in and understanding what the code is trying to accomplish.

One book that I can recommend (if you don't already have it filed under "etc.") is Refactoring to Patterns. It's geared towards people who are in your exact situation.

Bill the Lizard
+10  A: 

I'm working in a similar situation.

If it is not a small utility but a big enterprise project then it is:

a) too late to fix it
b) beyond the capabilities of a single person to attempt a)
c) can only be fixed by a complete rewriting of the stuff which is out of the question

Refactoring can in many cases be only attempted in your private time at your personal risk. If you don't get an explicit mandate to do it as part of you daily job then you're likely not even get any credit for it. May even be criticized for "pointlessly wasting time on something that has perfectly worked for a long time already".

Just continue hacking it the way it has been hacked before, receive your paycheck and so on. When you get completely frustrated or the system reaches the point of being non-hackable any further, find another job.

EDIT: Whenever I attempt to address the question of the true architecture and doing the things the right way I usually get LOL in my face directly from responsible managers who are saying something like "I don't give a damn about good architecture" (attempted translation from German). I have personally brought one very bad component to the point of non-hackability while of course having given advanced warnings months in advance. They then had to cancel some promised features to customers because it was not doable any longer. Noone touches it anymore...

User
Hey look -- my biography from 2005 to 2008!
Austin Salonen
That's how the system got this way in the first place.
Ali A
Well, that's quite literally their business. If they know it can't be safely modified any more without a refactor effort, and they don't want to do so, that's their decision.
T.E.D.
Too many times you have to follow the golden rule: until it works don't even think about touching it ;-) Said but true...
Alekc
+1  A: 

Mostly, that sounds pretty bad. But I don't understand this part:

My predecessors even thought it would be clever to use all the datasets backwards. Every database update is sent directly to the db server as parameters in a stored procedure, then the datasets are manually refreshed so the UI will display the most recent changes.

That sounds pretty close to a way I frequently write things. What's wrong with this? What's the correct way?

recursive
I think he means that they didn't wire the datasets to do the database updates and instead wrote ad hoc SQL in the presentation layer to handle them. You end up with either duplicated code everywhere or a "database updates" class that does what the dataset could be wired to do.
Austin Salonen
I haven't been able to achieve the paradise of code-free databinding in non-trivial CRUD apps. There are always some business rules that I haven't been able to express adequately without breaking down and writing some code the goold ol' fashioned way. But I might be doing it wrong.
recursive
+1  A: 

Anti corruption layer (Evan, Domain Driven Design) : Access the crap behind a facade See here

Nicolas Dorier
Is there more info about this? I asked the same question here: http://stackoverflow.com/questions/909264/ddd-anti-corruption-layer-how-to
Arnis L.
+3  A: 

I have (once) come across code that was so insanely tangled that I couldn't fix it with a functional duplicate in a reasonable amount of time. That was sort of a special case though, as it was a parser and I had no idea how many clients might be "using" some of the bugs it had. Rendering hundreds of "working" source files erronious was not a good option.

Most of the time it is immentently doable, just daunting. Read through that refactoring book.

I generally start fixing bad code by moving things around a bit (without actually changing implementation code more than required) so that modules and classes are at least somewhat coherent.

When that is done, you can take your more coherent class and rewrite its guts to perform the exact same way, but this time with sensible code. This is the tricky part w/ management, as they generally don't like to hear that you are going to take weeks to code and debug something that will behave exactly the same (if all goes well).

During this process I guarantee you will discover tons of bugs, and outright design stupidities. Its OK to fix trivial bugs while recoding, but otherwise leave such things for later.

Once this is done with a couple of classes, you will start to see where things can be modularized better, designed better, etc. Plus it will be easier to make such changes without impacting unrelated things because the code is now more modular, and you probably know it thouroughly.

T.E.D.
+1  A: 

No book will be able to cover all possible scenarios. It also depends on what you'll be expected to do with the project and whether there is any kind of external specification.

  • If you'll only have to do occasional small changes, Just do those and don't bother starting to refactor.
  • If there is a spec (or you can get someone to write it), consider a complete rewrite if it can be justified by the foreseeable amount of changes to the project
  • If "the implementation is the spec" and there are a lot of changes planned, then you're pretty much hosed. Write LOTS of Unit tests and start refactoring in small steps.

Actually, Unit tests are going to be invaluable no matter what you do (if you can write them to an interface that's not going to change much with refactorings or a rewrite, that is).

Michael Borgwardt
A: 

If your refactorings are breaking code, particularly code that seems to be unrelated, then you're trying to do too much at a time.

I recommend a first-pass refactoring where all you do is ExtractMethod: the goal is simply to name each step in the code, without any attempts at consolidation whatsoever.

After that, think about breaking dependencies, replacing singletons, consolidation.

kdgregory
+3  A: 

I've worked this job before. I spent just over two years on a legacy beast that is very similar. It took two of us over a year just to stabilize everything (it's still broke, but it's better).

First thing -- get exception logging into the app if it doesn't exist already. We used FogBugz and it took us about a month to get reporting integrated into our app; it wasn't perfect right away but it was reporting errors automatically. It's usually pretty safe to implement try-catch blocks in all your events and that will cover most of your errors.

From there fix the bugs that come in first. Then fight the small battles, especially those based on the bugs. If you fix a bug that unexpectedly affects something else, refactor that block so that it is decoupled from the rest of the code.

It will take some extreme measures to rewrite a big, critical-to-company-success application no matter how bad it is. Even you get permission to do so, you'll be spending too much time supporting the legacy app to make any progress on the rewrite anyway. If you do many small refactorings, eventually either the big ones won't be that big or you'll have really good foundation classes for your rewrite.

One thing to take away from this is that it is a great experience. It will be frustrating but you will learn a lot.

Austin Salonen
This is a great comment. However I would disagree that you will learn a lot. You'll likely only learn the ways to do things wrong. Depending on how resistant you are, you will either learn something from it or just degrade and forget how to do things right.
User
Seeing a poor implementation is a good learning experience. It takes a high caliber individual to not degrade. However, I would say it was people who did forget how to do things right that get you to situations like this to begin with.
Austin Salonen
"One thing to take away from this is that it is a great experience. It will be frustrating but you will learn a lot." - i got to remember those words every time i want to scream... thanks.
Arnis L.
A: 

If your refactorings are breaking things, then it means you don't have adequate unit test coverage - as the unit tests should have broken first. I recommend you get better unit test coverage second, after getting exception logging into place.

I then recommend you do small refactorings first - Extract Method to break large methods into understandable pieces; Introduce Variable to remove some duplication within a method; maybe Introduce Parameter if you find duplication between the variables used by your callers and the callee.

And run the unit test suite after each refactoring or set of refactorings. I'd say run them all until you gain confidence about which tests will need to be rerun every time.

John Saunders
It sounds like he's detecting the breakages, it's just hard to actually do anything without breaking something else.
Sam Hoice
If so, then I apologize. It sounded like usage and QA were finding the breakages, when it's the unit tests that should do so.
John Saunders
A: 

Good luck, that is the tough part of being a developer.

I think your approach is good, but you need to focus on delivering business value (# of unit tests is not a measure of business value, but it may give you an indication if you are on or off track). Its important to have identified the behaviors that need to be changed, prioritize, and focus on the top ones.

The other piece of advise is to remain humble. Realize that if you wrote something so large under real deadlines and someone else saw your code, they would probably have problems understanding it as well. There is a skill in writing clean code, there is a more important skill in dealing with other people's code.

Last piece of advise is to try to leverage the rest of your team. Past members may know information about the system you can learn, also they may be able to help test behaviors. I know the ideal is to have automated tests, but if someone can help by verifying things for you manually consider getting their help.

Frank Schwieterman
A: 

I particularly like the diagram in Code Complete, in which you start with just legacy code, a rectangle of fuzzy grey texture. Then when you replace some of it, you have fuzzy grey at the bottom, solid white at the top, and a jagged line representing the interface between the two.

That is, everything is either 'nasty old stuff' or 'nice new stuff'. One side of the line or the other.

The line is jagged because you're migrating different parts of the system at different rates.

As you work, the jagged line gradually descends, until you have more white than grey, and eventually just grey.

Of course, that doesn't make the specifics any easier for you. But it does give you a model you can use to monitor your progress. At any one time you should have a clear understanding of where the line is: which bits are new, which are old, and how the two sides communicate.

slim
A: 

You might find the following post useful: http://blog.refactoringin.net/?p=36

As it is said in the post, don't discard complete overwrite that easily. Also, if at all possible, try to replace whole layers or tiers with third party solution like for example ORM for persistence or with new code. But most important of all, try to understand the logic (problem domain) behind the code.

Dan
A: 

You could extract and then refactor some part of it, to break the dependencies and isolate layers into different modules, libraries, assemblies, directories. Then you re-inject the cleaned parts in to the application with a strangler application strategy. Lather, rinse, repeat.

philippe