views:

631

answers:

15

Hi all,

I am a junior software engineer who has been given the task of taking over an old system. Based on my preliminary assessment, this system has several problems.

  1. spaghetti code
  2. repetitive code
  3. classes with 10k lines and above
  4. misuse and over-logging using log4j
  5. bad database table design
  6. Missing source control -> I have set up Subversion for this
  7. Missing documents -> I have no idea of the business rules, except by reading the code

How should I go about improving the quality of the system and resolving these issues? I can think of using static code analysis software to catch bad coding practices.

However, static analysis can't detect bad design. How should I go about resolving these issues step by step?

+6  A: 

Write some unit tests first, and make sure they pass. Then with each refactoring change you make, just keep making sure the tests keep passing. Then you can be confident that your application behaviour to the outside world hasn't changed.

This also has the added benefit that the tests will always be there, so for any future changes the tests should still pass, guarding against any regressions in the new changes.

Noel M
Because of the lack of documentation, I can't verify the business rules except by reading the code and by word of mouth.
liangteh
Talk to the other developers who have been working with it longer, and work with them to write the tests.
Noel M
Does *no one* at the company know the business rules? Then who's to say whether your code is right or wrong? Flip side: if you fail to properly implement that which cannot be described, will you be harshly blamed? It sounds like a no-win scenario to me.
Philip Kelley
@liangteh, if you don't know the business rules, you can write characterization tests that verify the functionality as it currently exists. Write a test that fails, observe how it fails and modify the test to pass for the current code base. This won't tell you if the code is working 'correctly', but it will tell you if changes you're making are changing the behavior.
Dan Bryant
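The characterization test described in the comment above can be sketched in plain Java. `LegacyInvoice.totalInCents` is a hypothetical stand-in for an undocumented legacy method, and the expected values are assumed to have been recorded by running the current code, not taken from any specification.

```java
// Hypothetical stand-in for a legacy method whose business rule is undocumented.
class LegacyInvoice {
    static long totalInCents(int quantity, long unitPriceInCents) {
        long total = quantity * unitPriceInCents;
        if (quantity >= 10) {
            total -= total / 20; // nobody remembers why, but this is what the code does
        }
        return total;
    }
}

class LegacyInvoiceCharacterization {
    // Each row: quantity, unit price in cents, total observed from the current code.
    static final long[][] RECORDED = {
        {1, 250, 250},
        {9, 100, 900},
        {10, 100, 950},
        {0, 999, 0},
    };

    // Returns a description of the first mismatch, or null if behaviour is unchanged.
    static String firstMismatch() {
        for (long[] row : RECORDED) {
            long actual = LegacyInvoice.totalInCents((int) row[0], row[1]);
            if (actual != row[2]) {
                return "quantity=" + row[0] + " price=" + row[1]
                        + " recorded " + row[2] + " but got " + actual;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String mismatch = firstMismatch();
        System.out.println(mismatch == null
                ? "behaviour unchanged"
                : "behaviour changed: " + mismatch);
    }
}
```

A mismatch here does not mean the new code is wrong, only that behaviour changed; whether the change is acceptable is still a question for whoever knows the business rule.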
Well, my colleagues know the requirements only partially: to the best of their knowledge, and as much as they can recall from memory. =(
liangteh
@liangteh Having a partial understanding is bad (my current world of pain). Try to arrange some meetings with stakeholders to understand what the system does and what they expect of you.
mlk
+3  A: 

First and foremost, make sure you have a source control system installed and that all source code is versioned and can be built.

Next, you can try writing unit tests for the core parts of your system. From there, when you have a more or less solid body of regression tests, you can actually proceed with refactoring.

When I encounter a messy codebase, I usually start by renaming poorly named types and methods to better reflect their original intent. Next you can try splitting huge methods into smaller ones.

Anton Gogolev
You are right. The very first thing I did was to set up Subversion for source control.
liangteh
While subversion may be sufficient, git provides local branching features that make development a lot easier.
Peter DeWeese
A: 

A good book on this subject is Working Effectively with Legacy Code by Michael Feathers (2004). It goes through the process of making small changes while working towards a bigger clean-up.

  1. Write unit tests, then find and remove duplicate code.
  2. Write unit tests, then break long methods into a series of short methods.
  3. Write unit tests, then find and remove duplicate methods.
  4. Write unit tests, then break apart classes so that they follow the single responsibility principle.
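Steps 1 and 2 might look like this in miniature. The class and method names are made up for illustration; the point is that a check pins the behaviour before the extraction and is rerun after it.

```java
// Before: the same name-cleaning logic is pasted into two methods (hypothetical example).
class ReportBefore {
    static String customerLine(String name, int balance) {
        String n = name == null ? "" : name.trim();
        if (n.isEmpty()) { n = "(unknown)"; }
        return "Customer: " + n + " / " + balance;
    }

    static String supplierLine(String name, int balance) {
        String n = name == null ? "" : name.trim();
        if (n.isEmpty()) { n = "(unknown)"; }
        return "Supplier: " + n + " / " + balance;
    }
}

// After: the duplicated block is extracted into short, named methods.
class ReportAfter {
    static String customerLine(String name, int balance) {
        return line("Customer", name, balance);
    }

    static String supplierLine(String name, int balance) {
        return line("Supplier", name, balance);
    }

    private static String line(String label, String name, int balance) {
        return label + ": " + displayName(name) + " / " + balance;
    }

    private static String displayName(String name) {
        String trimmed = name == null ? "" : name.trim();
        return trimmed.isEmpty() ? "(unknown)" : trimmed;
    }
}

// Compares old and new implementations on a few inputs, including edge cases.
class ReportRefactoringCheck {
    static boolean sameBehaviour() {
        String[] names = {"Ada", "  Bob ", "", null};
        for (String name : names) {
            if (!ReportBefore.customerLine(name, 42).equals(ReportAfter.customerLine(name, 42))) {
                return false;
            }
            if (!ReportBefore.supplierLine(name, -7).equals(ReportAfter.supplierLine(name, -7))) {
                return false;
            }
        }
        return true;
    }
}
```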
tylermac
Well, I have been reading up on the GoF design patterns, but I can't really identify which pattern is the right one to use, or how to apply it. I rely primarily on FindBugs to pinpoint bad coding.
liangteh
Don't use design patterns. Just eliminate code smells for starters.
Dan
+15  A: 

Get and read Working Effectively With Legacy Code. It deals exactly with this situation.

As others have also advised, for refactoring you need a solid set of unit tests. However, legacy code is typically very difficult to unit test as is, since it was not written to be unit testable. So you need to refactor first to allow unit testing, which in turn would allow you to start refactoring... a catch-22.

This is where the book will help you. It gives lots of practical advice on how to make badly designed code unit testable with the minimal, and safest possible, code changes. Automatic refactorings can also help you here, but there are tricks described in the book which can only be done by hand. Then once the first set of unit tests are in place, you can start gradually refactoring towards better, more maintainable code.
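One way to get a first test in place with a very small change is to move the hard-to-test call into its own overridable method and replace it in a test subclass. The sketch below uses invented names (`OverdueChecker`, `today()`); it illustrates the general idea of breaking a dependency by hand, not a specific recipe quoted from the book.

```java
import java.time.LocalDate;

// Hypothetical legacy class. Assume isOverdue() originally called LocalDate.now()
// inline, so its result depended on the day the test happened to run.
class OverdueChecker {
    boolean isOverdue(LocalDate dueDate) {
        return today().isAfter(dueDate);
    }

    // The only production change: clock access is now an overridable method.
    protected LocalDate today() {
        return LocalDate.now();
    }
}

// Test-only subclass that pins "today" to a fixed date.
class FixedDateOverdueChecker extends OverdueChecker {
    private final LocalDate fixedToday;

    FixedDateOverdueChecker(LocalDate fixedToday) {
        this.fixedToday = fixedToday;
    }

    @Override
    protected LocalDate today() {
        return fixedToday;
    }
}
```

With the date pinned, `new FixedDateOverdueChecker(LocalDate.of(2010, 6, 15)).isOverdue(LocalDate.of(2010, 6, 14))` can be asserted to be true regardless of when the test runs.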

Update: For hints on how to take over legacy code, you may find this earlier answer of mine useful.

As @Alex noted, unit tests are also very useful to understand and document the actual behaviour of the code. This is especially useful when documentation about the system is nonexistent or outdated.

Péter Török
Test it to understand its behavior. Not only will it give you confidence to make changes, but other programs may be depending on BUGS in the current system!
Alex Feinman
A: 

Working Effectively With Legacy Code might be helpful.

duffymo
I found that another book, Refactoring to Patterns, is wonderful as well.
liangteh
+2  A: 

Keep in mind that this legacy system, with all its spaghetti code, currently works. Don't go changing things just because they don't look as pretty as they should. Focus on stability, new features and familiarity before ripping old code out left, right and centre.

Matt Jacobsen
A: 

Design issues are very difficult to catch. The first place to start is understanding the design of the application. I find it useful to diagram it using either UML or a process flow diagram; anything works that communicates the design and workings of the application.

From there I go into more detail and ask myself, "Would I have done it this way?" and "What other options are there?" It is easy to see code debt, i.e. the debt we incur by making bad choices, as always bad, but sometimes other factors are involved, such as budget, time and availability of resources. There you have to ask whether it is worth refactoring a working but badly designed application.

If there are many upcoming new features, changes, bug fixes, etc I would say it is good to refactor, but if the application rarely changes and is stable, then maybe leaving it as is is a better approach.

Another side point: if the code is used by another application as a service or module, then refactoring might first mean creating a stub around the code that serves as the interface. Once that interface is clearly defined and has unit tests to prove it works, you can choose any technology to fill in the details.
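A rough sketch of that wrapping step, with hypothetical names throughout: callers depend on a small interface, an adapter delegates to the legacy code, and a contract check is written against the interface so it can be reused for any later implementation.

```java
// Hypothetical legacy code that other applications currently call directly.
class LegacyTaxModule {
    static int calc(int amountInCents, String regionCode) {
        return "EU".equals(regionCode) ? amountInCents / 5 : amountInCents / 10;
    }
}

// The boundary that callers should depend on from now on.
interface TaxCalculator {
    int taxInCents(int amountInCents, String regionCode);
}

// Adapter that fills in the interface with the existing implementation.
class LegacyTaxCalculator implements TaxCalculator {
    @Override
    public int taxInCents(int amountInCents, String regionCode) {
        return LegacyTaxModule.calc(amountInCents, regionCode);
    }
}

// Contract check written against the interface, not the legacy class.
// The expected values are assumed to be recorded from the current behaviour.
class TaxCalculatorContract {
    static boolean holdsFor(TaxCalculator calculator) {
        return calculator.taxInCents(1000, "EU") == 200
                && calculator.taxInCents(1000, "US") == 100
                && calculator.taxInCents(0, "EU") == 0;
    }
}
```

Because the check only sees `TaxCalculator`, a rewritten implementation can be dropped in later and verified with the same call.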

mrjohn
+13  A: 

First, don't fix what isn't broken. As long as the system you are to take over works, leave functionality alone.

The system is obviously broken when it comes to maintainability, however, so that is what you tackle. As mentioned above, write some tests first, get the source backed up in a version control system, and THEN start cleaning up small pieces first, then the larger ones, and so on. Do NOT attack the bigger architectural issues until you have gained a good understanding of how the system works. Tools won't help you as long as you don't dive into the code yourself, but when you do, they help a lot.

Remember, nothing is "perfect". Don't over-engineer. Obey the KISS and YAGNI principles.

EDIT: Added direct link to YAGNI article

lbruder
+1  A: 

Firstly, let me say that Working Effectively with Legacy Code is probably a really good book to read, judging by three answers within a minute of each other.

  1. bad database table design

This one, you are probably stuck with. If you try to change an existing database design you are probably committing yourself to redesigning the whole system and writing migration tools for the existing data. Leave well alone.

JeremyP
Well, I think bad code is easier to change than a bad database design. Migrating the existing records into new tables would be a nightmare. Most probably, I won't touch it.
liangteh
I've successfully applied refactorings to code and database at the same time using data migration scripts. Not easy, but it can be done, especially if you write some good database integrity tests.
mikera
Oh, I'm not saying it *can't* be done, just that it is likely to be very hard.
JeremyP
BTW I just bought the book mentioned on Kindle. I think three people recommending it simultaneously means it is probably worth a read.
JeremyP
+8  A: 

Your issue #7 is by far the most important. As long as you have no idea how the system is supposed to behave, all technical considerations are secondary. Everyone is suggesting unit tests - but how can you write a useful test if you can't distinguish between wanted and unwanted behaviour?

So before you start touching the code, you have to understand the system from the user's point of view: talk to users, observe them using the system, write documentation on the use case level.

Yes, I am seriously suggesting that you spend days, more likely weeks, without changing a single line of code. Because right now, any change you make is likely to break things without you realizing it.

Once you understand the app, you'll at least know which functionality is important to test (manually or automated).

Michael Borgwardt
Well, time is mainly spent on resolving issues rather than writing documents. I have taken steps to use static code analysis tools such as PMD and FindBugs to highlight the code issues.
liangteh
@liangteh: well, my point is that with no requirements documentation at all, worrying about technical issues is like being on a ship in the middle of the ocean and worrying about the fuel efficiency of your engine when you have no idea where you are, no maps, and no means of navigation.
Michael Borgwardt
I'd also investigate how other systems integrate and interact with this one as part of the system documentation, since, as someone mentioned in another comment, other programs may even depend on certain BUGS in your program. IMHO system integration quirks can often cause catastrophic problems, compared to something a human user might have noticed behaving strangely (i.e. something you could notice during testing).
Jake
+5  A: 

Focus on stability first. You can't enhance or refactor until you have some kind of stable environment in-place around the application.

Some thoughts:

  1. Revision control. You've made a start by setting up Subversion. Now make sure that your database schemas, stored procedures, scripts, third-party components, etc. are under revision control too. Have a version labelling system, make sure you label versions, and make sure you can accurately retrieve old versions in the future.
  2. Build and release. Have a way to build stable releases on a machine other than your dev machine. You may want to use ant/nant, make, msbuild, or even a batch file or shell script. You may need deployment scripts / installers too if they don't exist.
  3. Get it under test. Do not change the app until you have a way to know whether your change has broken it. For this you need tests. You should hopefully be able to write xunit unit tests for some of the simpler, stand-alone classes, but try to build some system/integration tests that exercise the application as a whole. Without high code coverage (which you won't have to begin with) integration tests are your best bet. Get into the habit of running the tests as often as possible. Take every opportunity to extend them.
  4. Make small, focussed changes. Try to identify systems/subsystems within the application, and improve the boundaries between them. This reduces the knock-on effects of changes you may make. Beware the temptation to "pretty-up" the code by reformatting it or imposing the latest fashionable design pattern. Turning-around a system like this takes time.
  5. Documentation. It's necessary, but don't worry too much about it. System documentation is rarely used in my experience. Good tests are usually better than good documentation. Concentrate on documenting the interfaces between the application and the system context that it runs in (inputs, outputs, file structures, db schemas, etc).
  6. Manage expectations. If it's in bad shape then it will probably resist your efforts to make changes, and timescales may be harder than usual to estimate. Make sure management and stakeholders understand that.

At all costs, beware the temptation to just rewrite the whole thing. It's almost never the right thing to do in this situation. If it works, concentrate on keeping it working.

As a junior developer, don't be afraid to ask for help. As others have said, Working Effectively With Legacy Code is a good book to read, as is Martin Fowler's Refactoring.

Good luck!

Andy Johnson
The system is quite old, working with EJB 2.0 and some DAO objects. Sigh.
liangteh
I've been there too, and I feel your pain. Try to treat taking over a system as an opportunity. It's (presumably) in use and performing some business function. Really learn it and become the design authority. You'll be noticed.
Andy Johnson
I have taken over a couple of bad projects, but this is the worst to date.
liangteh
+1  A: 

My standard answer to this question is: Refactor the Low-hanging Fruit. In this case, I'd be inclined to take one of the 10K-line classes and seek out opportunities to Sprout Class, but that's just my own proclivity; you might be more comfortable changing other things first (setting up source control was an excellent first step!) Test what you can; refactor what can't be tested, take a step at a time, and make it better.
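As a rough illustration of Sprout Class: rather than adding new logic to the huge class, the new behaviour goes into a small class of its own, and the big class gains only a one-line call. All names below are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// The sprouted class: small, free of dependencies, and easy to test on its own.
class DuplicateOrderFilter {
    List<String> withoutDuplicates(List<String> orderIds) {
        List<String> unique = new ArrayList<>();
        for (String id : orderIds) {
            if (!unique.contains(id)) {
                unique.add(id);
            }
        }
        return unique;
    }
}

// Stand-in for a 10K-line class; only the marked line is new.
class OrderProcessor {
    int process(List<String> orderIds) {
        List<String> toProcess =
                new DuplicateOrderFilter().withoutDuplicates(orderIds); // new line
        int processed = 0;
        for (String id : toProcess) {
            // ... thousands of lines of existing, untested logic would sit here ...
            processed++;
        }
        return processed;
    }
}
```

The existing class stays as untested as before, but everything new arrives with tests, and the big class stops growing.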

Keep in mind as you progress how much better you are making things; if you concentrate only on how bad things still are, you're likely to become discouraged.

Carl Manaster
You are spot on. With all the ongoing issues identified, I am already discouraged. However, I will bear in mind that things are going to get better as the reported issues get resolved one by one over time.
liangteh
A: 

As others have noted, don't change something that works just to make it prettier. The risk that you will introduce errors is great.

My philosophy is: As I have to make changes to satisfy new requirements or to fix reported bugs, I try to make the piece of code that I have to change a little cleaner. I'm going to have to test the changed code anyway, so now is a good time to do a little clean-up at small additional cost.

Fundamental design changes are the toughest and must be saved for occasions where you have to make a big enough change that you would be testing all the changed code anyway.

Changing bad database design is hardest of all because the poorly designed tables are likely used by many programs. Any change to the database requires changing every program that reads or writes it. The best way to accomplish this is usually to try to reduce the number of places that access any given part of the database. To take a simple example: Suppose there are 20 places that read through customer records and calculate the customer account balance. Replace this with one function that reads the database and returns the total, and twenty calls to that function. Now you can change the schema for the customer records and there is only one piece of code to change instead of 20. The principle is simple enough, but in practice it is unlikely that every function that accesses a given record is doing the same thing. Even if the original programmer was clumsy enough to write the same code 20 times (not unlikely -- I've seen plenty of that), the real situation is probably not that he wrote 1 function 20 times, period, but that he wrote function A 20 times, function B 12 times, function C 4 times, etc.
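The customer-balance example might be sketched as follows. `CustomerRecordSource` and the in-memory implementation are hypothetical stand-ins for the real tables; in the actual system, the single implementation would hold the one remaining query.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hides how customer transactions are stored. Hypothetical name.
interface CustomerRecordSource {
    List<Long> transactionAmountsInCents(int customerId);
}

// The single place that knows how a balance is derived from stored records.
// The twenty former copies of this loop become twenty calls to balanceInCents().
class CustomerAccounts {
    private final CustomerRecordSource source;

    CustomerAccounts(CustomerRecordSource source) {
        this.source = source;
    }

    long balanceInCents(int customerId) {
        long balance = 0;
        for (long amount : source.transactionAmountsInCents(customerId)) {
            balance += amount;
        }
        return balance;
    }
}

// In-memory stand-in so the sketch runs without a database.
class InMemoryRecordSource implements CustomerRecordSource {
    @Override
    public List<Long> transactionAmountsInCents(int customerId) {
        if (customerId == 1) {
            return Arrays.asList(5000L, -1250L, 300L);
        }
        return Collections.emptyList();
    }
}
```

If the schema for customer records changes, only the real `CustomerRecordSource` implementation needs to follow; callers of `balanceInCents` are untouched.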

Jay
A: 

Try to create some unit tests first that can trigger some actions in your code.

Commit everything in SVN and TAG it (in case something goes bad, you'll have an escape pod).

Use the inCode Eclipse plugin http://www.intooitus.com/inCode.html and look at what refactorings it proposes. Check whether the proposed refactorings seem OK for your problem. Try to understand them.

Retest with the units created before.

Now you can use FindBugs and/or PMD to check for other subtle issues.

If everything is OK, you might want to check in again.

I'd also try reading the source in order to detect some cases where patterns can be applied.

Daniel Voina
A: 

I like this discussion.

Meteor
then vote it up and ADD to the discussion. "I like this" is for facebook.
WernerCD