ansaurus

Question

Advice for dealing with code maintenance

Answer 1

+3 A:

In my opinion the best thing you can do is tell the professor that the code is a mess, and will not work as expected unless some parts are rewritten. Suggest a small code refactoring, rewriting only the parts that need it (not the entire program).

Not telling this, can give you more problems than you already have :)

Matias Valdenegro 2010-08-05 18:47:00

Answer 2

+15 A:

Refactor the code, one change at a time, until it makes sense to you. The most important thing is not the exact coding strategy, but that you understand what it's doing and why.

You're probably going to end up touching every line of code, so don't try to do things in any particular order - fix the first thing that jumps out at you as wrong first, then go on to the next one, and so on. Exception: if the previous person didn't use a consistent code formatting strategy, run the whole thing through an autoindenter as your first action.

Write a test suite as you go.

Zack 2010-08-05 18:47:29

Thanks Zack for giving me confidence to do this in one of my own projects. +1 in particular for "The most important thing is not the exact coding strategy, but that you understand what it's doing and why"

fearoffours 2010-08-06 07:20:21

As a code indenter: Artistic Style is free and quite easy to use: http://astyle.sourceforge.net/

Matthieu M. 2010-08-06 08:31:53

Answer 3

+22 A:

First, bad code is bad code. Python, Java, Matlab or anything else garbage collected doesn't == good code. You could just as easily be spending your time debugging bad Python code.

With that said, be absolutely upfront with the professor. Tell her the code is bad and show her some examples. The worst thing you can do is to try to hide the problem. If you do that, the problem is not only sure to land in your lap, but it's also sure to be blamed on you.

Figure out what the best solution is and propose it to her. It might be fixing that code, enlisting the help of the computer science department, or rewriting it, either in C++ or another language. Chances are good that she doesn't have any other options and will gladly accept whatever solution you propose.

Wade Williams 2010-08-05 18:49:30

And I agree with Zack - add unit tests to the code as you change it so you know your changes produce the desired result and will continue to do so.

Wade Williams 2010-08-05 18:50:40

Thanks guys for the advice, I'm emailing the professor right now (and trying to be gentle about it). I have to say though, it's much easier to write horribly buggy code in C++ as a beginner than any other language... I'll update this question soon

Xzhsh 2010-08-05 19:02:12

@Xzhsh: C++ is a very expressive and powerful language, enabling you to do, rapidly and accurately, things that really, really should never be done.

David Thornley 2010-08-05 19:44:00

I disagree. Professors typically don't care if code is bad or good only whether it works or not. Step one, just fix up the memory leaks. Step two, redesign the code from scratch. It is only 5000 lines you should be able to do that in a weekend.

ceretullis 2010-08-05 19:51:05

@David, Haha, agreed. And I'm fairly sure that the professor is just going to want stuff to work since she's really not even a CS professor, and I'm just a undergrad doing this for a summer job, but I guess it never hurts to try. I'm waiting on a response right now

Xzhsh 2010-08-05 19:57:45

Well updated question. Issue is resolved, although I'm not sure in the best way haha. Thanks for the advice guys

Xzhsh 2010-08-05 20:55:55

Answer 4

+1 A:

As a stop gap, you could try using some sort of garbage-collected memory management implmementation like dlmalloc to see if that allows you to temporarily get over the out of memory hurdle.

In the longer run, you will have to sort out the memory management mess - there is simply no way around that. From your description, you'll probably also have to address the object design issues, but you can probably get away with doing some memory management cleanup first.

Here's the way I'd approach this:

Find all places in the code that call malloc(). Have a very good look at them and try to replace them with calls to new so you're only having to deal with a single type of memory management, which will make it easier to go to the next step.
Find all the places that new is called and see if you can replace the raw pointers that the results of new are assigned to with boost::shared_ptr<>. This should at least give you some sort of poor man's garbage collection; of course you should also change all the function prototypes of the functions receiving these pointers to take shared_ptr<> instead of raw pointers, otherwise you've just dressed up the mess and have not really improved anything. I'd actually argue that you've changed things for worse...

Once you've addressed the immediate memory management issues, you can start refactoring the objects and improve the design. I wouldn't start by fixing the design first, you're better off getting the software working properly, gaining more understanding of its workings and then sort out the design, otherwise you're likely to replace one mess with another one.

Timo Geusch 2010-08-05 18:50:38

Answer 5

+1 A:

Sounds like a complete mess. A refactoring is the least you'll need to do. The mixture of new's and malloc's is a recipe for disaster!

I think you'll have to probably rewrite almost the whole thing. Luckily 5000 SLOC isn't huge and shouldn't take too long.

Sounds like a good exercise!

Starkey 2010-08-05 18:54:08

Answer 6

+6 A:

I would suggest following some of the steps that Michael Feathers outlines in his 'Working Effectively with Legacy Code' book. Namely: Get the code under test as soon as possible.

Having unit tests that exercise the functionality will give you something you can freely refactor without fear.

I know, however, that it's much easier to say this, than to actually DO it. Read this chapter on Seams (parts of code you can override/hook to allow you to get code under test easier): http://www.informit.com/articles/article.aspx?p=359417&seqNum=3 See also this on CppUnit and CppUnitLite, a framework for unit testing in C++: http://c2.com/cgi/wiki?CppUnitLite

Run the code through a memory profiler. See this SO link: http://stackoverflow.com/questions/818673/memory-profiler-for-c This will help you track down the places you need to start putting delete/free statements. I would also start trying to make the memory allocation at least consistent in the API it uses.

Kilanash 2010-08-05 18:59:00

Answer 7

+10 A:

Be honest. Tell your professor immediately that the code is crap and why so. If what you listed in your question is true, the code is crap indeed.

How much time do you have at hand? Do you understand the implemented algorithm(s)?
5kLoC isn't all that much, especially when it's all low-level fiddling. When you know the underlying algorithms and have enough time at hand, re-writing it to 2k lines of easily understandable code might be better than trying to fix it.

sbi 2010-08-05 18:59:06

I agree with the rewriting approach. When code is that bad, it may be easier to keep the original as is and rewrite a functionally equivalent piece of code. Makes for easier testing too since you can check the consistency of the results between the two versions.

Matthieu M. 2010-08-06 08:33:07

Answer 8

+1 A:

Explain to the professor the issue that you see. And never present a problem without a solution. Keep that in mind as you write up your report along with your suggestions. You can offer multiple possibilities and let them choose the one that they wish to take. And at that point you will have done your job -- time for them to do theirs!

NinjaCat 2010-08-05 19:04:09

Answer 9

+6 A:

5000 LoC isn't too bad. Count yourself lucky.

Since it sounds like memory management is one of the biggest issues, I would start there.

Replace every use of malloc with new.
Fix compiler warnings.
Replace arrays with vectors (or more appropriate data structures).
Replace raw pointers with stack variables/references when possible, and smart pointers when not. Don't worry about major architecture changes or a 100% conversion -- focus on the low hanging fruit and clean up some of the major memory leaks.
Begin rearchitecting the application.
1. Pick a discrete behavior/task in the application.
2. Figure out roughly how it does work.
3. Figure out how you want it to work (concentrating on the interface).
4. Develop a transition plan.
5. Execute your plan.
6. Repeat.

Steve S 2010-08-05 19:09:34

Answer 10

+1 A:

Many good suggestions on how to deal with the code itself have been given. It sounds to me like one of the underlying problems may be that this was (understandably) written for a non-computer class by a non-programmer. You might suggest to your prof that in the future she get someone in the programmer stream classes (in cooperation with a cs prof) to do the actual implementation in these cases.

Now, of course this has an upside and a down side for everyone involved. The end user (your prof) has to be able to create a spec or at least articulate her needs. She may need to be convinced that there is value in taking the time away from what she would rather be doing in order to do this. The implementation team has to be able to work to that spec and communicate with the end user(s). The implementation team will ensure that the code is reasonably well written if they are in fact a team rather than an individual.

This is all excellent preparation for the 'real world' as well as an opportunity for interdisciplinary cooperation, with those who understand the problem domain working with those who understand the tools.

It offers the cs student(s) a 'real' rather than a contrived exercise, and should generate more enthusiasm on their part. (Of course, depending on departmental rivalry and small-p politics, it could also be a boondoggle in the making or a non-starter, but hey, I'm an optimist...)

mickeyf 2010-08-05 19:36:56

This doesn't answer the question... knowing how it should have been done right, doesn't help that much :-)

mxp 2010-08-05 21:18:20

Absolutely correct. It does not answer the question, and I apologize. However, I think being able to learn from the mistakes of ourselves or others and to see beyond the immediate problem does help. Simply answering it (as many have done better than I could) might have solved it once, 'knowing how to do it right' (if that's what my humble suggestion is ;>) might prevent it from recurring multiple times in the future. It's still not too late to make it a cross department team effort that could benefit a lot of people, rather than a painful solo effort that helps only the original end user.

mickeyf 2010-08-05 23:24:18

Answer 11

A:

Keys to maintenance:

SCM. On Windows I would recommend something like Mercurial. Make central blessed repository where from others can get the software and where to fixes get committed from developer private repositories.
Issue/bug tracking. If you do not have one yet, get it. That should be an easy sell to the prof. Can't give any recommendation for Windows. Friends are using Mantis.
Release management. Often part of the issue tracking system. But the process of making a release itself is more of the organizational and political, not technical, issue. Essentially it is about giving public name to a certain internal versions of the software. In real life it gets highly political when it is being decided what actually gets into the release.

5K LoC isn't much. Having SCM helps to track the changes and go back in case of regressions. Having issue tracking help the team to collaborate on problems. Make minor releases when you fix problems (e.g. your malloc vs. new vs. memory leaks). Make major releases for the bigger fixes and features (e.g. support for larger data sets).

Important is to not to rush and start with small changes. Modern SCM allow for lots of small commits (ditto issue tracking with issue dependency trees) and one should use that. That allows to track how behavior of the software changes and what change precisely introduced the regression. Quite often it is the most innocent looking change.

Dummy00001 2010-08-05 21:26:59

Answer 12

+1 A:

There's nothing to be gained by hiding the problem from the prof. Yes, complaining that you were given a load of junk to work on could sound like whining, but hiding the fact and trying to fix it on the side will make it look like you are at best a slow worker and at worst incompetent. I think the first thing to do is to try to politely tell the prof that there are a lot of problems with this program that will need to be cleaned up and give examples ... does the prof know anything about programming? I don't think that was spelled out in the question ... but one way or another explain the problem and say what you think it will take to fix it.

That said, the next big question that come to my mind is, Are the poor memory management and the bad type conversions the only big issues, or are those just examples you picked from a hundred serious problems? That is, is the basic structure and logic of the program basically sound, or is the whole thing a mess? (I'd guess it's all a mess, but obviously I haven't seen the program.)

The key decision to be made here is, Do you (a) Go through the program and fix a bunch of detail errors one at a time? (b) Do a serious restructuring and cleanup? Or (c) Throw it away and rewrite from scratch. I'm often tempted to choice (c) in cases like this because it seems like rewriting would be less work than cleaning up the mess, but it is likely that there are many many detailed logic decisions embedded in the existing code. Unless you have detailed, up-to-date requirements documentation -- which is such an unlikely possibility that I will dismiss it instantly amidst gales of laughter at the very thought -- the only way to know all the rules is to read the existing code and figure out what it's up to. At which point (a) or (b) become much more efficient than (c).

Wish I could say "just do X", unfortunately with the amount of information that you can reasonably put into a post, I think we're all limited to saying "Here are some plausible options to consider ..."

Jay 2010-08-05 22:34:07

Answer 13

A:

This problem could easily be fixed by adding more RAM. 1-2GB should do the trick.

moreram 2010-08-06 14:43:46

Answer 14

A:

Replace every:

Type* data = (Type*)malloc(123*sizeof(Type));

with

std::vector<Type> data(123);

All memory management removed! Easy as pie. Then just pass around the vectors by value.

Inverse 2010-08-09 02:18:40

Ah, if only this sort of problem were that simple ;-)

Zack 2010-08-10 21:42:04

ansaurus

tags:

views:

answers:

Advice for dealing with code maintenance

related questions