views:

204

answers:

9

I am a novice programmer, and as part of my project I have to modify an open source tool (written in Java) that has hundreds of classes. I have to modify a significant part of it to suit the needs of the project. I have been struggling with it for the past month, trying to read the code, work out what each class does, and figure out the pipeline from start to end.

80% of the classes have incomplete or missing documentation. The remaining 20% are those that form the general-purpose API for the tool. One month of code reading has only helped me understand the basic architecture; I have not been able to figure out the exact changes I need to make for my project. At one point I started modifying a part of the code and soon made so many changes that I could no longer keep track of them.

A friend suggested that I try writing down the class hierarchy. Is there a better (standard?) way to do this?

A: 

The only way to understand code is to read it. Keep working at it; that is my advice.

Some projects are better documented than others. Here are a few that I know are well organized: Tomcat, Jetty, Hudson.

You should check java-source for more open source projects.

Dani Cricco
+9  A: 
  • check the code into a source code repository (Subversion, CVS, Git, Mercurial...)
  • make sure that you can build the project from source and run it
  • if you already have an application that uses this open source tool, try replacing the binary dependency with a project dependency in Eclipse or any other IDE. Then run your code and step through the parts that you want to understand
  • commit after every small change
  • if you have different ideas, branch the code
Boris Pavlović
+1 for version control. Also, don't use CVS. Just don't.
Justin K
+1  A: 

My friend, you are in deep doodoo. Modifying large, badly documented legacy code is one of those projects that make experienced programmers seriously contemplate the joys of selling insurance, or some other alternative career. However, it isn't impossible, and here are some tips that I hope will help.

Your first task is to understand the code as much as possible. You are at least on the right track there. Getting a good idea of the class structure is absolutely important, and a diagram is probably the best way. The other thing I would suggest is that when you find out what a class does, add the missing documentation yourself. That way, when you come back to it, you won't have forgotten what you found out.

Don't forget the debugger. If you want to find out what is really going on, stepping through the relevant code, or simply finding out what a call stack really looks like at a certain point can be very helpful.
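When attaching a debugger is awkward, a temporary probe can show the same call-stack information. The following is a minimal sketch: `StackProbe` and `Pipeline` are hypothetical names, not classes from the tool in question, and the probe should be removed once the question is answered.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for temporary use; not part of the tool being modified.
final class StackProbe {
    private StackProbe() {}

    /** Returns the current call stack as "Class.method:line" strings, nearest caller first. */
    static List<String> callers() {
        StackTraceElement[] frames = new Throwable().getStackTrace();
        List<String> out = new ArrayList<>();
        // frames[0] is callers() itself, so start at index 1.
        for (int i = 1; i < frames.length; i++) {
            StackTraceElement f = frames[i];
            out.add(f.getClassName() + "." + f.getMethodName() + ":" + f.getLineNumber());
        }
        return out;
    }
}

// Hypothetical stand-in for a processing pipeline inside the tool.
class Pipeline {
    List<String> run() {
        return parse();
    }

    private List<String> parse() {
        // Drop a probe into the method whose callers you want to see.
        return StackProbe.callers();
    }
}
```

Calling `new Pipeline().run().forEach(System.out::println)` prints the chain of callers leading to `parse()`, which is the same list a debugger shows in its stack view at a breakpoint.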

DJClayworth
And when the documentation is done, consider contributing it back to the original project.
JeremyP
+1  A: 

Personally I think it is very difficult to try to understand an entire application all at once. Instead, try to focus only on certain modules. For example, if you can identify a module that you need to change (e.g. based on a screen, or certain input/output point), then start by making one small change and testing it. Go from there, making a small change, testing, and moving on.

Additionally, if your project has unit tests (consider yourself lucky), review the unit tests of the module you are focusing on. That will help you get an idea of what the module is expected to do.

MikeG
A: 

I think that digging into the whole project isn't a good approach. Most open source projects are complex, with thousands of lines of code, and you can't expect to understand the general architecture all at once.

An approach I like to use is to start very small and extend my knowledge of the overall project step by step.

Try starting with a small, even minor, modification to the code base: a little improvement in a specific class that seems easy to understand at first. Then try a bigger modification, extending your knowledge of the tool a little further. Step by step, you should be able to draw the whole picture of the project.

This approach can take a lot of time, but in the end, you can hope to understand all vital parts of the application.

Vivien Barousse
+2  A: 

Two things that Eclipse (and other IDEs as well) offer to 'fight' this. I've used them on very large projects:

  • Call hierarchy - right-click a method and choose "call hierarchy", or use CTRL + ALT + H. This gives you all methods that call the selected method, with the option to check further down the tree. This feature is really very useful.

  • Type hierarchy - see the inheritance hierarchy of classes. In Eclipse it's F4 or CTRL + T.
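
If you want the type hierarchy as text (for notes, or to write down the class hierarchy as the question mentions), reflection gives a rough equivalent. This is only a sketch: `TypeHierarchy` is a hypothetical helper, and it walks up the superclass chain of one class rather than listing subclasses the way the IDE view can.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; not part of Eclipse or of the tool being studied.
final class TypeHierarchy {
    private TypeHierarchy() {}

    /**
     * Lists the superclass chain of a class, starting with the class itself.
     * Each entry also names the interfaces that class declares directly.
     */
    static List<String> describe(Class<?> type) {
        List<String> lines = new ArrayList<>();
        for (Class<?> c = type; c != null; c = c.getSuperclass()) {
            StringBuilder line = new StringBuilder(c.getName());
            Class<?>[] interfaces = c.getInterfaces();
            for (int i = 0; i < interfaces.length; i++) {
                line.append(i == 0 ? " implements " : ", ").append(interfaces[i].getName());
            }
            lines.add(line.toString());
        }
        return lines;
    }
}
```

For example, `TypeHierarchy.describe(java.util.ArrayList.class).forEach(System.out::println)` prints one line per class from `java.util.ArrayList` up to `java.lang.Object`.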

Also:

  • find a way to make changes take effect on save, so that you don't have to redeploy
  • use a debugger - run in debug mode, within the IDE, so that you see how the flow proceeds
Bozho
A: 

In my opinion, there is no standard approach to understanding a project. It depends on many factors, from the understandability of the code and architecture you're analyzing to your previous experience with large projects.

I suggest you reverse-engineer the code with a modeling tool, so that you can generate some UML models from the existing source code. These diagrams can serve as a graphical guide during your analysis of the code.

Don't be afraid to use debugging to grasp the logic of the most complex functionality in the project. Running the most complex code instruction by instruction, and seeing the exact values of the variables and the interactions between the objects, can be helpful.

Before you refactor to change the project to suit your needs, be sure to write some test cases, so that you can verify that your modifications don't break the code in unexpected ways.

frm
A: 

Here are a couple of recommendations:

  • Get the code into some form of version control system (CVS or similar). This way, if you start making changes, you can always look back at previous versions.
  • Take the time to document what you have already learned/gone through. Javadoc is fine for this.
  • Create a UML structure for your code. There are lots of plugins out there that will give you a nice representation of your code layout.
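
As an illustration of the Javadoc point, here is a sketch of what recording your findings might look like. `Tokenizer` and the notes about it are invented for the example; they do not describe any class in the actual tool.

```java
/**
 * Splits one line of raw input into tokens.
 *
 * <p>Notes from reading the code (hypothetical example): called once per
 * input line by the parsing stage; the returned array is never null.
 */
final class Tokenizer {

    /**
     * @param line raw input line; must not be null
     * @return the whitespace-separated tokens, or an empty array for a blank line
     */
    String[] tokenize(String line) {
        String trimmed = line.trim();
        return trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
    }
}
```

Writing down who calls a class and what it guarantees, even tentatively, means the next reading session starts from your notes rather than from scratch.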
Sean
+7  A: 

There's a great book called Working Effectively with Legacy Code, by Michael Feathers. There's a shorter article version here.

One of his points is that the best thing you can do is write unit tests for the existing code. This helps you understand where the entry points are and how the code should work. Then it lets you refactor it without worrying that you're going to break it.
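
A test of this kind is often called a characterization test: it pins down what the code does today, whether or not that is what it should do. The sketch below uses plain Java so it runs without a test framework; in a real project you would likely use JUnit. `LegacyPriceFormatter` is a made-up stand-in for an undocumented class, not code from the tool or from the book.

```java
// Hypothetical legacy class standing in for undocumented code.
class LegacyPriceFormatter {
    String format(int cents) {
        int abs = Math.abs(cents);
        int rem = abs % 100;
        return (cents < 0 ? "-" : "") + (abs / 100) + "." + (rem < 10 ? "0" : "") + rem;
    }
}

// Records the current behavior so later refactoring can be checked against it.
final class LegacyPriceFormatterCharacterization {
    private LegacyPriceFormatterCharacterization() {}

    static void check(String expected, String actual) {
        if (!expected.equals(actual)) {
            throw new AssertionError("expected <" + expected + "> but was <" + actual + ">");
        }
    }

    static void run() {
        LegacyPriceFormatter formatter = new LegacyPriceFormatter();
        // Expected values were obtained by running the existing code and observing it.
        check("12.05", formatter.format(1205));
        check("0.00", formatter.format(0));
        check("-0.50", formatter.format(-50));
    }
}
```

Run `LegacyPriceFormatterCharacterization.run()` before and after each change; if it starts throwing, the change altered existing behavior.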

From the article linked, the summary of his strategy:

1. Identify change points
2. Find an inflection point
3. Cover the inflection point
   a. Break external dependencies
   b. Break internal dependencies
   c. Write tests
4. Make changes
5. Refactor the covered code.
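
Step 3a can be sketched as follows. This shows one common way to break an external dependency (extract an interface and pass it in through the constructor); it is an assumption about how the step might look in practice, not a quotation from the article, and all class names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// The external dependency (file, network, database) hidden behind an interface.
interface ReportSink {
    void write(String line);
}

// The code you need to change. Suppose it used to create its own file writer
// internally; now the dependency is handed in, so a test can substitute a fake.
class ReportGenerator {
    private final ReportSink sink;

    ReportGenerator(ReportSink sink) {
        this.sink = sink;
    }

    void generate(List<String> items) {
        sink.write("items=" + items.size());
        for (String item : items) {
            sink.write("- " + item);
        }
    }
}

// In-memory fake used only by tests; it records what would have been written.
class RecordingSink implements ReportSink {
    final List<String> lines = new ArrayList<>();

    @Override
    public void write(String line) {
        lines.add(line);
    }
}
```

A test can now construct `new ReportGenerator(new RecordingSink())`, call `generate`, and inspect the recorded lines without touching the real file system or network.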
JacobM