I'm interested in a specific Java-based project and want to understand how it really works. My main problem is that there are several versions of the documentation, and they are inconsistent with each other. The source code is mostly uncommented and runs to nearly 150 kLOC (according to sloccount). So I have no idea where I should start. What would you recommend? Where do you start reading code when you enter a new project?
I usually read just enough code to find an entry point, set a breakpoint, and start stepping through the code.
Nothing helps me understand a program better than watching it execute its logic step by step and inspecting the variable values along the way.
Remember: The Debugger is a developer's best friend!
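If it helps with the "find an entry point" step, here is a rough sketch that walks a source tree and lists the .java files that appear to declare a main method. The class name EntryPointFinder and the regex heuristic are my own assumptions, not part of any tool; it will miss entry points that are not main methods (servlets, framework callbacks and so on), so treat the result as a list of places to try a first breakpoint.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper: lists source files that look like program entry points.
class EntryPointFinder {
    // Heuristic only: matches "static void main(" with arbitrary whitespace.
    private static final Pattern MAIN =
            Pattern.compile("\\bstatic\\s+void\\s+main\\s*\\(");

    /** Returns true if the given source text appears to declare a main method. */
    public static boolean declaresMain(String source) {
        return MAIN.matcher(source).find();
    }

    /** Walks the tree under root and returns the .java files that declare main. */
    public static List<Path> findMainClasses(Path root) {
        try (Stream<Path> files = Files.walk(root)) {
            return files
                    .filter(p -> p.toString().endsWith(".java"))
                    .filter(EntryPointFinder::fileDeclaresMain)
                    .sorted()
                    .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static boolean fileDeclaresMain(Path file) {
        try {
            byte[] bytes = Files.readAllBytes(file);
            return declaresMain(new String(bytes, StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Source root is taken from the first argument; defaults to the current directory.
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        if (!Files.isDirectory(root)) {
            System.err.println("Not a directory: " + root);
            return;
        }
        findMainClasses(root).forEach(System.out::println);
    }
}
```

Run it against the project's source root and set a breakpoint in whichever of the listed classes looks like the real launcher.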
I would first navigate to a page that I find important and then start debugging from there.
I'd suggest generating and printing a UML class diagram of all that code so that the relationships quickly become clear. Then you can start studying the source, starting with the top-level classes.
Most IDEs have built-in UML generators, or they can be added as plugins, e.g. eUML2 for Eclipse. In most IDEs you can also Ctrl+Click or debug your way through the classes and methods.
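If no UML plugin is at hand, plain reflection can give a crude text version of one slice of such a diagram. The sketch below is only an illustration (the class name ClassOutline is made up): it prints the superclass chain of a class that is on the classpath, with the interfaces each level implements directly. It shows inheritance only, not associations between classes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: a text-only outline of one class's inheritance chain.
class ClassOutline {
    /** Returns the superclass chain, starting with the class itself. */
    public static List<String> superclassChain(Class<?> type) {
        List<String> chain = new ArrayList<>();
        for (Class<?> c = type; c != null; c = c.getSuperclass()) {
            chain.add(c.getName());
        }
        return chain;
    }

    /** Prints one line per level, with directly implemented interfaces in brackets. */
    public static void print(Class<?> type) {
        String indent = "";
        for (Class<?> c = type; c != null; c = c.getSuperclass()) {
            StringBuilder line = new StringBuilder(indent).append(c.getName());
            for (Class<?> i : c.getInterfaces()) {
                line.append(" [").append(i.getSimpleName()).append("]");
            }
            System.out.println(line);
            indent += "  ";
        }
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Pass a fully qualified class name; a JDK class is used as the default.
        String name = args.length > 0 ? args[0] : "java.util.ArrayList";
        print(Class.forName(name));
    }
}
```

With the project's jars on the classpath, pass the fully qualified name of one of its top-level classes instead of the default.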
Start with the unit tests, if they exist. Testing is a good way to document what the code does.
If tests don't exist, you are kind of screwed. I would run the code locally (if possible/applicable) and, in a debugger, try to trace the execution paths vertically. So if you are in a web app, put a breakpoint somewhere in your request handler, then step into the methods to see what each path of execution does.
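To illustrate the point about tests as documentation: even without a test framework, a few checks that pin down what a method currently returns can record what you learned while stepping through it. This is only a sketch; slug() is a made-up stand-in for some undocumented method in the project, and the check() helper is mine.

```java
import java.util.Locale;

// Hypothetical example: checks that record the observed behavior of a method.
class SlugBehavior {
    // Stand-in for an undocumented method found in the code base (an assumption).
    public static String slug(String title) {
        return title.trim().toLowerCase(Locale.ROOT).replaceAll("[^a-z0-9]+", "-");
    }

    private static void check(String expected, String actual) {
        if (!expected.equals(actual)) {
            throw new AssertionError("expected <" + expected + "> but got <" + actual + ">");
        }
    }

    public static void main(String[] args) {
        // Each check documents one observed fact about the method.
        check("hello-world", slug("  Hello, World "));
        // Surprising but true: trailing punctuation leaves a trailing dash.
        check("c-", slug("C++"));
        System.out.println("all checks passed");
    }
}
```

Once a real test framework is available in the project, the same checks can move into proper unit tests.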
Most of the time I poke around and try to resolve some bugs. Even if the documentation is outdated or inconsistent, keep it next to you; it can be useful.
If the project has a bug report system, try to find a simple bug and fix it. The more you modify the code, the more you'll learn about it.
I've used the Understand tools for this in the past. I only have experience with the C++ one, but it was quite excellent.
Beyond that, the best way to get "in" to a body of code is to try to make a change. Come up with something that you need to do and poke around until you find the right place to make that change. Do that a few times to get a good "feel" for how the code is structured, and you'll come to better understand what's going on.
150,000 lines of mostly uncommented code with dodgy documentation....
Well, once the urge to kill myself subsides, I'd try to break it down into manageable pieces. In a professional setting, I'm typically tasked with fixing/changing/upgrading one thing, so I'll try to just understand how that corner of the code works.
So first, refine your focus. Choose one part of the code that seems particularly interesting, or important, or where the documentation doesn't suck. Try and figure out how that works, by ... well, whatever works for you. Walking through it with a debugger, or lacing it with print statements, or just reading the code and thinking really hard. Whatever. Then once you feel like you have a handle on that, move to an "adjacent" piece and do it again. Repeat until you feel you have some notion of what's going on.
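For the "lacing it with print statements" option, a tiny helper that tags each message with the calling method saves some typing and makes the output easier to follow. This is a minimal sketch under my own naming: Trace and parse() are made up, with parse() standing in for whatever method you are studying.

```java
// Hypothetical helper: print statements that say where they were called from.
class Trace {
    /** Prints the message prefixed with the caller's class, method and line, and returns it. */
    public static String log(String message) {
        // Element 0 is log() itself; element 1 is the method that called log().
        StackTraceElement caller = new Throwable().getStackTrace()[1];
        String line = caller.getClassName() + "." + caller.getMethodName()
                + ":" + caller.getLineNumber() + " " + message;
        System.err.println(line);
        return line;
    }

    // Stand-in for a method in the code under study (an assumption).
    public static int parse(String raw) {
        log("raw='" + raw + "'");
        int value = Integer.parseInt(raw.trim());
        log("value=" + value);
        return value;
    }

    public static void main(String[] args) {
        parse(" 42 ");
    }
}
```

Capturing a stack trace on every call is slow, so this belongs in exploratory sessions and should come out again before the code goes anywhere near production.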
Get in touch with people who would know why the docs are inconsistent. Check the dates on the docs. The fact that it is not commented can actually work in your favour. If the docs are inconsistent then comments could have potentially compounded the problem, that is, they could have been badly maintained comments. Understand the program by first understanding what it is supposed to do, in other words, what the documenters should have written if it were well documented. If nobody knows these answers, start charging by the hour so that project sponsors have an incentive not to let such a situation arise in the future. There's not much you can do about the past mistakes of sponsors.