I'm interested in a specific Java-based project and want to understand how it really works. My main problem is that there are several versions of the documentation, and they are inconsistent with each other. The source code is mostly uncommented and has nearly 150 kLOC (according to sloccount). So I have no idea where I should start. What would you recommend? Where do you start reading code when you enter a new project?

+6  A: 

I usually read just enough code to find an entry point, set a breakpoint, and start stepping through the code.

Nothing helps me understand a program better than watching it execute its logic step by step and inspecting the variable values along the way.

Remember: The Debugger is a developer's best friend!
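If the entry points are not obvious in a codebase that size, a throwaway scanner can list the candidates to set a first breakpoint in. This is only a rough sketch: the class name `EntryPointFinder` and its method are made up for illustration, and it only catches the conventional `public static void main(` signature. Web apps, frameworks, or an unusual modifier order will need a different search pattern.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class EntryPointFinder {
    // Conventional main signature only; "static public void main" is not matched.
    private static final Pattern MAIN =
            Pattern.compile("public\\s+static\\s+void\\s+main\\s*\\(");

    /** Returns every .java file under root whose text contains a main method. */
    public static List<Path> findEntryPoints(Path root) throws IOException {
        try (Stream<Path> files = Files.walk(root)) {
            return files
                    .filter(Files::isRegularFile)
                    .filter(p -> p.toString().endsWith(".java"))
                    .filter(EntryPointFinder::containsMain)
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    private static boolean containsMain(Path file) {
        try {
            String text = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
            return MAIN.matcher(text).find();
        } catch (IOException e) {
            return false; // unreadable file: skip it rather than abort the scan
        }
    }

    public static void main(String[] args) throws IOException {
        // Scan the directory given on the command line, or the current one.
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        for (Path p : findEntryPoints(root)) {
            System.out.println(p);
        }
    }
}
```

Run it against the source root, pick the class that looks like the real launcher, and put the first breakpoint at the top of its `main`.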

Justin Niessner
You can always "step into" a method to learn more about its inner workings, or "step over" ones you don't care about.
Colin O'Dell
@Justin Niessner: that doesn't work when you want to trace a multi-threaded program like, say, IntelliJ IDEA. There was an article about it lately: someone trying to "figure out" IntelliJ IDEA from the source. Techniques out of the eighties like setting breakpoints and using the debugger to step through the program only worked for... the bootstrapping part ;) Remember: multi-threaded programs are a debugger's worst friend.
NoozNooz42
@NoozNooz42 - You're right, but the OP didn't ask for a be-all and end-all solution. He asked where to start. It may be a technique out of the eighties, but it's still a good **start**.
Justin Niessner
I will of course try your solution. It sounds reasonable. Let's see how far I get. :)
qbi
What debugger would you recommend for Java stuff?
qbi
A: 

I would first navigate to some page that I find important and then start debugging from there.

Guilherme Oenning
+5  A: 

I'd suggest generating and printing a UML class diagram of all that code so that the relationships quickly become clear. Then you can start studying the source, beginning with the top-level classes.

Most IDEs have built-in UML generators, or they can be added as plugins, e.g. eUML2 for Eclipse. In most IDEs you can also Ctrl+Click or debug your way through the classes and methods.
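If no generator is at hand, a few lines of reflection can produce a rough diagram source for the classes you care about. The sketch below is an illustration only: `ClassDiagramSketch` and the nested `Shape`, `Base`, and `Circle` types are hypothetical stand-ins for real project classes, and the output uses PlantUML-style arrows, which is my own choice of notation and not something the tools above require.

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

public class ClassDiagramSketch {
    // Hypothetical stand-ins for classes from the project under study.
    public interface Shape {}
    public static class Base implements Shape {}
    public static class Circle extends Base { Base parent; }

    /** Emits one PlantUML-style relation line per superclass, interface, and object-typed field. */
    public static List<String> relations(Class<?>... classes) {
        List<String> lines = new ArrayList<>();
        for (Class<?> c : classes) {
            Class<?> parent = c.getSuperclass();
            if (parent != null && parent != Object.class) {
                lines.add(parent.getSimpleName() + " <|-- " + c.getSimpleName());
            }
            for (Class<?> iface : c.getInterfaces()) {
                lines.add(iface.getSimpleName() + " <|.. " + c.getSimpleName());
            }
            for (Field f : c.getDeclaredFields()) {
                // Skip primitives and compiler-generated fields; they add noise, not structure.
                if (!f.getType().isPrimitive() && !f.isSynthetic()) {
                    lines.add(c.getSimpleName() + " --> " + f.getType().getSimpleName());
                }
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        System.out.println("@startuml");
        for (String line : relations(Base.class, Circle.class)) {
            System.out.println(line);
        }
        System.out.println("@enduml");
    }
}
```

It only sees the classes you pass in and ignores method signatures and generics, so treat the result as a first map of one package, not a complete diagram.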

BalusC
Either that or make one yourself. +1 for mentioning UML
monksy
+2  A: 

Start with the unit tests, if they exist. Tests are a good way to document what the code does.
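Where tests are missing, you can write small ones that simply record what the code does today. Here is a minimal sketch in plain Java, with no test framework assumed. `normalize` is a hypothetical stand-in for whatever undocumented method you are poking at, and the observation table holds values you would first collect by running the real code.

```java
import java.util.Locale;

public class CharacterizationSketch {
    // Hypothetical stand-in for an undocumented method from the project.
    public static String normalize(String raw) {
        return raw.trim().toLowerCase(Locale.ROOT).replace(' ', '_');
    }

    /** Returns null if the recorded observation still holds, otherwise a description of the difference. */
    public static String check(String input, String recorded) {
        String actual = normalize(input);
        if (actual.equals(recorded)) {
            return null;
        }
        return "normalize(\"" + input + "\") gave \"" + actual
                + "\" but \"" + recorded + "\" was recorded";
    }

    public static void main(String[] args) {
        // Each pair is {input, output observed when the test was written}.
        String[][] observations = {
            {"  Foo Bar ", "foo_bar"},
            {"", ""},
            {"A  B", "a__b"},
        };
        int failures = 0;
        for (String[] o : observations) {
            String problem = check(o[0], o[1]);
            if (problem != null) {
                failures++;
                System.out.println("CHANGED: " + problem);
            }
        }
        System.out.println(failures == 0 ? "all observations hold" : failures + " changed");
    }
}
```

The point is not to judge whether the behavior is right, only to pin it down so you notice when a later change alters it.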

If tests don't exist, you are kind of screwed. I would run the code locally (if possible/applicable) and, in a debugger, try to trace the execution paths vertically. So if you are in a web app, put a breakpoint somewhere in your request handler, then step into the methods to see what each path of execution does.

hvgotcodes
Well, a small portion of the code has unit tests, some has tests like "if it succeeds print Hooray else error". The test coverage seems not so high.
qbi
@qbi, then maybe start writing some unit tests. Or I would go the 'step thru the code with debugger' route.
hvgotcodes
+3  A: 

Most of the time I poke around and try to fix some bugs. Even if the documentation is outdated or inconsistent, keep it next to you; it can still be useful.

If the project has a bug-tracking system, try to find a simple bug and fix it. The more you modify the code, the more you'll learn about it.

Colin Hebert
+1  A: 

I've used the Understand tools for this in the past. I only have experience with the C++ one, but it was quite excellent.

Beyond that, the best way to get "in" to a body of code is to try to make a change. Come up with something that you need to do and poke around until you find the right place to make that change. Do that a few times to get a good "feel" for how the code is structured, and you'll come to better understand what's going on.

dash-tom-bang
I tried PMD (http://pmd.sourceforge.net/), but several rulesets (basic, unused etc.) spit out some 200kB HTML files. So it would be much work to walk through those files.
qbi
Well there's no getting around that fact; there's a lot of stuff in that codebase, so you're going to have to use whatever tools you can come up with to give yourself views into it.
dash-tom-bang
+3  A: 

150,000 lines of mostly uncommented code with dodgy documentation....

Well, once the urge to kill myself subsides, I'd try to break it down into manageable pieces. In a professional setting, I'm typically tasked with fixing/changing/upgrading one thing, so I'll try to just understand how that corner of the code works.

So first, refine your focus. Choose one part of the code that seems particularly interesting, or important, or where the documentation doesn't suck. Try and figure out how that works, by ... well, whatever works for you. Walking through it with a debugger, or lacing it with print statements, or just reading the code and thinking really hard. Whatever. Then once you feel like you have a handle on that, move to an "adjacent" piece and do it again. Repeat until you feel you have some notion of what's going on.
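For the "lacing it with print statements" route, a tiny helper that indents by call depth makes the output much easier to follow than bare `println` calls. This is a rough sketch under my own assumptions: `TraceSketch` and the method names in the usage below are invented, it is not thread-safe, and you have to add the `enter`/`exit` calls by hand.

```java
public class TraceSketch {
    private static int depth = 0;
    private static final StringBuilder LOG = new StringBuilder();

    /** Call at the top of a method you want to follow. */
    public static void enter(String name) {
        record("-> " + name);
        depth++;
    }

    /** Call just before the method returns (a finally block works well). */
    public static void exit(String name) {
        depth--;
        record("<- " + name);
    }

    private static void record(String message) {
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < depth; i++) {
            line.append("  "); // two spaces per nesting level
        }
        line.append(message);
        System.err.println(line);
        LOG.append(line).append('\n');
    }

    /** Everything recorded so far, one line per event. */
    public static String log() {
        return LOG.toString();
    }

    /** Clears the recorded trace and resets the nesting depth. */
    public static void reset() {
        depth = 0;
        LOG.setLength(0);
    }

    public static void main(String[] args) {
        // Hypothetical call sequence standing in for real project methods.
        enter("handleRequest");
        enter("loadUser");
        exit("loadUser");
        exit("handleRequest");
    }
}
```

Once one corner of the code is traced this way, move the calls to the adjacent piece and repeat.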

BlairHippo
+1 focus on the piece(s) you are most interested in.
eglasius
A: 

Get in touch with people who would know why the docs are inconsistent. Check the dates on the docs. The fact that it is not commented can actually work in your favour. If the docs are inconsistent then comments could have potentially compounded the problem, that is, they could have been badly maintained comments. Understand the program by first understanding what it is supposed to do, in other words, what the documenters should have written if it were well documented. If nobody knows these answers, start charging by the hour so that project sponsors have an incentive not to let such a situation arise in the future. There's not much you can do about the past mistakes of sponsors.

broiyan