How to analyze open source code with no (or insufficient) documentation?

+8 Q:

I have an open source code base of about 15 MB, and I want to understand the main algorithm it uses. I started by analyzing every part of the code, but I think that will take a lot of time. Are there any approaches that make the process easier? I haven't done this before, so it is my first experience.

It is this one, in case someone knows it: https://launchpad.net/cuneiform-linux

+1 A:

Profiling the code will show you which routines are important. Look at both the top and bottom 5% by number of calls.
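As a rough sketch of the "top and bottom 5%" idea: assuming you have already pulled per-routine call counts out of your profiler's report into a dictionary, something like the following picks out both tails. The function name `call_count_tails` and the dictionary format are my own assumptions, not part of any profiler.

```python
import math


def call_count_tails(call_counts, fraction=0.05):
    """Return (most_called, least_called) routine names.

    call_counts: dict mapping routine name -> number of calls
                 (assumed to be extracted from a profiler report beforehand).
    fraction:    share of routines to keep at each end (0.05 = 5%).
    """
    if not call_counts:
        return [], []
    # Rank routine names from most called to least called.
    ranked = sorted(call_counts, key=call_counts.get, reverse=True)
    # Always keep at least one routine per tail, even for tiny inputs.
    k = max(1, math.ceil(len(ranked) * fraction))
    return ranked[:k], ranked[-k:]
```

Call it as `top, bottom = call_count_tails(counts)` and read through both lists by hand; the numbers only tell you where to start looking.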

Ignacio Vazquez-Abrams 2010-01-26 13:22:40

Yeah, like seeing that main() is called once and the std::string constructor one million times is going to help anyone ;)

Milan Babuškov 2010-01-26 13:30:36

It's the ones called `hash_password()` and `draw_form()` that will matter.

Ignacio Vazquez-Abrams 2010-01-26 13:40:07

I'd suggest ranking them by execution time. Discard any obvious time sinks (constructors, network I/O), and the rest should hopefully be the most important routines. Unfortunately, this doesn't work as well with well-designed OOP code as it did back in the structured design days...
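A minimal sketch of that filtering step, assuming you have (routine name, self time in seconds) pairs from a profiler report. The `NOISE` pattern is only my guess at what counts as an "obvious time sink" (anything in `std::`, constructors and destructors, and a few POSIX I/O calls), so adjust it for the code at hand.

```python
import re

# Assumed noise filter: std:: helpers, constructors/destructors
# (Foo::Foo, Foo::~Foo), and a few POSIX I/O calls.
NOISE = re.compile(r"^std::|(\w+)::~?\1\b|^(read|write|recv|send)$")


def rank_by_self_time(samples, noise=NOISE, limit=10):
    """Return up to `limit` (name, seconds) pairs, slowest first.

    samples: iterable of (routine_name, self_seconds) pairs, assumed to
             be extracted from a profiler report beforehand.
    """
    # Drop routines whose names match the noise pattern.
    kept = [(name, secs) for name, secs in samples if not noise.search(name)]
    # Sort the remaining routines by self time, largest first.
    kept.sort(key=lambda item: item[1], reverse=True)
    return kept[:limit]
```

What survives the filter is a reading list, not a verdict: in heavily object-oriented code the time is often spread thinly across many small methods.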

TMN 2010-01-26 13:52:19

+2 A:

As you go, add to the documentation. With any luck there are more people doing the same, and between you, you will bring the level of documentation up to what is required. That's what open source is all about.

David M 2010-01-26 13:23:25

That's great! Good idea, thank you!

maximus 2010-01-26 15:26:58

I joined the project and offered to work with those guys on some documentation. I hope they respond!

maximus 2010-01-26 16:26:36

Good luck, enjoy...

David M 2010-01-26 16:31:30

+7 A:

Use Doxygen. It creates an easily browseable cross-reference of the code base in HTML. And it can also create dependency/class diagrams (if the code is OOP).

The code does not need specially formatted comments. They do help, but Doxygen is smart enough to parse the code and figure things out on its own. What I like most is the ability to click on any function name, variable, class, etc. and instantly jump to the place where it is declared or defined, and to see a list of all the places where it is used. I have used Doxygen in the past to chew through some rather large code bases (the PHP source code, for example), and it saved me a lot of time.
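For undocumented code, a Doxyfile fragment along these lines is a reasonable starting point. This is a sketch, not a complete configuration: it assumes you generated a template with `doxygen -g`, that you run Doxygen from the root of the source tree, and that Graphviz is installed for the graph options.

```
# Doxyfile fragment: index everything, even code without doc comments
INPUT                  = .
RECURSIVE              = YES
EXTRACT_ALL            = YES
EXTRACT_PRIVATE        = YES
EXTRACT_STATIC         = YES

# Cross-referenced, browsable source listings
SOURCE_BROWSER         = YES
REFERENCED_BY_RELATION = YES
REFERENCES_RELATION    = YES

# HTML only; call and caller graphs need Graphviz (dot)
GENERATE_LATEX         = NO
HAVE_DOT               = YES
CALL_GRAPH             = YES
CALLER_GRAPH           = YES
```

Then run `doxygen Doxyfile` and open `html/index.html`. On a code base this size the graph options can make the run slow, so it may be worth trying once with `HAVE_DOT = NO` first.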

You can also set up Eclipse CDT, import all the source files into a project, and get a similar code browser, although some things, like the function/class index, are not available in that case.

Milan Babuškov 2010-01-26 13:23:47

Add a link to the open source project in your question :-)

Maybe others know it or know alternatives.

Karussell 2010-01-26 13:26:31

OK! I added it.

maximus 2010-01-26 14:22:06

The first thing I would do is figure out what the main entry points are. Most programs follow a fairly standard format. First, input checking (making sure you got the right number and type of inputs). Second, pre-processing/preparation (opening files, allocating buffers, initializing data structures). Third, the main processing routine, where they do whatever it is they do. After that, it's generally output and cleanup. Of course, these stages may be intermixed (input checking may involve opening the input file), possibly horribly, like a routine fileAccessible(char *fileName) that opens the file, strips the header, instantiates the parser, and initializes the lexer by reading the first symbol and putting it into the scanner table. Thankfully, most open-source projects aren't that messed up, but you have to be ready for anything.
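To find candidate entry points in a large C/C++ tree, a quick scan for files that appear to define `main()` is often enough to get started. Here is a rough sketch. The helper name `find_entry_points`, the extension list, and the regular expression are my own assumptions, and the pattern is only a heuristic (it will miss things like `WinMain` or a `main` whose return type sits on the previous line).

```python
import os
import re

# Assumed set of source file extensions to scan.
SOURCE_EXTENSIONS = (".c", ".cc", ".cpp", ".cxx")

# Heuristic: a line that starts with "int main(" or "void main(".
MAIN_DEFINITION = re.compile(r"^\s*(?:int|void)\s+main\s*\(", re.MULTILINE)


def find_entry_points(root):
    """Return sorted paths under `root` whose text appears to define main()."""
    hits = []
    for folder, _dirs, files in os.walk(root):
        for name in files:
            if not name.endswith(SOURCE_EXTENSIONS):
                continue
            path = os.path.join(folder, name)
            # Old code bases often mix encodings, so replace undecodable bytes.
            with open(path, encoding="utf-8", errors="replace") as handle:
                if MAIN_DEFINITION.search(handle.read()):
                    hits.append(path)
    return sorted(hits)
```

From each hit you can then follow the calls outward and mark where input checking, preparation, main processing, and cleanup happen.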

TMN 2010-01-26 13:48:06

+2 A:

Since it's C++ code, you may find Source Navigator useful.

slebetman 2010-01-26 14:01:10

Thanks, I am trying it now.

maximus 2010-01-26 14:22:33

+1 SourceNav will help you navigate the code easily and will make the relationships between different parts of the source code much more apparent!

Remo.D 2010-01-26 14:55:07
