Situation: I have some source code, and I'm boning up on what happens. Who calls what, what goes where, etc, etc. (There is minimal documentation/comments, so I have to work it out myself).

Now, my usual solution is to fire up a splitscreen text editor and stare at the code until I understand it, or maybe print it and write all over it with colored pens.

This is a horribly slow and inefficient way to go about it, I think.

What do you do/what tools do you use in these situations?

+1  A: 

You tagged this as Language-Agnostic, but it really isn't. Depending on the language, how much introspection/reflection it provides, and what debugging tools are available for it, the methods can vary greatly.

EBGreen
noted, edited. Can you comment on the tools/methods for the languages you know?
Paul Nathan
A: 

In case of lacking documentation

Staring at the code is usually what I do as well. I do most of my development in Visual Studio, so I'd also set some breakpoints at points of interest and run the application to get a better feel for the flow.

It would be cool to have a program that could analyze your source code and generate a visual representation of the flow, but I haven't discovered one.

thmsn
doxygen helps track function/class references and calls.
Paul Nathan
+1  A: 

Right now, this is my main job. I have a lot of code to review coming from outsourced developers in India. Sometimes it's absolutely horrid and obfuscated, but lately it's been at an acceptable level of clarity.

Minimal documentation/comments is probably not good. I do actually use a lot of highlighters and colored pens sometimes. But following the flow and commenting as you go, then going back and fixing up the comments as you understand the code better, will take you a long way. If the existing comments and documentation are not good enough to allow a good understanding, then the process of documenting the code yourself will help your own understanding a lot.

stephenbayer
+2  A: 

Go back to the roots: draw class diagrams (you could use a pen-and-paper solution or something more automated, depending on your source code and the tools available to you).

When you have a good understanding of the class model, do flowcharts (UML activity diagrams) for the most complex and puzzling methods. As was mentioned in the comment on this reply, this does not work for everyone: some people find it easier to read code, while others (like me) are better with visual representations.

When you have enough understanding of the principal workflows you might spend some time commenting the code itself making it easier to read.

For some languages you can auto-generate code documentation. Even if there are no comments, it is usually possible to at least understand the size and complexity of the files, classes and methods.

Ilya Kochetov
How useful class diagrams are depends on the code; how useful flowcharts are depends on your brain, as frankly I find they're almost always harder to understand than code.
Mark Baker
A: 

First, you need a call graph/object model/other relationship diagram that displays the relationship between large/major chunks of the code in question. Depending on the language, it could be obtained either statically (without running the code) or from profiling the running code. Use something like Graphviz to produce a readable layout of the call graph.
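As a rough illustration of the static route, here is a minimal Python sketch that extracts a call graph from Python source with the standard `ast` module and emits Graphviz DOT text. It only sees calls made through plain names (not method calls such as `obj.run()`), and `build_call_graph`, `to_dot` and `SAMPLE` are hypothetical names made up for this example, not part of any tool mentioned here.

```python
import ast


def build_call_graph(source):
    """Map each function defined in `source` to the plain names it calls."""
    tree = ast.parse(source)
    graph = {}
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            callees = graph.setdefault(node.name, set())
            for inner in ast.walk(node):
                # Only direct calls like foo(...); attribute calls are skipped.
                if isinstance(inner, ast.Call) and isinstance(inner.func, ast.Name):
                    callees.add(inner.func.id)
    return graph


def to_dot(graph):
    """Render the graph as DOT text that Graphviz can lay out."""
    lines = ["digraph calls {"]
    for caller in sorted(graph):
        lines.append(f'    "{caller}";')
        for callee in sorted(graph[caller]):
            lines.append(f'    "{caller}" -> "{callee}";')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical code under study.
SAMPLE = '''
def load(path):
    return parse(read(path))

def parse(text):
    return text.split()

def read(path):
    return open(path).read()
'''

if __name__ == "__main__":
    print(to_dot(build_call_graph(SAMPLE)))
```

Saving the output to a file such as `calls.dot` and running `dot -Tsvg calls.dot -o calls.svg` should produce a picture of who calls what.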

If you are profiling (that is, your language has appropriate tools), you could look for code coverage analysis as well - to find and eliminate (potential) dead code regions.

Then, you might be better off littering your code with LOTS of debug output statements (sometimes you can just search-and-replace with a regex across the whole codebase to produce debug output every time a particular method is called or a particular variable is assigned). Write comments while reviewing.
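In a language with runtime hooks you may be able to get the same "who got called" output without editing the code at all. The sketch below is a Python-specific alternative, not the regex approach itself: it uses the standard `sys.setprofile` hook to record every Python-level function call with its nesting depth. `trace_calls`, `format_trace` and the three sample functions are hypothetical names for illustration.

```python
import sys


def trace_calls(func, *args, **kwargs):
    """Run func and return (result, [(depth, function_name), ...])."""
    calls = []
    depth = 0

    def profiler(frame, event, arg):
        nonlocal depth
        if event == "call":        # a Python function is being entered
            calls.append((depth, frame.f_code.co_name))
            depth += 1
        elif event == "return":    # a Python function is being left
            depth -= 1

    sys.setprofile(profiler)
    try:
        result = func(*args, **kwargs)
    finally:
        sys.setprofile(None)       # always remove the hook again
    return result, calls


def format_trace(calls):
    """Indent each call by its depth to show the flow."""
    return "\n".join("  " * depth + name for depth, name in calls)


# Hypothetical code under study.
def order_total(prices):
    return add_shipping(subtotal(prices))


def subtotal(prices):
    total = 0
    for price in prices:
        total += price
    return total


def add_shipping(amount):
    return amount + 5


if __name__ == "__main__":
    result, calls = trace_calls(order_total, [1, 2, 3])
    print(format_trace(calls))
```

Calls into C-implemented builtins arrive as `c_call` events and are deliberately ignored here, so only functions from the code you are studying show up.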

Only then is it time for the split-screen editor and colored markers.

ADEpt
+1  A: 

Generally I find it easier to digest new code if I understand the problem domain; having an idea of what it's trying to solve helps me immensely.

If it has tests then you may be lucky; otherwise, stepping through the code may ultimately be your only choice!

P.S. Many IDEs can generate class diagrams for you, saving you the effort.

P.P.S. As for code coverage etc... I've found Sonar to be quite good.

Jonathan
Also, if there are no tests, writing your own may help your understanding.
Jonathan
+7  A: 

You might check out this related post: CScope for other languages

Or perhaps try

Hortitude
A: 

Understand by SciTools has been made for exactly this purpose. I haven't used it myself but I've heard some good things about it.

David Holm
A: 

Do you have a source control repository, and in particular, do you have one that is linked to a CRM/Support system? If so, it is much easier to get a handle on the system - call up the change tree, tie the code change to its functional implications, and it quickly becomes apparent what, if any, organizational principles dominate a particular section of code.

Another useful tool in the case of object-oriented code is a UML suite with a code->diagram generator and some level of zoom. The object relationships can help you get a feeling for the "underfactored" and generally troublesome code - a dense star of connections, for example, tends to indicate a point worthy of further inspection.

Mike Burton
Unfortunately, I was handed what amounted to a zip file with the source.
Paul Nathan
+1  A: 

Generally, I walk through the code in a debugger to get a handle on it, assuming it isn't huge, like a 1001-page web site where going through each page to see what it does and how it does it would be annoying.

Another point is to know whether there is a bug I'm fixing or a feature I'm adding, so that I can concentrate on one part of the code rather than trying to get it all in my head at once, which can be rather taxing.

Lastly, don't forget to look at the big picture of what is going on in each tier, assuming you can divide things up that way (a UI tier, a middleware tier, a database tier, etc.), to help simplify things. Also, if there is source control, I'd look at how it is used and whether there could be a better way to use it.

JB King
A: 

Enterprise Architect from Sparx Systems can be hooked to .NET's debugger and generate profiling data _AND_ALSO_ UML Sequence Diagrams. Very useful.

vmarquez
+1  A: 

You may want to consider looking at source code reverse engineering tools. There are two tools that I know of:

Both tools offer similar feature sets that include static analysis that produces graphs of the relations between modules in the software.

This mostly consists of call graphs and type/class dependencies. Viewing this information should give you a good picture of how the parts of the code relate to one another. Using this information, you can dig into the actual source for the parts that you are most interested in and that you need to understand/modify first.

Doug
A: 

In addition to OpenGrok, which allows you to navigate throughout your codebase very easily, you can try Ack.

It's a Perl script that can be used as grep (it has a lot of grep options) but also has a lot of specialized options dedicated to source code search.

They do not play in the same league, however: OpenGrok scales better because it scans the codebase only once and builds an index (which it can then update from the latest changes). Ack, on the other hand, like grep, needs to read and parse the source files every time you invoke it.
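To make the trade-off concrete, here is a toy Python sketch of the two strategies: rescanning every file on each query versus building a token index once and answering queries from it. It is not how OpenGrok or Ack are actually implemented; `scan`, `build_index`, `lookup` and `FILES` are hypothetical names for this example.

```python
import re
from collections import defaultdict

TOKEN_RE = re.compile(r"[A-Za-z_]\w*")


def scan(files, word):
    """grep/Ack style: reread every line of every file on each query."""
    hits = []
    for path in sorted(files):
        for lineno, line in enumerate(files[path].splitlines(), start=1):
            if word in TOKEN_RE.findall(line):
                hits.append((path, lineno))
    return hits


def build_index(files):
    """Index style: one pass over the code, mapping token -> locations."""
    index = defaultdict(set)
    for path, text in files.items():
        for lineno, line in enumerate(text.splitlines(), start=1):
            for token in TOKEN_RE.findall(line):
                index[token].add((path, lineno))
    return index


def lookup(index, word):
    """Answer a query from the index without touching the files again."""
    return sorted(index.get(word, ()))


# Hypothetical two-file codebase, held in memory for the example.
FILES = {
    "a.c": "int tax_rate;\nint total;\n",
    "b.c": "total = tax_rate * 2;\n",
}

if __name__ == "__main__":
    index = build_index(FILES)
    print(lookup(index, "tax_rate"))
    print(scan(FILES, "tax_rate"))
```

Both functions return the same hits; the difference is that `scan` pays the full reading cost on every query, while `build_index` pays it once up front and must be refreshed when files change.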

philippe
A: 

A long time ago, in a land far, far away, we had an opportunity to bid to complete a software project which involved >500K lines of code. Many hours of perusing the code, code walkthroughs, and attempts to map the software failed to give us insight on how the thing worked.

So, we needed a tool to map the software.

First, compile to assembler.
Parse the assembled code, extracting labels and references.
Put the labels and references into a database.
Then create many views to extract the hierarchy.
Finally, write a rudimentary graphics program to visualize the mapping.

That database grew into a valuable tool to gain insight into the structure of the program, which allowed us to create a very detailed quote.
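The original tool is not shown, so the following is only a minimal Python sketch of the same idea under simplifying assumptions: a made-up assembly syntax where a label is `name:` at the start of a line and a reference is an indented `call` or `jmp`, stored in an in-memory SQLite database. The names `build_xref_db`, `callers_of`, the `roots` view and `SAMPLE_ASM` are all hypothetical.

```python
import re
import sqlite3

# Assumed toy syntax: "label:" in column 0, "    call target" or "    jmp target".
LABEL_RE = re.compile(r"^([A-Za-z_.][\w.]*):")
REF_RE = re.compile(r"^\s+(?:call|jmp)\s+([A-Za-z_.][\w.]*)")


def build_xref_db(asm_text):
    """Load labels and the references made under each label into SQLite."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE labels (name TEXT PRIMARY KEY, line INTEGER)")
    db.execute("CREATE TABLE refs (source TEXT, target TEXT, line INTEGER)")
    # One example "view to extract the hierarchy": labels nobody refers to.
    db.execute(
        "CREATE VIEW roots AS SELECT name FROM labels "
        "WHERE name NOT IN (SELECT target FROM refs)"
    )
    current = None
    for lineno, line in enumerate(asm_text.splitlines(), start=1):
        label = LABEL_RE.match(line)
        if label:
            current = label.group(1)
            db.execute("INSERT INTO labels VALUES (?, ?)", (current, lineno))
            continue
        ref = REF_RE.match(line)
        if ref and current is not None:
            db.execute(
                "INSERT INTO refs VALUES (?, ?, ?)", (current, ref.group(1), lineno)
            )
    return db


def callers_of(db, target):
    """Return the labels whose code refers to `target`."""
    rows = db.execute(
        "SELECT DISTINCT source FROM refs WHERE target = ? ORDER BY source",
        (target,),
    )
    return [row[0] for row in rows]


SAMPLE_ASM = """\
main:
    call init
    call run
    ret
init:
    call log
    ret
run:
    call log
    ret
log:
    ret
"""

if __name__ == "__main__":
    db = build_xref_db(SAMPLE_ASM)
    print(callers_of(db, "log"))
    print([row[0] for row in db.execute("SELECT name FROM roots")])
```

Real compiler output has directives, local labels and indirect calls that this sketch ignores, so the regular expressions would need adapting to the actual assembler dialect.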

Unfortunately, they didn't go for our quote. It was too high for their budget.

I heard from them again after two more teams had a go at it. AFAIK, it still isn't complete.

dar7yl
A: 

If there aren't unit tests to show what should happen, I start adding them. I do this for a couple of reasons:

  • Tests help me understand the code in small, provable chunks.

  • I am reading this code because someone expects me to modify or fix it in the future. Without the tests I don't have a safety net to know that I am working safely.

This has driven several managers of mine nuts, since the initial productivity is low, but in the end I explain that without the tests I cannot guarantee that my changes do not break something. I usually get left alone after that.
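A minimal sketch of what such tests can look like, using Python's standard `unittest`: each test pins down one behavior you observed while reading the code, so a later change that alters it fails loudly. `legacy_discount` is a hypothetical stand-in for an undocumented function, and the expected values are simply what this stand-in returns today, not a specification.

```python
import unittest


def legacy_discount(total, customer_type):
    """Hypothetical stand-in for an undocumented legacy function."""
    if customer_type == "gold":
        return total * 90 // 100
    if total > 100:
        return total - 5
    return total


class LegacyDiscountCharacterization(unittest.TestCase):
    """Record the current behavior before touching the code."""

    def test_gold_customer_gets_ten_percent_off(self):
        self.assertEqual(legacy_discount(200, "gold"), 180)

    def test_large_regular_order_gets_flat_five_off(self):
        self.assertEqual(legacy_discount(150, "regular"), 145)

    def test_order_of_exactly_100_is_unchanged(self):
        # Boundary discovered while reading: the check is "> 100", not ">= 100".
        self.assertEqual(legacy_discount(100, "regular"), 100)


if __name__ == "__main__":
    unittest.main()
```

Writing the boundary case is often where the understanding happens: you have to decide what the code does at exactly 100 before you can assert it.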

MikeJ
A: 

If I understand your question correctly, the tool you need most is a good cross-referencing editor.

Where I work we use Visual Slick Edit, which in my opinion is a fantastic editor for C/C++ (and probably for other languages as well). You create a workspace, add the entire source code tree to it, wait a few minutes for it to create a tag file, and whoops! For every function call, you can easily jump to the definition or see the list of all other callers of that function. These two features are great for finding your way around a new code base. It has many more features that are useful for source-code browsing, but these two basic abilities seem to me like the core things you need.

I am told that Eclipse can also handle large C/C++ projects, and offers roughly the same functionality with respect to jumping to definitions/references, but I have not used it personally.

Oren Shemesh
+4  A: 

A free solution: run Doxygen on it with full options (it generates highlighted, linked source code and relationship diagrams with Graphviz). Doxygen will do a pretty good job of extracting structure from code without any comments. Sit back and browse the documentation.

ididak
A: 

The best way to deconstruct a code base is to use the tool NDepend. See all features of NDepend here http://www.ndepend.com/Features.aspx:
- Code Query Language (CQL)
- Compare Builds
- 82 code metrics
- Manage Complexity and Dependencies
- Detect Dependency Cycles
- Harness Test Coverage Data
- Enforce Immutability and Purity
- Warnings about the health of your Build Process
- Generate custom report from your Build Process
- Diagrams
- Facilities to cope with real-world environment

Patrick Smacchia - NDepend dev
This comment (he's its author) and the webpage aren't exactly clear, but it looks like it's just for .NET assemblies.
Ken
+1  A: 

For C/C++: Source Navigator NG (Open Source)

none
+1  A: 

Use search tools to find your way around the code. "grep" works but is slow for big source bases and doesn't help you see the code.

Our SD Source Code Search Engine is a language-sensitive source code search tool. It can handle many languages at the same time. Searches can be performed for patterns in a specific language, or patterns across languages (such as "find identifiers involving TAX"). By being sensitive to language tokens, the number of false positives is reduced, saving time for the user. It understands C, C++, C#, COBOL, Java, ECMAScript, XML, Verilog, VHDL, and a number of other languages.

Ira Baxter
+1 very interesting. Do you know how much it costs?
Vitor Py
Inquire at the web site.
Ira Baxter