Possible Duplicate:
What’s the best way to become familiar with a large codebase?

Dear fellow C and C++ programmers, today I got my hands on a new codebase of around 800,000 lines (C and C++ mixed). How do I get familiar with such a huge codebase? What code browsing tools do you suggest?

So far I have found only MXR (Mozilla Cross Reference Tool) and DXR (an improved MXR), but I am still figuring out how to apply them to my source code.

What other tools can you suggest?

Thanks, Boda Cydo.

+14  A: 

Doxygen is a great tool for becoming familiar with existing code.

Sam Miller
+1: seconded here
eruciform
Make sure you turn on Graphviz with Doxygen, since it generates some decent graphs.
Dat Chu
+1, I'm always using Doxygen for this. For a quick start, take the default Doxyfile and enter these settings: EXTRACT_ALL=YES, SOURCE_BROWSER=YES, INLINE_SOURCES=YES, REFERENCED_BY_RELATION=YES, REFERENCES_RELATION=YES, ALPHABETICAL_INDEX=YES. This will give you a nice, cross-referenced, browsable source code database where you can click on any function, type, etc. and jump to its definition or usage.
Luther Blissett
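For reference, here is roughly what those overrides look like inside a Doxyfile generated with `doxygen -g`. The Graphviz options follow Dat Chu's suggestion. The `INPUT` path and the choice to enable call graphs are my assumptions, not something stated above, so adjust them to your tree.

```
# Doxyfile overrides (all other options left at their defaults)
EXTRACT_ALL            = YES
SOURCE_BROWSER         = YES
INLINE_SOURCES         = YES
REFERENCED_BY_RELATION = YES
REFERENCES_RELATION    = YES
ALPHABETICAL_INDEX     = YES

# Assumption: Graphviz is installed and `dot` is on the PATH.
HAVE_DOT               = YES
CALL_GRAPH             = YES
CALLER_GRAPH           = YES

# Assumption: the sources live under ./src; change this to match your layout.
INPUT                  = src
RECURSIVE              = YES
```

On a codebase of this size the call graphs can make the run quite slow, so it may be worth trying a first pass with `HAVE_DOT = NO`.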
+4  A: 

A debugger and a text editor. If it is possible to compile and run the code, I find the debugger to be the best tool for getting familiar with new code.

I just look for some part of the code that looks interesting and I try to figure out how it works. I follow the execution of it and trace it in the debugger. It usually leads to other parts of the code and I jump around until I get a general feeling for the code.

When it comes to the general architecture, I find the build system and build scripts useful. They usually give good clues to the overall architecture, the different modules, etc.

Jonas Pegerfalk
+1 for suggesting single-stepping.
Ben Voigt
+4  A: 

Start with whatever non-code-related documentation they have. There may well be nothing, or very little, but knowing that right up front will be important later. If there's a huge amount of documentation, then keeping it up to date while you work may end up being part of your job as well. And if it isn't part of your job, you can still choose to update it or leave it in its current state.

Short of that, I second Sam's vote for Doxygen.

eruciform
Seconded. Solving smaller bugs is usually a nice way to get a feeling of what the codebase is like.
Vitor Py
@vitor: agreed, also a great way!
eruciform
+1  A: 

If you're after a tool, SourceNavigator could be useful.

Bruno
A: 

A language-sensitive search tool can make finding your way around easier. Our SD Search Engine provides a language-sensitive search tool as a GUI for large software code bases. The Source Code Search Engine (SCSE) makes it possible to easily search/browse a large code base. That in turn makes it easy to "follow your nose" as you are looking for some code about which you have only a vague idea of what it might contain (say, an interrupt routine). The SCSE understands the language elements (identifiers, numbers, strings, comments, keywords; it even understands the minor differences between languages, such as C and C++). You formulate a query in terms of language elements, the SCSE finds matches in your source code base, shows you a list of hits, and then lets you select a particular hit to see the associated source code.

It preindexes the source code base to allow you to find sequences of language elements, ignoring language whitespace, using the language structure to guide the query, e.g., you can hunt for any identifier involving the substring "int" to find that interrupt code. The indexing ensures it is much faster than grep at scale, the language specificity minimizes the number of false positives you have to wade through, and the query language doesn't require complicated regular expressions.

Hits on queries are displayed in a hit window. You can select any hit; a single click brings up the source file on the line matching the hit so you can browse that code. The search engine can be configured to take you into most editors directly from the code window if you want.

Ira Baxter
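To illustrate why a token-aware search produces fewer false positives than plain grep, here is a toy Python sketch. It is not how the SCSE works internally (it has no index and only a very rough idea of C/C++ lexing); the names `TOKEN_RE` and `find_identifiers` are made up for this example. It skips comments and string/character literals and reports only identifiers that contain a given substring.

```python
import re

# Rough C/C++ tokenizer: just enough to tell identifiers apart from
# comments, literals, and numbers. Assumption: no raw strings, no trigraphs.
TOKEN_RE = re.compile(
    r"""
    (?P<comment>//[^\n]*|/\*.*?\*/)        # line or block comment
  | (?P<string>"(?:\\.|[^"\\\n])*")         # string literal
  | (?P<char>'(?:\\.|[^'\\\n])*')           # character literal
  | (?P<number>\d[\w.]*)                    # numeric literal, incl. suffix
  | (?P<ident>[A-Za-z_][A-Za-z0-9_]*)       # identifier or keyword
    """,
    re.VERBOSE | re.DOTALL,
)


def find_identifiers(source, substring):
    """Return (line, identifier) pairs for identifiers containing substring."""
    hits = []
    for m in TOKEN_RE.finditer(source):
        # Comments and literals are matched (so they are consumed) but ignored.
        if m.lastgroup == "ident" and substring in m.group():
            line = source.count("\n", 0, m.start()) + 1
            hits.append((line, m.group()))
    return hits


SAMPLE = '''\
/* interrupt handling: see print_table */
static int irq_count;                 // counts interrupts
void handle_interrupt(int irq) {
    log("interrupt %d", irq);
    irq_count++;
}
'''

if __name__ == "__main__":
    for line, name in find_identifiers(SAMPLE, "interrupt"):
        print(f"{line}: {name}")
```

A plain `grep interrupt` on the sample would report four lines (the comments and the format string included); the sketch reports only the function name on line 3.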
+3  A: 

I find Microsoft Visual Studio with the Whole Tomato Visual Assist plugin is a good combination for jumping around definitions/declarations.

Glutinous
+4  A: 

cscope
ctags

anthony
+1 for cscope... although I've nearly pulled my hair out trying to get decent results with cscope and C++ code. But for C, it's absolutely the best.
Dan
+1  A: 

Check out the code and try to refactor it. You don't even have to check the refactored version back in. I find this useful for two reasons:

  1. Refactoring code by its very definition splits code up into smaller chunks making it easier for the reader to understand.
  2. To apply some of the more complex refactoring methods you really have to understand the code.

Don't be discouraged: remember that it is difficult for developers to keep 7 (give or take 2, depending on complexity) concepts in their mind at any one time. Just acknowledge this and refactor the code to improve understanding.

I suppose it would be foolish not to recommend the following books!!!:

http://www.amazon.com/Refactoring-Improving-Design-Existing-ebook/dp/B000OZ0N4Y

http://www.amazon.com/Working-Effectively-Legacy-Michael-Feathers/dp/0131177052

David Relihan
+1  A: 

I use SourceInsight (www.sourceinsight.com) when I need to understand large source bases. I've yet to find anything else that comes close to its reference/navigation capabilities.

Bukes
A: 

Some people use OpenGrok.

Paul Nathan
A: 

I wouldn't have said this a couple of years ago, but nowadays Eclipse-CDT has a pretty good C/C++ indexer which lets you jump to definitions and display call hierarchies. On top of that, CDT keeps a nice history of where you've jumped, so you can dig a little and then return to the point where you branched off while reading code.

Digikata