How do you find your way around a new codebase

views:

998

answers:

+11 Q:

You don't always get to talk to the original author or authors of the code you maintain. Sometimes when I work on an existing project I feel like a special forces operative behind enemy lines trying to figure out the lay of the land without the use of a map. When you are in this situation how do you get the code under control? How do you:

find out how things work
understand the mindset of the previous authors

So many times I hear other developers say "well, this just needs to be rewritten." That can't always be the answer. What is yours?

I personally try changing the code to see what happens, running it locally on my desktop of course. Ideally, the code will contain unit tests; they are a great way to learn what the code really does.

Haacked 2008-09-16 07:43:01

+5 A:

Try to buy as much time as you can for analyzing the existing code and other application files. Never assume this time is negligible when estimating the effort. On maintenance projects there is almost always next to no documentation of the code.

The analysis usually takes more time than the development itself, so include it in the estimates.

Try to find out whether any sort of framework (proprietary or not) was used to build the software. Analyze everything from a top-level view and zoom into specific code as needed.

Niyaz 2008-09-16 07:44:27

+17 A:

Learn to blackbox things.

Figure out what functions do, but not how they do it. This makes it take far less time to understand the program's structure as a whole. Then, once you understand how a particular segment of the program works, drill down into the individual functions and try to understand not just what they do, but how they do it.

On the open source project I've worked on for over a year, I started knowing nothing and learned its structure in this manner. I still have a few functions that I find myself "blackboxing"--I pretend they work magically and ignore the actual process, since I don't understand fully how they work, and instead just assume that they do what they're supposed to. This makes overall analysis of the application much faster and easier.

Dark Shikari 2008-09-16 07:47:35

Blackboxing is the way to go, especially if you can write unit tests that don't currently exist for those parts in the process.

warren_s 2008-09-16 07:51:18

The catch-22 of a black-boxing approach is that you need a frame of reference to start with, otherwise the whole system is a black box.

Ed Guiness 2010-07-29 08:04:35

+2 A:

ReSharper is invaluable for jumping around the code: Ctrl+B, Alt+F7, etc.

2008-09-16 07:47:45

I love ReSharper too, but both "Go to Definition" (Ctrl+B) and "Find Usages" (Alt+F7) come with VS.NET, so those without ReSharper can use them too. ReSharper does it better by giving you more control over the results ...

flipdoubt 2008-10-29 13:07:05

If at all possible I would try to profile the system in some way to get a handle on the dependencies at multiple levels.

I like to work visually, so a .NET tool from Redgate or NDepend would be ideal. Although not perfect, in the absence of "good" documentation this is the best way to get inside the minds of the previous developers.

The major goal is to understand how difficult or easy it will be to make future changes without accidentally impacting other areas of the code.

Ash 2008-09-16 07:47:50

+4 A:

Generally, if I have an idea of what functionality I'm supposed to be fixing/extending, I will follow through the use case for that feature, set a few breakpoints here and there and just try to get a feel for what's going on in the back end.

Reverse engineering class diagrams and DB schemas is handy, especially if it's automated, but at the end of the day, you just have to read, read and read some more code.

Heaven forbid you should actually find comments that are both relevant AND up to date with the source. If there's a test suite, that's great, but if not, I try to write some tests for code before I start changing it, and always use a light touch wherever I can.
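As a rough illustration of "write some tests before changing it", here is a minimal sketch in Python. The function `legacy_shipping_cost` and its numbers are hypothetical stand-ins for whatever inherited code you are about to touch; the expected values are simply what the existing code returned when run once, not what it "should" return.

```python
import unittest


def legacy_shipping_cost(weight_kg, express):
    # Hypothetical stand-in for an inherited function we do not yet understand.
    cost = 4.0 + 1.5 * weight_kg
    if express:
        cost *= 2
    if weight_kg > 20:
        cost += 10
    return round(cost, 2)


class ShippingCostCharacterization(unittest.TestCase):
    """Pins down what the code does today, not what it ought to do."""

    def test_observed_behaviour(self):
        # Expected values were recorded by running the existing code once.
        observed = {
            (1, False): 5.5,
            (1, True): 11.0,
            (25, False): 51.5,
        }
        for (weight, express), expected in observed.items():
            with self.subTest(weight=weight, express=express):
                self.assertEqual(legacy_shipping_cost(weight, express), expected)
```

If one of these fails after a "harmless" edit, you have learned something about the code before your users did.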

Depending on the size of a project, it can take days or even weeks to get a real handle on part of the codebase. My view is that in any decent-sized project you can never really know any part of the code the way the original author did, so the best you can hope for is little more than a passing familiarity with it.

warren_s 2008-09-16 07:48:28

+2 A:

Run an API documentation tool like Javadoc or Doxygen on it, and with that in hand start by choosing a feature of the application, and trace the path of execution. This generally exposes the mechanisms the author put in place to accomplish various tasks, e.g. the commonly used utility classes/methods. This also exposes the level of coupling present between components/classes, various patterns used, etc.

You don't truly know how it works till you've traced its execution a few times.
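If stepping through by hand gets tedious, the trace can be recorded automatically. The following is a minimal sketch, assuming a Python codebase; `record_calls` is a made-up helper built on the standard `sys.settrace` hook, and `checkout`, `load_order` and `apply_discount` are hypothetical functions standing in for the feature being traced.

```python
import sys


def record_calls(func, *args, **kwargs):
    """Run func and return (result, [(depth, function_name), ...])."""
    calls = []
    depth = 0

    def tracer(frame, event, arg):
        nonlocal depth
        if event == "call":
            calls.append((depth, frame.f_code.co_name))
            depth += 1
        elif event == "return":
            depth -= 1
        return tracer  # keep receiving events for this frame

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args, **kwargs)
    finally:
        sys.settrace(previous)  # restore any debugger or coverage hook
    return result, calls


# Hypothetical feature code to trace.
def load_order(order_id):
    return {"id": order_id, "total": 40}


def apply_discount(order):
    order["total"] -= 5
    return order


def checkout(order_id):
    return apply_discount(load_order(order_id))
```

Calling `record_calls(checkout, 7)` returns the result together with the nested call sequence, which you can print with indentation to see which utility functions a feature leans on. Only Python-level calls are recorded; calls into C extensions do not show up.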

Chris 2008-09-16 07:53:48

+7 A:

It might sound obvious, but start at the entry point of the program and step through the code to see how it sets itself up.

Take time to absorb code that isn't immediately obvious since that will give you a flavour of the previous developer's style.

Did they like to modularize?
Did they favour interfaces?
Did they write tests?

Take a look at the data that the application persists. If it has been modeled properly you will find important clues about the key entities of the system. You might be lucky and find some declared referential integrity (foreign key relationships) that gives you more clues about how the data is related ("a customer has zero or more orders").
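As a sketch of how declared relationships can be pulled out of a database, assuming for illustration that the data lives in SQLite (the original does not name a database), the standard `sqlite3` module and SQLite's `PRAGMA foreign_key_list` are enough. The `customers` and `orders` tables are hypothetical, echoing the example above.

```python
import sqlite3


def foreign_keys(conn):
    """Return (child_table, child_column, parent_table, parent_column) tuples."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    relations = []
    for table in tables:
        # Result columns: id, seq, table, from, to, on_update, on_delete, match
        for fk in conn.execute(f"PRAGMA foreign_key_list('{table}')"):
            relations.append((table, fk[3], fk[2], fk[4]))
    return relations


def demo_schema():
    # Hypothetical schema: a customer has zero or more orders.
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (
            id INTEGER PRIMARY KEY,
            customer_id INTEGER REFERENCES customers(id)
        );
        """
    )
    return conn
```

Other database engines expose the same information through their own catalog views, so the query would differ but the idea carries over.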

And don't forget that non-technical staff can often have good insight into the structure of the system.

All the while, tools like Source Insight (there is a demo available) will help you navigate the code base.

Ed Guiness 2008-09-16 07:55:09

Thanks for the mention of Source Insight; I never knew that such a program existed.

Ibn Saeed 2009-07-11 21:02:47

Any FOSS alternatives to Source Insight?

Myth17 2010-07-29 06:52:49

It really depends on the language, but for C/C++ I start by absorbing all I can from the #include files first.

Scott Evernden 2008-09-16 08:01:31

+1 A:

If you have well-documented code and several unit tests, I don't see much of a problem. You can get pretty far just by analysing the Javadoc and reading the code, especially if it's good code where every method and class has a specific purpose.

But at work I usually stumble over large chunks of code with different coding styles and mixed-up patterns. If variables are called a, b, c and the methods are called do() or calculate(), I start debugging. It is very helpful to put a breakpoint somewhere deep down in a method and analyze the different layers of callers and what state the variables are in.

In general: if possible I find runtime analysis much faster and more helpful than just looking at the source.
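To make the "breakpoint deep down, then look at the callers" idea concrete, here is a small sketch, assuming Python. `describe_callers` is a made-up helper built on the standard `inspect.stack()`; `calculate` and `do` are hypothetical, named after the unhelpful method names mentioned above.

```python
import inspect


def describe_callers(limit=5):
    """Return 'function(local=value, ...)' for the frames leading to the caller."""
    lines = []
    # stack()[0] is this helper itself, so start at the function that called it.
    for info in inspect.stack()[1:1 + limit]:
        local_vars = ", ".join(
            f"{name}={value!r}"
            for name, value in sorted(info.frame.f_locals.items())
        )
        lines.append(f"{info.function}({local_vars})")
    return lines


# Hypothetical legacy code with unhelpful names.
def calculate(a):
    return describe_callers(limit=2)


def do(b):
    c = b * 2
    return calculate(c + 1)
```

Here `do(3)` returns `['calculate(a=7)', 'do(b=3, c=6)']`: the innermost frame first, then its caller, each with its local state. In an interactive session the built-in `breakpoint()` followed by the pdb commands `where` and `up` gives the same view.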

Martin 2008-09-16 08:02:18

+2 A:

Two tools I use frequently are ack and ctags (or similar as appropriate for the language in question). Also, lxr or similar is helpful to view a hyperlinked version of the source. These tools help in navigating the code base.

As for understanding the intent of the original authors, there are several techniques I use:

"When in doubt, print more out." Debug output is often invaluable.
Tracing code in a debugger (if possible) is helpful but there is the risk of getting too deep in the details.
Tests (if any) often give some insight.
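One way to "print more out" without scattering print statements everywhere is a logging decorator. This is only a sketch, assuming Python; the logger name `codebase_tour`, the `traced` decorator and `normalise_sku` are all invented for illustration.

```python
import functools
import logging

log = logging.getLogger("codebase_tour")  # hypothetical logger name


def traced(func):
    """Log the arguments and return value each time func runs."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        log.debug("-> %s args=%r kwargs=%r", func.__name__, args, kwargs)
        result = func(*args, **kwargs)
        log.debug("<- %s returned %r", func.__name__, result)
        return result

    return wrapper


@traced
def normalise_sku(raw):
    # Hypothetical function under investigation.
    return raw.strip().upper()
```

With `logging.basicConfig(level=logging.DEBUG)` in place, every call to a decorated function shows up in the output, and removing the decorator afterwards leaves the code untouched.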

Greg Hewgill 2008-09-16 08:03:36

+3 A:

You can analyze a bicycle till you are blue in the face, but you won't really know anything till you pedal it around the block. With this in mind...

Pick a bunch of features you want to learn about and write some quick throwaway functionality that requires touching the interfaces to all of them.

Rinse and repeat.

2008-09-16 08:17:40

+1 A:

Personally I prefer to insert my own guides (little text messages that say exactly where I am within the code) within control structures and looping blocks. For example, the text can say which file I am in, or carry any other information that makes sense to you. If it makes sense to you, then it eliminates the learning curve. That way, as the code gets executed, I can verify exactly which part of the code is being executed. This is especially good with web-based applications.

I found this method really helpful and quick some years back, when I had undocumented code that I was expected to figure out and maintain in one weekend.

Steve Obbayi 2008-09-16 08:30:31

+1 A:

Consult the project's VCS

A new codebase is usually a legacy codebase, and working with legacy code requires skills in Software Archaeology (and a high toxic resistance, but that's another story). Any Software Archaeologist knows that a project's revision control system is an invaluable source of data and amusement. Reading the log, blaming and diffing can tell you a lot about the project's early development stage: how it looked when it was still simple and manageable, which modules formed the core and have been part of the project from day one, and which appeared later.

Unless, of course, the team happily killed their SourceSafe repository after switching to CVS and did it again after switching to SVN :)
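As one possible way to mine the history, the sketch below counts how often each file appears in the log, which tends to point at the modules that have been there longest or change most. It assumes, purely for illustration, that the history has ended up in Git and that the input is the text printed by `git log --name-only --pretty=format:`; the `churn` helper and the sample paths are made up.

```python
from collections import Counter


def churn(name_only_log):
    """Count how often each path appears in name-only log output."""
    counts = Counter()
    for line in name_only_log.splitlines():
        path = line.strip()
        if path:  # blank lines separate commits
            counts[path] += 1
    return counts.most_common()


# Hypothetical output of: git log --name-only --pretty=format:
SAMPLE_LOG = """core/engine.c
core/engine.h

core/engine.c
ui/dialog.c

core/engine.c
"""
```

`churn(SAMPLE_LOG)` puts `core/engine.c` first with three commits. Other version control systems print their logs differently, so the parsing would need adjusting for them.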

Constantin 2008-09-23 18:55:50

+3 A:

Check out Joel's brilliant Reading Code is Like Reading the Talmud. It covers the general process and is a great read. I use this approach whenever I can get a second developer to join me in it (you need two people to do it).

Tools can help, but don't focus too much on them. I'd say they matter more at a later stage, after you already understand what the code does, when it comes to refactoring the thing.

Hanno Fietz 2008-09-24 15:23:30

I need to see the overall structure before I start digging into the mechanics of the code. So printouts and whiteboards are useful to me. Your code might create a new instance of 'WidgetWibble' in the 'Foobar()' function... but what's that mean?

Then I start with the UI and work out how it works.

I'm quite good at looking at code and asking difficult questions that end in "why?", which provoke responses of "err... well it was 3am and the code was due for release the next morning".

Piku 2008-10-06 11:39:42

+1 A:

none 2009-03-13 20:09:06

+1 A:

Michael Feathers had a good tip on this that I have since adopted from time to time. Make sure that you have the code base under source control, then go crazy with "extract method" refactorings on the parts of the code that you find tricky to understand directly. Don't be afraid to break anything: when you're done, revert all your refactorings and start over using real production techniques.

PHeiberg 2009-03-13 20:58:38
