Our company has a large codebase (2500+ classes/interfaces in just the core alone, many more in other projects) for our flagship software product. We've never really hired more than one developer at a time, so we don't have a real training process. We're going to be bringing in 2-5 more developers now, and probably more in the near future (to put things into perspective, we have 7 right now.) Obviously, we would like to get these guys up to speed as soon as possible.

The catch - almost all of our classes (95%+) are completely undocumented. No javadoc, no design docs, basically completely undocumented.

What strategies can we employ to bring the new developers up to speed? I'd like to consider situations that include the existing code getting documented, but it's possible management won't allow for the time to get that done, so I also must consider situations where that won't happen.

Has anyone been there before? What worked well for you?

Thanks!

+3  A: 

I think code visualization tools are very handy in these circumstances.

Speaking for .NET, I can say tools like NDepend, Reflector, etc. are very powerful in understanding layers, dependencies, complexity, etc.

Other languages have varying quality tools for the same results.

John Weldon
+9  A: 

You need a high level "collaboration diagram". Show what components talk to what components, where the user sits, where the datastore sits, etc.

For APIs, provide examples! Lots of them. You could point to production code in order to show them, or make small demos.

In short, you need some documentation.

Yann Ramin
+14  A: 

Start documenting as you go. Mandate new class and sequence diagrams as a deliverable from everybody who touches code going forward.

Romain Hippeau
I would include unit tests in the required deliverables going forward.
Pedro
Simply documenting everything won't help you in the long run. As soon as you change something in the code, who will think about changing the corresponding external documentation? I consider having no documentation and having wrong documentation to be equally bad. I know this is a little exaggerated, but really, you need to look at documentation in a more differentiated way.
Johannes Rudolph
There's a new program manager, and hopefully he'll be making the other developers do this. We've already had some design reviews that we didn't have before.
glowcoder
Totally agree. What's with documentation obsession? On small teams the code itself and unit tests provide better documentation than any separate document ever will + saves tons of time (read 'money'). Most of the time properly written code is self documented. Documenting every bit and piece of code (or commenting) is a bad practice. Use SVN Blame (or whatever you use) to understand any given line.
HeavyWave
@HeavyWave: Code documents exactly what the code does. What documents what the code is supposed to do? A business logic bug (e.g. the code is 'correct' and does something, but it is the wrong thing when measured against the non-existent documentation) is very difficult to catch.
semiuseless
+1  A: 

I would tell the new employees to document the code and to write tests for it; it doesn't have to be exhaustive. That way they know exactly what the code does, as writing good documentation requires taking a very close look at the code.

EDIT: perhaps a misunderstanding: I am not asking them to document the whole code; that is impossible, of course. But after documenting some classes and some packages (let's say a week to start with), they should have a grasp of what a particular section of the code does. When you have a team where different employees specialize in different sections of the code, I think that would really be the best way to do it.

Patrick
If you were to do this I'd bet that most of them would quit after the first day. There is nothing more frustrating than having to document horrible code that isn't yours and you can't figure out.
Earlz
That was one thing I considered. What I'm worried about is that, due to the complex nature of the application, it might get misunderstood by the new guys right off the bat and end up being mis-documented.
glowcoder
OMG - break in new devs by tasking them with doccing code they have no hope of understanding? this is a joke, right?
Sky Sanders
Not only is it difficult - it's impossible.
anon
@Earlz you don't have to overdo it.
Patrick
@Sky, I never said that the new devs should document everything, but let's say one or two classes and a package should a) get some knowledge of the software (coding style, rules etc) and b) should not be too daunting. -- Been there, done that and I think it is a good way to start
Patrick
I agree that this doesn't have to be complicated. If a method or class is that daunting, what does that say about the codebase. It should be possible to look at each element in turn and find what it does, thus to write a reasonable javadoc header. Doesn't have to be exhaustive.
drachenstern
+4  A: 

The best way is to give them very simple tasks and tell them that you expect them to ask lots of questions.
Documentation would help, but the amount of documentation needed for a codebase like that would still make it difficult to get into, so the best way to learn is by doing.

Just make it completely clear that you want a lot of questions so that they don't waste too much time being stuck.

ho1
+2  A: 

opening the can of honesty whoopass....

The strategy is to DOCUMENT THE CODE.

Your rationale for tolerating your current situation is naive and amateur, and no self-respecting dev of any talent would tolerate it, so you probably won't have to worry about training.

Sky Sanders
+12  A: 

A few thoughts:

  • Run javadoc, doxygen or something similar to produce some sort of documentation now.
  • Have the new programmers produce some unit tests for the existing code.
  • Assign maintenance of a subset of the code base to each new programmer (or even all the programmers), so that they can digest a smaller part of the codebase.
  • Related to the previous point, have someone re-write a class or some other subset of the codebase. On a previous job I totally re-wrote one of the major functions of our custom database; it took a month, and I didn't achieve as much code reduction or performance improvement as I hoped. However, I knew that code then, and the code - which had undergone countless revisions by countless programmers, each using their own style - was consistent.
GreenMatt
Javadoc comments without summaries and explanations are kind of useless. It will just document names of methods and arguments, the same things an IDE will show you anyway.
matt b
@matt b: I agree that javadoc doesn't produce the most helpful documentation if there are no comments in the code that will be put in the documentation. However, not everyone uses an IDE. Also, it's a starting place. And who knows? Maybe some long departed programmer started something that has been missed or forgotten. More generally, if you're using a language without a javadoc-like utility, doxygen (if it can process your language) will give you similar help.
GreenMatt
Why the downvote?
GreenMatt
+5  A: 

If you throw a bunch of new devs into this, the situation will quickly get worse. Think of all the redundant code they'll introduce because they'll never find what they need in the huge undocumented codebase.

Start documenting now. Investigate tools that will automatically generate at least stubs for you to fill in.
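To make the "generate stubs" idea concrete, here is a minimal sketch of what such a tool does, using only Java reflection. The class name `JavadocStubPrinter` and the `TODO` wording are made up for illustration; in practice most IDEs will generate a similar skeleton for you when you start a doc comment above a method, so treat this as a picture of the idea rather than a tool recommendation.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.lang.reflect.Parameter;

/** Prints empty Javadoc skeletons for the public methods a class declares. */
class JavadocStubPrinter {

    /** Builds a Javadoc skeleton for one method, with TODO placeholders to fill in. */
    static String stubFor(Method method) {
        StringBuilder stub = new StringBuilder("/**\n * TODO: describe ")
                .append(method.getName()).append(".\n");
        for (Parameter parameter : method.getParameters()) {
            // Real parameter names are only available if the class was compiled
            // with -parameters; otherwise reflection reports arg0, arg1, ...
            stub.append(" * @param ").append(parameter.getName()).append(" TODO\n");
        }
        if (method.getReturnType() != void.class) {
            stub.append(" * @return TODO\n");
        }
        for (Class<?> thrown : method.getExceptionTypes()) {
            stub.append(" * @throws ").append(thrown.getSimpleName()).append(" TODO\n");
        }
        return stub.append(" */").toString();
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Defaults to a JDK interface so the sketch runs without any project code.
        Class<?> target = Class.forName(args.length > 0 ? args[0] : "java.lang.Runnable");
        for (Method method : target.getDeclaredMethods()) {
            if (Modifier.isPublic(method.getModifiers())) {
                System.out.println(stubFor(method));
                System.out.println(method.toGenericString() + "\n");
            }
        }
    }
}
```

Pass a fully qualified class name from your own classpath as the first argument to list its stubs. The stubs are only a to-do list; the value comes from someone who understands the method filling them in.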

Set some rules for all devs (old and new) to follow going forward.

Have code reviews to be sure everyone (old and new) is following the rules and not creating duplicate code.

Jason
++ for code reviews
thursdaysgeek
+19  A: 

I'd take a "campground approach" to the documentation. Starting now, institute an informal policy where current programmers should leave any code they look at / work on in a better state than when they arrived. That means adding javadoc to methods that they're currently working on, and commenting any code sections that they find themselves spending any significant amount of time re-understanding.

When you bring the new people in, you'll probably have them working on some subset of the code or on new projects that hang off the core. Basically, if the method they plan on using is undocumented, they can add the documentation for that method (after all, before they use it they should know what it does).

You'll also need to get some high-level documentation together for the new recruits. 2500 classes is a lot to digest in one go, so they'll at least need to see how the app is partitioned. Automated tools might help though my experience has been that they give too much detail.

Two things that would be great but might be hard to get approval for, time-wise:

  • pair programming -- pair newbies with experienced peers on their first couple of projects
  • TDD -- well-written tests are a great form of documentation.

roufamatic
+1 for the campground approach.
Kena
+1! If you see it's broken, try to fix it. New developers don't know your code - and they don't know if it's futile to try to change it! That can be an asset as much as a liability, and you only get to use it once per new employee. They don't have the apologist mindset for the code yet. Use that fresh look to fix things (until they get tired of cleanup). Then do the same thing with the next batch of new devs. Eventually you'll have something decent.
Jason
+1 for documenting as need be as things go along. That being said, if the code is not working well now, then it might be a good idea to stop and fix it before moving forward on the wrong grounds.
Shawn
+1  A: 

Set aside a day or so, get everyone in a conference room and explain the high-level structure of the system. Use simple diagrams on a whiteboard. (Preparing well for such a session will leave you with some form of high-level design document; which could become the core of the system documentation you currently lack).

Then have each new developer work with an experienced developer on their first number of tasks. This will slow you down in the short term, but it is the quickest way I found to get the new guys up to speed.

A discussion with a knowledgeable colleague is usually a much better learning experience than reading a bunch of text.

Dawie Strauss
+12  A: 

I recently trained a second developer on a project I have done alone until now and that I'll soon hand over. To put things into perspective, the project has around 50k LoCs and has been developed in a time frame of about 2 years. Releases are provided weekly and the software is constantly improved and features are added.

The codebase is well documented on the technical side, but little documentation exists on conventions, architecture and the business perspective. An additional challenge was that there were some "glitches" in the codebase that made certain parts difficult to understand (mostly due to overgeneralization/overengineering), and the new developer was unfamiliar with the technology stack and the business domain.

What we did to get "up to speed":

  • High Level Overview of the software, what does the software do and why (business perspective), Architecture and Technology Stack (technical perspective)
  • No special development environment setup necessary, everything included in the source control repository
  • Pair programming on new features, bug fixes
  • Business Process Documentation, "Users do what, why and how"
  • Moved from a horizontal partitioning scheme to a vertical one, structuring modules by use cases
  • Diagrams for the database, definition of aggregate roots for the domain
  • Pointing to training resources for the technologies involved
  • Meetings with the customer and end-users

To sum it up, here's what I learned from this experience (it's still going on and I consider it a very important challenge to learn from):

  • Don't underestimate how much it takes to understand the problem domain. Don't overemphasize the importance of the solution domain (i.e. technologies).
  • Explain important design decisions, take time to talk through conventions as you go by. If you notice they are not documented, document them.
  • Analyze why certain parts of your project are hard to understand: Is the problem inherently hard or is the solution too complicated?
  • I had infrastructure and domain code very well separated, but the infrastructure was not as lightweight as it could be. Keeping your infrastructure "lean" will make things easier for a new programmer.

When creating new documentation as you go, make sure you document the right thing. Documentation that is not in the code (external documentation) should stay abstract and explain concepts, not (implementation) details. If you don't adhere to this rule, you are bound to create legacy documentation right away. On the other hand, code documentation should explain all special cases that are handled and the implementation decisions made.
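As an illustration of code documentation that covers special cases and implementation decisions, here is a small made-up example. `InvoiceMath`, `roundToCents` and the business rules in the comment are hypothetical and not taken from the project described above; the point is only what kind of information belongs next to the code.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

class InvoiceMath {

    /**
     * Rounds an amount to whole cents.
     *
     * <p>Special cases:
     * <ul>
     *   <li>{@code null} is treated as zero, because (in this made-up system)
     *       legacy records leave the field empty.</li>
     *   <li>Negative amounts (credit notes) are rounded by the same rule as
     *       positive ones.</li>
     * </ul>
     *
     * <p>Implementation decision: uses {@link RoundingMode#HALF_EVEN} so that
     * rounding errors do not accumulate in one direction over many invoices.
     *
     * @param amount the amount to round, may be {@code null}
     * @return the amount with exactly two decimal places, never {@code null}
     */
    static BigDecimal roundToCents(BigDecimal amount) {
        if (amount == null) {
            return BigDecimal.ZERO.setScale(2);
        }
        return amount.setScale(2, RoundingMode.HALF_EVEN);
    }

    public static void main(String[] args) {
        System.out.println(roundToCents(new BigDecimal("2.345"))); // prints 2.34
        System.out.println(roundToCents(null));                    // prints 0.00
    }
}
```

None of this would make sense in an external architecture document, and none of it can be read off the method signature, which is why it lives in the code.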

Johannes Rudolph
+3  A: 

What strategies can we employ to bring the new developers up to speed?

Don't hire more developers.

More Developers != More Work

All too often I see companies wanting to get ahead quickly and thinking that hiring will do the trick. It won't; quite the opposite. Hiring more developers takes time away from your existing devs until the new ones are fully trained and up to speed. (For more, check out The Mythical Man-Month.)

rlb.usa
Who said what these new developers were going to work on? Maybe they are going to be working on a new set of problems as a separate team. They will be using a common core library, but might focus on two completely separate goals, e.g. the Windows team and the Office team. They both end up using the same core library of code and they affect each other; however, their problem domains are separate.
entens
I would extend this by pointing out that Organic Growth is what you're looking for. Doubling the size of a team almost never works. Adding low double digit percentages works. If your turnover is higher than that, you have problems that hiring won't fix. A team this size, I would say they should push for no more than 2 new devs this year, 3 next (<30% growth). If the company insists on hiring more people, then get some support staff. Nobody ever has enough testers, and even an Agile team can benefit from a build master - if you get someone with an Agile mindset. Don't forget better tools.
Jason
+2  A: 

Set them simple tasks and assign mentors to show them the ropes.

The best tasks to start with?

Get them to interrogate their mentor on a class till they understand it, and then write up the documentation for it. In this way, they will gain familiarity with the code and simultaneously start to attack your mountain of technical debt. Once they understand the code, you can assign them programming tasks on that code.

In addition, I'd recommend:

  • Mandate that all new code going forward is documented. That's how the most successful companies work. Why are they successful...?

  • Allocate as much time as possible (even just 1 hour a week) for all your devs to go back and document their existing systems and classes. Even the most basic documentation (system overviews etc) will make a huge difference to the maintainability (and therefore the ongoing cost of development) of your codebase.

It's not a cost. It's an investment. And it will pay off within a few months.

Documenting code not only makes it easier for teammates to understand and maintain (which makes development faster and lowers bug rates), it also forces people to think about their designs (by trying to "explain" them to someone else), which generally results in better designs (which makes development faster and lowers bug rates).

Jason Williams
"It's not a cost. It's an investment" Along those same lines, neglecting documentation is akin to going into debt. It's a time investment that you can put off, but you can't avoid it completely. The longer you wait to do it, the worse it will get. If management has a problem with the time involved, make them understand that it only gets worse.
bta
+2  A: 

IMHO, in most systems there are two kinds of things that really need to be documented:

1. Explanation of particularly complex or tricky algorithms.
2. Overall design and structure.

If you have a mass of undocumented code, my advice is, spend some time on #2.

In case that's not clear, let me explain.

Most systems have lots of code that's very straightforward. No one should need documentation to explain, for example, that function X is accepting a file name as a parameter, tacking a default directory on it, opening the file, and returning a handle to it. If that's not clear from reading the code, either the original author created a mess or the current reader is incompetent. There's no need to document this code. It drives me nuts when I see comments like "Open the file" or "x=x+1; // Add one to x". Like wow, thanks, I never would have figured that out without that help.

But most systems also have some amount of code that, even if the original author wrote it as clearly as he possibly could, is still difficult to understand. This is where good documentation helps. But if they haven't been documented, finding these blocks of code, figuring them out, and documenting them properly is a huge job.
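A small sketch of the difference between the two kinds of comments. The class and method names are made up; the midpoint trick is the well-known fix for integer overflow in binary search, used here only as an example of code that stays non-obvious no matter how cleanly it is written.

```java
class CommentExamples {

    // Noise: the comment repeats the statement and tells the reader nothing new.
    static int next(int x) {
        return x + 1; // add one to x
    }

    /**
     * Returns the midpoint of two non-negative indexes, rounded down.
     */
    static int midpoint(int low, int high) {
        // Why not (low + high) / 2? For large indexes the sum overflows int
        // and goes negative. The unsigned shift reads the overflowed bits as
        // an unsigned value, so the result is still the correct midpoint.
        return (low + high) >>> 1;
    }

    public static void main(String[] args) {
        int low = Integer.MAX_VALUE - 1;
        int high = Integer.MAX_VALUE;
        System.out.println((low + high) / 2);    // naive version: negative result
        System.out.println(midpoint(low, high)); // prints 2147483646
    }
}
```

The second comment is the kind worth hunting down and writing: it records a reason that the code alone cannot show.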

But #2 is manageable. Every system has (hopefully) some overall purpose and primary functions that can be described, like "The key functions are processing payroll, managing retirement benefits, and maintaining our enemies list. To process payroll, we read in the number of hours worked by each employee" etc. There should be some overall structure: major subsystems that interact with each other, etc. There can be key design philosophies: Here's how we select the next module to execute in a sequence. Here's how we select identifiers for tables. Etc.

If a few people who know the system well sit down and work on it for a few days, they could probably write up a lot of this high-level stuff, and that could be a big help to the new people.

Jay
+6  A: 

Whenever I am faced with a new code base, documented or not, I start like this:

1) End user experience. How do end users view/experience/use this product? What is the user interface? Is there more than one kind of user (e.g. user, admin, sysadmin, content author, etc)? If so, what is the perception of the product for each kind of user? What kind of install or run time configuration options are there? What do the typical sysadmin or upkeep tasks look like? Work through any existing end-user documentation.

2) Build process. What does it take to checkout a new tree, build, and package the product? This should be repeated for every platform that is supported by the product. If the answer is not "run this single command"...then there are issues for improvement.

3) Test Suite. What does it take to verify that the built/packaged product is correct? How can I run an automated test suite, or run sufficient hand tests, to know that the built product is correct?

4) High level architecture (Packaged/Installed). What is the arrangement of the application in a packaged/installed form? Where are the important files? What other software/hardware requirements are there for a successful installation?

5) High level architecture (Source/Build). What is the arrangement of the source code and makefiles or build tools? How does that arrangement map into the packaged/installed version of the product?

6) Low level walk through of "critical code". I want a "street level" view of the user interface, main libraries, API's, test code, etc, etc.

7) Change control or branching policy: What is the policy and procedures for getting a change into the product? Design docs? Bug tracker? Desk checks? Code reviews? Branching policy? QA involvement?

8) Initial coding tasks (with mentor & guidance):

  • Write a new test
  • Fix a minor bug (typo in message, better message, debug output, etc.)
  • Alter the packaging to include/exclude a file
  • Fix a larger bug

9) Repeat coding tasks until you are basically functional checking out the product, building, making changes, and getting those changes back into the source control.

10) Take ownership of some module/library/hunk of code. Reverse engineer and document the code, and present to the remainder of the team.

semiuseless
+1 This is virtually my process as well, especially the first few steps.
Jason
+6  A: 

I've been in situations where we've brought in a group of multiple new devs at a time, never with such a poorly documented codebase, though. A few thoughts:

  • We started a developer wiki. We uploaded some of our high level architecture, reqts & design docs into it, and then created a few pages that link to those documents. We ended up with 'New developer project intro', as well as a 'New developer environment set-up guide' pages. Give new devs a wiki account on their first day, and encourage them to edit/contribute. I often have my new devs tweak the dev env set-up guide as they're working through it.

    • Make sure your new devs get time/training they need on any new tools/libraries they'll be using. If you're using Spring (for example) & they're new to that, give them some time to go through a Spring tutorial of your choosing. That helps cut down misunderstandings that are just based on use of a new tool.

    • If it's possible, consider writing your own tutorial for a common dev task, or choose the best code to use as an example. If a new dev has got to write a new JSP page, what's the best place in your codebase to find a good example of how you want things done? Document that! Don't make the new devs search the code base to find samples.

    • Whiteboard, take photos of finished whiteboard sessions for future reference (post them on the wiki!)

    • Require all your current devs to help with documentation. Write diagrams for each high level module. Find & use a tool (or your IDE) to generate javadoc tags. Fill them out. Start with class level javadocs of your most used classes & interfaces. If you can't get a time block for this, have old devs do this for each existing class they use while doing development.

    • Involve your current devs & have them partner up with the new devs in their first assignment. Maybe not in a strict pair-programming sense, but old devs should be able to guide new devs to good code examples & approaches.

    • Hold both design & code reviews for all new code. Require some level of documentation for each, as much as your management will allow time for.

    • No new code should go undocumented or untested.

    • If they're not totally junior, consider having the new devs fix a few (relatively easy) bugs before they do any new development.

elduff
+1 for the Wiki. You could also use a forum which also encourages asking questions and collaboration.
Shawn
A: 

Set the new guys to fixing bugs in old code. This will give them relatively short, bounded-length problems that they can solve. Learning the code will be a side-effect of this, and they'll also know enough to document some things as they go. By the way, this is what Fog Creek does to get people started right away.

The problem is that, "go learn the code base," even in the best-documented project ever, will never work. When are you done learning everything? How do you show it? Then when the same dev actually tries to go back and write something, they find out, "oh, I misunderstood this. I wasted all that time. Oh no..." It's depressing, and it does waste time.

By and by, "whatever you change, leave the code looking better than when you found it." This should go for everyone on the project; now that you're getting bigger, it's important.

Andres Jaan Tack
A: 

Not trying to be flippant, but set the expectation with management that these developers will be counterproductive for the duration of the legacy discovery period.

That is the period of time during which they can still state that they learned something new about the legacy code / architecture.

Just like all the other documentation recommendations, I suggest you set up a wiki to document each of these aha moments and measure the duration between updates.

Consider the developers to be fully productive (or totally saturated) when the 'new discovery' rate falls to less than once a week.
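A rough sketch of how that measurement could be done, assuming you can export the dates of a developer's wiki entries. The class name `DiscoveryLog`, the trailing-window rule and the sample dates are all assumptions made for illustration; "less than once a week" can be read in other ways.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;
import java.util.ArrayList;
import java.util.List;

class DiscoveryLog {

    /** Days between consecutive wiki entries; the dates must be sorted ascending. */
    static List<Long> gapsInDays(List<LocalDate> entryDates) {
        List<Long> gaps = new ArrayList<>();
        for (int i = 1; i < entryDates.size(); i++) {
            gaps.add(ChronoUnit.DAYS.between(entryDates.get(i - 1), entryDates.get(i)));
        }
        return gaps;
    }

    /** True when fewer than one entry per week was logged over the trailing window. */
    static boolean belowOncePerWeek(List<LocalDate> entryDates, LocalDate today, int windowWeeks) {
        LocalDate windowStart = today.minusWeeks(windowWeeks);
        long recent = entryDates.stream()
                .filter(date -> date.isAfter(windowStart) && !date.isAfter(today))
                .count();
        return recent < windowWeeks;
    }

    public static void main(String[] args) {
        // Made-up entry dates for one developer.
        List<LocalDate> entries = List.of(
                LocalDate.of(2024, 1, 2), LocalDate.of(2024, 1, 4),
                LocalDate.of(2024, 1, 9), LocalDate.of(2024, 2, 1));
        System.out.println(gapsInDays(entries)); // prints [2, 5, 23]
        System.out.println(belowOncePerWeek(entries, LocalDate.of(2024, 2, 20), 4)); // prints true
    }
}
```

Looking at a window of several weeks rather than a single gap avoids declaring someone "done" just because they took a week of vacation.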

Stevko
A: 

When I am thrown into an existing project with a lot of undocumented code, I run Doxygen on the codebase. Even if the code isn't set up for it, the utility will still provide a lot of useful information about the relationships between objects, what functions call what other functions, etc etc.

Oh, and if management "won't allow" you to spend the time to document the code, just remind them that for every hour you spend documenting the code, you are saving other team members hours of frustration. Management can sometimes be too near-sighted to realize how big the ROI is for code documentation, especially if they were never coders themselves. If your classes are set up for unit testing, you can have your new hires take one class at a time (start with the simple ones), use the unit tests to try and figure out (working together) what the class is supposed to do, and write up as much documentation for that class as they can. That way, they're learning the codebase while helping solve the problem (you'll want to have some of the experienced devs check over it, of course).
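If a class has no tests yet, new hires can write tests that simply record what the code does today, which doubles as documentation. Below is a minimal sketch in plain Java so it runs without a test framework; in a real project you would presumably use whatever framework the team already has. `LegacyNameFormatter` is a hypothetical stand-in for an undocumented class.

```java
import java.util.Locale;
import java.util.Objects;

/** Records the observed behaviour of LegacyNameFormatter, one check per finding. */
class LegacyNameFormatterCharacterization {

    /** Fails with a readable message when the observed value differs from the recorded one. */
    static void check(String description, Object expected, Object actual) {
        if (!Objects.equals(expected, actual)) {
            throw new AssertionError(description + ": expected " + expected + " but got " + actual);
        }
    }

    public static void main(String[] args) {
        // Each description states what was learned by running the code, not a spec.
        check("surname is upper-cased and placed first",
                "LOVELACE, Ada", LegacyNameFormatter.format("Ada", "Lovelace"));
        check("missing surname falls back to the first name",
                "Ada", LegacyNameFormatter.format("Ada", null));
        check("empty surname is treated like a missing one",
                "Ada", LegacyNameFormatter.format("Ada", ""));
        System.out.println("all characterization checks passed");
    }
}

/** Hypothetical stand-in for an undocumented legacy class. */
class LegacyNameFormatter {
    static String format(String first, String last) {
        if (last == null || last.isEmpty()) {
            return first;
        }
        return last.toUpperCase(Locale.ROOT) + ", " + first;
    }
}
```

The check descriptions are the documentation: an experienced dev can review them and say "yes, that is intended" or "no, that is a bug", which is much faster than explaining the class from scratch.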

bta
A: 

I don't believe in the power of so called "documentation" within or around the code. These comments tend to be inconsistent with the reality of the code, especially, if the comments are written at a much later date than the code itself. You could make a small and not too intrusive start with doxygen (or whatever documentation tool you like) by marking all important classes in the header files with /// so that you can get a rough overview generated from the code.

I believe much more in the approach suggested by Johannes Rudolph above. Perhaps you could have the current lead developers give a 1-hour introduction to their part of the code and record these sessions with a camcorder so that

  • the new coders can review the most important parts
  • the ideas remain with the company if in some future the lead developers leave the company or the division

From my experience, pair programming generates the deepest insight into existing code. Perhaps you could even use the chance and turn to agile programming methods - 7 old programmers and 4 or 5 new programmers could be assembled into, let's say, 6 teams. (Please bear in mind: I do not recommend introducing scrum at this point, as the performance of the newly assembled teams cannot be predicted and the initial sprints could create severe frustration among the new programmers). Re-forming the development teams would make all programmers more "equal" and could generate more motivation for all. But this would require backing from the management ...

It would be interesting to hear from you in some weeks time to see which path you followed and if you were successful.

Jens
A: 

Kill two birds with one stone. Make your new developers read through the code and develop the documentation you should have been making from the start. When they get done they will be well versed in all of your code and all of your code will be well documented!

typoknig
A: 

Yes, many of us have been there. There are reasonable solutions.

Having undocumented code may be a problem, but it's not the problem. The problem is-- as you have stated-- getting more people trained and productive on the codebase.

In this situation I have had success:

  • Pair programming is an excellent way to transfer knowledge. Stick new developers with current developers and you'll be surprised how quickly they come up to speed.

  • Screen for candidates that are committed long-term. Explain the situation and ask and expect developers to stick around long term. We did this and had very low turnover on the development team, so the documentation issues were not that important.

  • Get new developers involved with the customers to make sure the developers know the domain. We had success involving developers in on-site training very early on in their work at the company. They learned the domain and the software and the customers all at the same time. Trying to figure out the overall goals of the system is hard, but once you understand the overall goals, you can reason about how specific code should work and the lack of documentation won't matter as much.

  • When you work on documentation, focus on high-level questions. Developers should be able to reason through undocumented code in the micro as long as they have some sort of framework to fit it into.

  • Keep asking the developers what the most efficient way to train people is. Is it better for them to write a system architecture document, or do a 1-hour talk without prep? Re-evaluate periodically.

ndp
A: 

It's never too late to start! That being said, this is no doubt an uphill task.

I would suggest that you start off with the basics: point the developers to the general design and code patterns and have them sniff their way through.

Make sure that they pay for this vacation - in kind! Have them document the stuff, and you oldies can review it (the price you pay for not doing the dirty work from the beginning).

I'm sure that if you do this exercise, your entire team will end up knowing the stuff inside and out.

Srikanth Venugopalan