I've just inherited a large project previously coded by about 4-5 people. The documentation consists of comments, and is not very well written. I have to get up to date on this project. How do I start? It consists of many different source files. Do you just dig in? Are there tools that can help visualize the structure/flow?
I would encourage you to buy and read this book thoroughly. It provides you a LOT of information in this regard, much more than you will find here.
Use the debugger to walk through the application. That will let you go both deep and wide. You'll also be able to learn about how the code handles specific scenarios.
When you're ready to change something as @Jaxidian said, Working Effectively with Legacy Code is a great resource.
If you have a chance, I'd try and talk to the original designers and developers. Ask them about any major design issues or shortcomings of the project. Is the project in good shape and only needs maintenance or are there major components that need to be added or reworked? What are going to be the biggest roadblocks to maintaining the project? Take one or two of them to lunch (separately) if you have a budget for it as they might be more free to talk about problems outside of the office.
Talking to the users is also important for getting a feel for the current status of the project. Quite often they have a different opinion of how things stand then the developers do. Make sure, however, that they don't start giving you a list of all the things they want added or changed - you should take a few weeks to understand the project before you can start making major changes to it.
As for visualization tools, I'd start with the database design if there is a database involved. Tools like Microsoft Visio can create a diagram from an existing database. I find knowing the design of the database helps me wrap my head around what the programmers were trying to accomplish. Visio is also good for documenting program flow with some basic flowcharts though you'll have to create them yourself - it doesn't generate them automatically as far as I know.
Good luck.
Try to find the starting point of the system and start digging from there. It sort of sucks to be in that situation, and chances are the comments might not be that helpful either. If the original developers didn't bother (or didn't have the chance) to document, chances are they never kept the comments up to date with code changes.
So time to bring the shovel... but don't just dig in blindly. One thing that is important is to understand what the system does from a users' perspective.
Concurrent with your code digging, you need to meet with a user (or the users' liason) and have him walk through the system, showing you how it is supposed to be used, for what purpose and what it and its subsystems are supposed to do. Moreover, attempt to understand what are the business pre-conditions and post-conditions of each major operation performed with this system.
Then map (or do a hierarchical) chart of the main functions of the system; classify them by category, purpose or module. If the system performs some sort of work flows or business transactions, attempt to chart some sort of state/transition diagram documenting each (and cross-referencing each state/transition to the subsystem or module in the system that is in charge for it.)
Once you have that, you can dig according to function. It will be best if you dig for a specific purpose, say, there is a bug fix to implement. You locate the logical module or category pertaining to that bug fix, you have the pre-conditions and post-conditions; then you can dig precisely on (or around) that bug fix.
If you just dig in without a guide (at least a high level one), you can be digging for months without getting anywhere (I'm telling you from painful experience.)
If there is no user manual, implement a draft according to your meetings with the users/users' liason. That could serve as a guide for implementing a developer's/administrator's manual for the system you just inherited (if there is ever a chance to implement one.)
If code is not on source control, put it on it. Doesn't matter what SCS you pick (could even be CVS, yuck!) What matters is to put it under source control asap.
Those developers didn't exist in a vacuum, they must have had exchanged emails. Identify other tech liasons they work with. Attempt to identify what other systems, if any, this system interfaces to (.ie. your databases, other's peoples databases, cron jobs, etc.)
But this could come at a later time. I think you should, for starters, focus on understanding how to use the system and what it is for. Let's call it understanding its business/knowledge architecture. Then dig according to that... or better yet, according to that and with the purpose of fixing a bug.
Good luck.
Brainstorming a little for you:
Step around in the application with a debugger, use a Static Code Analysis tool for which ever language you are working with...
Talk with people - both developers AND USERS to get a feel of the application.
Review the issue tracking system to see if you can see any recurring types of problem...
I was recently in a similar situation. What helped in my case was focusing on the changes I needed to perform on the project, and in the process of making those changes I learned about how the project is structured and so on. Sure, the first few tasks took a bit longer, but look on the bright side: I got stuff done and I got familiar with the project at the same time.
Are there tools that can help visualize the structure/flow?
The latest Visual Studio 2010 allows you to generate architecture diagrams.
http://ajdotnet.wordpress.com/2009/03/29/visual-studio-2010-architecture-edition/
I'd suggest two things that may help:
Be productivity-driven. In other words, find a change that needs doing and use this to learn how that bit of the system works. Your changes may not be the most elegant without a whole-picture understanding of the software, but you will get work done within days/weeks.
Follow things from the user-interface. I.e if a change involves things a user does on a dialog, find that dialog in the code (relatively easy) and then work backwards to see what bits of the code provide data to the dialog, how the dialog interacts with the system, etc. Trying to find "where does X happen in the code" is very hard without good documentation, but finding "where is the code relating to this dialog" is quite easy and gives you an entry-point into the code.
Whenever I start a new project, I spend 2-3 days skim reading the code and making notes. I basically go through the entire solution from top to bottom and make a map in a text editor of each (significant) class in each project and what it appears to do.
The aim in doing this is not to completely understand the entire codebase, so don't worry if you feel you are not getting your head around it completely. The aim is that you end up with an index of where to go when you need to start on your first piece of work. You should also end up with a cursory picture of the solution in the back of your brain that will get filled in over the next couple of months. I always do this on the first few days as your superiors will not expect you to be productive during this time and you may never get another opportunity where you have the time to do so.
Also, do not rely on code comments for direction. Even with the best intentions they are often unmaintained and may lead to incorrect conclusions about what a class or section of code may do: a comment may lie but the code always tells the truth.
- Use Profiler to see main functions and events in your project (the fastest way to learn framework)
- Learn business logic very well to better understand the code
- Documenting every new thing you learn - setup wiki (you will be surprised how quickly things are forgotten)
- You can use Visio to draw Database Model Diagrams. (keep them close to you while studying the code)
These are the things that helped me when I inherited the previous project (50+ developers, 70+ GB database, 1 GB of source code and not even a single line of comments in code (maybe few :), and everything written in foreign language )
Presuming there is a database, start with the data model. Somewhere (Mythical Man-Month?) it was written "if I have your tables, I don't need to see your code."
Regarding potential tools, you may want to look into NDepend. It is a code-analysis tool, with an emphasis on highlighting the internal organization and dependencies of the code base (see this post for typical outputs), and spotting code quality issues. I have not used it personally, but Patrick Smacchia, one of the developers of the product, has a few posts where he applies NDepend to some classic apps (here is NUnit for instance) and discusses what it means, and I found them interesting.
Go and speak to the users or, read the manual and / or if one exists, go on a training course for the system (internal training departments will sometimes have put them together if there are lots of users).
If you don't know what it's meant to be doing then the chances of you being able to work out how it does it are close to zero.