What is the best approach to reading and understanding the code of a large scale project?

views:

159

answers:

+4 Q:

What is the best approach to reading and understanding the code of a large scale project?

Say you decide to work on Open Source Software and you are faced with a project with millions of lines of source code. What do you do? What are the best methods for tackling this project and getting a good understanding in the shortest possible amount of time? I'm constantly facing new code bases and am curious to find out what approaches other developers have.

+3 A:

Usually, I need a bug fixed, so I either start from a stack trace, search Google, or ask on the developer list where to start. Then I work from there.

Aaron Digulla 2009-10-05 14:27:23

+2 A:

Get your hands dirty with it: inspect the unit tests if there are any, or write unit tests to verify your expectations of individual pieces of code. After roughly understanding most pieces of the code and what they do, check for integration tests (again, if there are any) to see how these pieces are "wired" together to do things.
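As a rough illustration of writing tests to verify your expectations, here is a minimal sketch in Python using the standard `unittest` module. The function `normalize_name` is a hypothetical stand-in for some piece of the unfamiliar code base; in practice you would import the real function instead of defining one.

```python
import unittest


def normalize_name(raw):
    """Hypothetical stand-in for a function found in an unfamiliar code base."""
    return " ".join(raw.strip().split()).title()


class NormalizeNameExpectations(unittest.TestCase):
    """Each test records one belief about the code; a failure means the belief was wrong."""

    def test_collapses_inner_whitespace(self):
        self.assertEqual(normalize_name("  ada   lovelace "), "Ada Lovelace")

    def test_blank_input_becomes_empty_string(self):
        self.assertEqual(normalize_name("   "), "")


if __name__ == "__main__":
    # Run the expectations explicitly so the script does not depend on command-line arguments.
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(NormalizeNameExpectations)
    unittest.TextTestRunner(verbosity=2).run(suite)
```

A failing test here is not a bug report; it just tells you that your mental model of the code needs correcting.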

nkr1pt 2009-10-05 14:28:56

+2 A:

Look here: Hired as a developer to maintain and update current code base, no docs!

A lot of very good advice was given there.

Developer Art 2009-10-05 14:30:18

+1 A:

As a developer who recently started working on a large community source project (Kuali), I feel your pain. The project team should have created a developer orientation manual, and it should include how to set up your environment to work on the project. Read this type of documentation. Join the mailing lists. The next step is to start taking small bugs to fix. This will give you a chance to dig into the code and start making minor changes. From there it is a gradual progression of understanding. At least that is what I have done. If there are ever any developer or community face-to-face meetings, try to go to them; they are incredibly helpful and can help you get to know your team better.

-Jay

Jay 2009-10-05 14:32:53

+1 A:

I worked for 5 years in 3rd-level support on a system, which means fixing everybody's bugs. It is a very well-known ETL package, supporting legacy databases and files and almost every relational database and platform. It was written in mainframe (IBM) assembler, C++, C, VC++, etc.

The technical documentation and comments were old and not maintained (the system changed very fast to add new features), and the developers could not offer much help (they were very busy adding the new features).

What I found was that knowing what the product was supposed to do was crucial. The best people for this were the QA group, who read the user manuals and tested the product comprehensively.

After that, it was reading the code and then putting in extra displays/logs to see what was going on in the area of interest.
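To give an idea of what "putting in extra displays/logs" can look like in a modern setting, here is a small Python sketch of a tracing decorator. It is only an illustration: the name `traced` and the function `convert_record` are hypothetical, and the original system was not written in Python.

```python
import functools
import logging

logging.basicConfig(level=logging.DEBUG, format="%(levelname)s %(name)s: %(message)s")
log = logging.getLogger("trace")


def traced(func):
    """Log the arguments and the result (or exception) of every call to func."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        log.debug("-> %s args=%r kwargs=%r", func.__name__, args, kwargs)
        try:
            result = func(*args, **kwargs)
        except Exception:
            # log.exception also records the traceback before re-raising.
            log.exception("!! %s raised", func.__name__)
            raise
        log.debug("<- %s returned %r", func.__name__, result)
        return result
    return wrapper


@traced
def convert_record(record):
    """Hypothetical function in the area of interest: lower-cases the keys."""
    return {key.lower(): value for key, value in record.items()}


convert_record({"ID": 7, "Name": "x"})
```

The advantage of a decorator over scattered print statements is that it is trivial to remove again once you understand the area.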

Sometimes, if it was an intermittent problem and data related, you could not recreate the issue. You HAD to run the code through your head to see what was happening, especially if there were multiple threads involved. Once you figured it out, you could then artificially recreate it to prove it.
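As an example of artificially recreating a threading problem once you have worked it out in your head, here is a hedged Python sketch. It assumes the suspected bug is a "lost update" (two threads read a value before either writes it back); the names `reproduce_lost_update` and `unsafe_increment` are made up for illustration. A `threading.Barrier` forces the bad interleaving every time instead of waiting for it to happen by chance.

```python
import threading


def reproduce_lost_update():
    """Force two threads to read a shared counter before either one writes it back."""
    counter = {"value": 0}
    both_have_read = threading.Barrier(2)

    def unsafe_increment():
        current = counter["value"]          # step 1: read
        both_have_read.wait(timeout=5)      # hold here until the other thread has also read
        counter["value"] = current + 1      # step 2: write, based on a stale read

    threads = [threading.Thread(target=unsafe_increment) for _ in range(2)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter["value"]


if __name__ == "__main__":
    # Two increments ran, but one update was lost.
    print(reproduce_lost_update())  # prints 1, not 2
```

Once the failure is reproducible on demand like this, you can apply a fix (for example a lock around the read and the write) and prove that it works.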

So - get friendly with the testers, who know what the product should do.

Get used to reading code (I used to read mainframe dumps to follow code at one time, and have even debugged and zapped machine code without the source). Try to think the way the computer treats the program and follow the line of logic through. It is not as hard as you think and makes for an excellent puzzle.

Oh, and global search of the source is also very, very useful.
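Tools like grep or your IDE's "find in files" do this already, but if you want something scriptable, a global search is only a few lines of Python. This is a minimal sketch; the function name `search_source` and the default list of file extensions are my own assumptions, so adjust them to the project.

```python
import os
import re


def search_source(root, pattern, extensions=(".py", ".c", ".h", ".cpp", ".java")):
    """Yield (path, line_number, line) for every source line under root matching pattern."""
    regex = re.compile(pattern)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(extensions):
                continue  # skip files that are not source code
            path = os.path.join(dirpath, name)
            # errors="replace" keeps the search going on files with odd encodings.
            with open(path, encoding="utf-8", errors="replace") as handle:
                for number, line in enumerate(handle, start=1):
                    if regex.search(line):
                        yield path, number, line.rstrip()
```

For example, `list(search_source(".", r"\bparse_header\b"))` would list every use of a (hypothetical) identifier `parse_header` below the current directory.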

2010-08-17 09:08:36
