views:

263

answers:

7

I have a source of python program which doesn't have any documentation or comments. I did tried twice to understand it but most of the times I am losing my track, because there are many files. What should be the steps to understand that program fully and quickly.

A: 

I'd start with a good python IDE. See the answers for this question.

Stephen C
+5  A: 
  • In past I have used 'Python call graph' to understand the source structure
  • Use a debugger e.g. pdb to wak thru the code.
  • Try to read code again after one day break, that also helps
Anurag Uniyal
+4  A: 

I would recommend to generate some documentation with epydoc http://epydoc.sourceforge.net/ . For sure, if no docstring exists, the result will be poor but it will give you at least one view of your application and you'lle be able to navigate in the classes more easily.

Then you can try to document by yourself when you understand something new and then regenerate the docs again. It is never too late to start something.

I hope it helps

luc
+2  A: 

I have had to do a lot of this in my job. What works for me may be different to what works for you, but I'll share my experience.

I start by trying to identify the data structures being used and draw diagrams showing the relationships between them. Not necessarily something formal like UML, but a sketch on paper which you understand which allows you to see the overall structure of the data being manipulated by the program. Only once I have some view of the data structures being used do I start to try to understand how the data is being manipulated.

Secondly, for a large body of software, sometimes you need to just attack bite sized pieces at first. You won't get an overall understanding straight away, but if you understand small parts in detail and keep chipping away, eventually all the pieces fall together.

I combine these two approaches, switching between them when I am getting overly frustrated or bored. Regular walks around the block are recommended :) I find this gets me good results in the end.

Good luck!

James
+9  A: 

Michael Feathers' "Working Effectively with Legacy Code" is a superb starting point for such endeavors -- not particularly language-dependent (his examples are in several non-python languages, but the techniques and mindset DO extend pretty well to Python and just about any other language).

The key focus is, that you want to understand the code for a reason -- modifying it and/or porting it. So, instrumenting the legacy code -- with batteries and scaffolding of tests and tracing/logging -- is the crucial path on the long, hard slog to understanding and modifying safely and responsibly.

Feathers suggests heuristics and techniques for where to focus your efforts and how to get started when the code is a total mess (hence "legacy") - no docs, or misleading docs (describing something quite different, maybe in subtle ways, from what the code actually DOES), no tests, an untestable-without-refactoring tangle of spaghetti dependencies. This may seem an extreme case but anybody who's spent a long-ish career in programming knows it's actually more common than anyone would like;-).

Alex Martelli
+1 I wonder how many rep points have been handed out for saying "Go read Working Effectively with Legacy Code"?
Nader Shirazie
+2  A: 

You are lucky it's in Python which is easy to read. But it is of course possible to write tricky hard to understand code in Python as well.

The steps are:

  1. Run the software and learn to use it, and understand it's features at least a little bit.
  2. Read though the tests, if any.
  3. Read through the code.
  4. When you encounter code you don't understand, put a debug break there, and step through the code, looking at what it does.
  5. If there aren't any tests, or the test coverage is low, write tests to increase the test coverage. It's a good way to learn the system.
  6. Repeat until you feel you have a vague grip on the code. A vague grip is all you need if you are going to manage the code. You'll get a good grip once you start actually working with the code. For a big system that can take years, so don't try to understand it all first.

There are tools that can help you. As Stephen C says, an IDE is a good idea. I'll explain why:

Many editors analyses the code. This typically gives you code completion, but more importantly in this case, it makes it possible to just just ctrl-click on a variable to see where it comes from. This really speeds things up when you want to understand otehr peoples code.

Also, you need to learn a debugger. You will, in tricky parts of the code, have to step through them in a debugger to see what the code actually do. Pythons pdb works, but many IDE's have integrated debuggers, which make debugging easier.

That's it. Good luck.

Lennart Regebro
A: 

Enterprise Architect by Sparx Systems is very good at processing a source directory and generating class diagrams. It is not free, but very reasonably priced for what you get. (I am not associated with this company in any way, I've just been a satisfied user of their product for several years.)

Paul McGuire