I'm starting a new project in a language I'm less familiar with (FORTRAN) and am in the 'discovery' phase. Normally reading through and figuring out code is a fairly simple task, however, this code is rather large and not so structured. Are there any methods/tips/tricks/tools to mapping out 50k lines of rather dense code?
I always find the starting point of execution, or where other code (that I'm not working on) calls the code I'm examining. Then I just start reading through it from there, following method calls as necessary to figure out what's going on.
The debugger is your friend - if you have one for Fortran.
Before you go too much further I would familiarise yourself with the basic syntax of the language plus any foibles like assumptions about variable types from their names and positions of declarations etc. If you don't get that stuff then you are likely to get very lost even with a helpful debugger.
Remember as well that structure is sometimes language dependent. The code you are looking at may be badly structured for the languages you are used to but may be very well structured for Fortran, which has its own set of peculiarities. I think I am just saying have an open mind to start with otherwise you'll be carrying around the unnecessary predisposition that the code you are looking at is bad. It may be, but it may just be something you are not used to.
Best of luck. I rather liked Fortran when I programmed in it for a living about 20 years ago, and it is still the language of choice for some applications because of computation speeds on some platforms. Still quite a lot of it in academia.
When I coded Fortran (F77) thirty (yikes!) years ago, we had limited facilities to automatically flowchart an unknown codebase. It was ugly, and limited to the real-estate that a plotter bed could supply. As @Simon mentions, you can (and could back then, with some versions) also use a debugger.
Now, interactive exploration is easier. Additionally, you can experiment with IDEs. I have not personally tried it, as Fortran is no longer my key development language, but Photran is an Eclipse plug-in for Fortran, and appears to be under active development (last release was this month).
Hi
Take heart. One of Fortran's virtues is that it is very simple. Unless you find a code which has been programmed to take advantage of 'clever' tricks. I suggest that the first thing you do is to run your program through a compiler with syntax-checking and standards-compliance turned up to the max. Old (pre-Fortran 90) FORTRAN is notorious for the clever tricks that people used to get round the language's limitations. Some of the gotchas for programmers more familiar with modern languages:
-- common blocks; and other mechanisms for global state; especially bad are common blocks which are used to rename and redefine variables;
-- equivalences (horrid, but you might trip over them);
-- fixed-format source form;
-- use of CONTINUE statement, and the practice of having multiple loops ending at the same CONTINUE statement;
-- implicit declaration of variables (to sort these out, insert the line IMPLICIT NONE at immediately after the PROGRAM, MODULE, SUBROUTINE or FUNCTION statement everywhere they occur);
-- multiple entry points into sub-programs;
-- and a few others I'm so familiar with I can't recall them.
If these mean nothing to you, they soon will. And finally, you might want to look at Understand for Fortran. It costs, but it's very useful.
Regards
Mark
Is it Fortran IV (unlikely), 77, or 90/95? The language changed a lot with these revisions. Some of the gotchas that High-Performance Mark listed were common in Fortran IV but uncommon in 77, while others were still common in 77. The suggestion to check the code with maximum compiler warnings is excellent -- even use two compilers.
I'd start by diagramming the subroutine structure, and getting a high-level view of what they do. Then get a deeper understanding of the areas that need to be changed.
There are tools for analyzing and even improving Fortran code, e.g., http://www.polyhedron.com/pf-plusfort0html or http://www.crescentbaysoftware.com/vast%5F77to90.html.
Are you running on Linux or OpenSolaris? If so, the Sun Studio Fortran compiler is one of the best. And the Sun Studio IDE understands Fortran and comes with a debugger. http://developers.sun.com/sunstudio
I'm sleepy so I'll be short :)
Start by grepping out (or whatever tool you use) program statements, subroutine statements, function statements and the like. Maybe modules and such if f90 is used. Draw a kind of diagram based on that, on which you'll see what calls what (what uses what subroutines, functions and the like).
After you've got a general view of the situation, get to data. Fortran requires everything to be declared at the start of program/subroutine ... so the first lines should give you declarations. After you've putted those in the diagram you just made, you should have a very clear situation by that time.
Now, the next step depends really on what you want to do with it.
@ldigas: "Fortran requires everything to be declared at the start of program/subroutine ... "
No, unless IMPLICIT NONE appears at the start of a routine, Fortran uses implicit typing. So variable names starting with A-H and O-Z are typed REAL, and I-N are INTEGER. (Which makes GOD a REAL variable, and INT, luckily, an INTEGER, implicitly.) Remember, Fortran (actually back then it was FORTRAN) was designed by scientists and mathematicians, for whom i, j, k, l, m, and n were ALWAYS integers.
With IMPLICIT NONE, you are forced to explicitly type all variables, as you would in C, C++, or Java. So you could have INTEGER GOD, and REAL INT.