views:

236

answers:

8

I'm starting a new project in a language I'm less familiar with (FORTRAN) and am in the 'discovery' phase. Normally reading through and figuring out code is a fairly simple task; however, this code is rather large and not so structured. Are there any methods/tips/tricks/tools for mapping out 50k lines of rather dense code?

+1  A: 

I always find the starting point of execution, or where other code (that I'm not working on) calls the code I'm examining. Then I just start reading through it from there, following method calls as necessary to figure out what's going on.

Kaleb Brasee
In addition, if you can get the code in a 'test' type environment, I like to execute the code with known data and plenty of breakpoints/stdout type calls to follow through at least one execution path. Make sure you are isolated from anything real; you don't want to inadvertently alter production info!
Edward Leno
Thank you both, this is how I've been starting. Fortunately there are a few test cases I can run.
ccook
+2  A: 

The debugger is your friend - if you have one for Fortran.

Before you go too much further I would suggest you familiarise yourself with the basic syntax of the language, plus any foibles like assumptions about variable types from their names, positions of declarations etc. If you don't get that stuff then you are likely to get very lost even with a helpful debugger.

Remember as well that structure is sometimes language dependent. The code you are looking at may be badly structured for the languages you are used to but may be very well structured for Fortran, which has its own set of peculiarities. I think I am just saying: have an open mind to start with, otherwise you'll be carrying around the unnecessary predisposition that the code you are looking at is bad. It may be, but it may just be something you are not used to.

Best of luck. I rather liked Fortran when I programmed in it for a living about 20 years ago, and it is still the language of choice for some applications because of computation speeds on some platforms. Still quite a lot of it in academia.

Simon
Thank you, especially for the comment on structure. I definitely have a bias towards C#'s style.
ccook
... still quite a lot of it in the industry also ;)
ldigas
+2  A: 

When I coded Fortran (F77) thirty (yikes!) years ago, we had limited facilities to automatically flowchart an unknown codebase. It was ugly, and limited to the real-estate that a plotter bed could supply. As @Simon mentions, you can (and could back then, with some versions) also use a debugger.

Now, interactive exploration is easier. Additionally, you can experiment with IDEs. I have not personally tried it, as Fortran is no longer my key development language, but Photran is an Eclipse plug-in for Fortran, and appears to be under active development (last release was this month).

Don Wakefield
Thank you, I will give Photran a shot. I've been using vim and make so far, not too terribly bad. And wow, thirty years?
ccook
@ccook: Yeah, rounding up, but not *that* far (not far enough!) ;^)~
Don Wakefield
@Don I remember looking into one code (tsfoil2) and finding out it was written in 1975. Blew my mind that the code is still maintained after 35 years! So, what language did you start in, and what are you currently working in? With all the changes over the years, in the end does the coding still 'feel' the same??
ccook
I started with Pascal and assembly language, self-taught. Then I did a stint at school, using various languages, including Fortran IV, and ended up working for an employer who used Fortran 77. Honestly, it seemed old even then, and I wanted something more modern. I moved to another employer who was just starting to use C++ (CFront!) and have been doing it ever since. The current product I'm working on is twenty years old, but very vital, being enhanced, revamped, all the time. Does coding still 'feel' the same? For some values of 'feel', yeah. ;^)~
Don Wakefield
wow, very cool :)
ccook
+2  A: 

Hi

Take heart. One of Fortran's virtues is that it is very simple, unless you find a code which has been programmed to take advantage of 'clever' tricks. I suggest that the first thing you do is to run your program through a compiler with syntax-checking and standards-compliance turned up to the max. Old (pre-Fortran 90) FORTRAN is notorious for the clever tricks that people used to get round the language's limitations. Some of the gotchas for programmers more familiar with modern languages:

-- common blocks; and other mechanisms for global state; especially bad are common blocks which are used to rename and redefine variables;

-- equivalences (horrid, but you might trip over them);

-- fixed-format source form;

-- use of CONTINUE statement, and the practice of having multiple loops ending at the same CONTINUE statement;

-- implicit declaration of variables (to sort these out, insert the line IMPLICIT NONE immediately after the PROGRAM, MODULE, SUBROUTINE or FUNCTION statement everywhere they occur);

-- multiple entry points into sub-programs;

-- and a few others I'm so familiar with I can't recall them.

If these mean nothing to you, they soon will. And finally, you might want to look at Understand for Fortran. It costs, but it's very useful.
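As a rough first pass over the gotchas listed above, a small script can report where COMMON, EQUIVALENCE and ENTRY statements occur, and where IMPLICIT NONE is already present. This is only a sketch in Python under stated assumptions: fixed-form source (comment marker in column 1), one statement per line, no handling of continuation lines, and a made-up sample routine. It is no substitute for a compiler's own diagnostics.

```python
import re

# Each pattern allows an optional numeric statement label before the keyword.
PATTERNS = {
    "COMMON": re.compile(r"^\s*\d*\s*common\b", re.IGNORECASE),
    "EQUIVALENCE": re.compile(r"^\s*\d*\s*equivalence\b", re.IGNORECASE),
    "ENTRY": re.compile(r"^\s*\d*\s*entry\s+\w+", re.IGNORECASE),
    "IMPLICIT NONE": re.compile(r"^\s*\d*\s*implicit\s+none\b", re.IGNORECASE),
}


def scan(source):
    """Return {keyword: [line numbers]} for statements matching PATTERNS."""
    hits = {name: [] for name in PATTERNS}
    for lineno, line in enumerate(source.splitlines(), start=1):
        # Assumption: fixed-form comments have C, c, * or ! in column 1.
        if line[:1] in ("C", "c", "*", "!"):
            continue
        for name, pattern in PATTERNS.items():
            if pattern.match(line):
                hits[name].append(lineno)
    return hits


# Hypothetical sample, invented for illustration only.
SAMPLE = """\
      SUBROUTINE A(X)
      COMMON /BLK/ P, Q
      REAL W(10)
      INTEGER K(10)
      EQUIVALENCE (W, K)
      X = P + Q
      RETURN
      ENTRY A2(X)
      X = P - Q
      END
C     COMMON in a comment is skipped
      SUBROUTINE B
      IMPLICIT NONE
      END
"""

for keyword, lines in scan(SAMPLE).items():
    print(keyword, lines)
```

Comparing the IMPLICIT NONE line count with the number of program units gives a quick idea of how much of the code still relies on implicit typing.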

Regards

Mark

High Performance Mark
Thank you Mark, good points, and I do have some reading to do :)
ccook
If I might add, try to find out which compiler was originally used, and then follow that "line" (for example, if it was Microsoft Fortran PowerStation, then the line of succession is Digital Fortran, Compaq's Visual Fortran, and now Intel's). Although Fortran is extremely well standardized, some vendors used to introduce their own extensions back in the day (some of those are now part of the standard).
ldigas
ldigas
@ldigas thank you for the suggestions, I am going to take a look at those texts. Also, as I looked, fortunately it was only written against Intel's Fortran compiler.
ccook
@ccook - for Intel's compiler? Excellent, that makes it relatively new code (not meaning it was written in new Fortran, but relatively recently written).
ldigas
+3  A: 

Is it Fortran IV (unlikely), 77, or 90/95? The language changed a lot with these revisions. Some of the gotchas that High-Performance Mark listed were common in Fortran IV but uncommon in 77, while others were still common in 77. The suggestion to check the code with maximum compiler warnings is excellent -- even use two compilers.

I'd start by diagramming the subroutine structure, and getting a high-level view of what they do. Then get a deeper understanding of the areas that need to be changed.
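Once the caller/callee pairs are known, the diagram itself can be generated rather than drawn by hand. The following is a minimal sketch in Python that turns a mapping of routine names into Graphviz DOT text and lists the routines nobody calls (likely entry points). The routine names in `CALLS` are hypothetical placeholders; in practice the mapping would come from reading the code or from a tool.

```python
def to_dot(calls, name="callgraph"):
    """Render a {caller: set_of_callees} mapping as Graphviz DOT text."""
    lines = [f"digraph {name} {{"]
    for caller in sorted(calls):
        for callee in sorted(calls[caller]):
            lines.append(f'    "{caller}" -> "{callee}";')
    lines.append("}")
    return "\n".join(lines)


def roots(calls):
    """Callers that never appear as a callee: candidates for entry points."""
    called = set().union(*calls.values())
    return sorted(set(calls) - called)


# Hypothetical call relationships, invented for illustration.
CALLS = {
    "MAIN": {"INIT", "SOLVE"},
    "SOLVE": {"STEP", "OUTPUT"},
    "STEP": set(),
}

print(to_dot(CALLS))
print("entry points:", roots(CALLS))
```

The DOT text can be saved to a file and rendered with Graphviz, which scales rather better than a plotter bed for a few hundred routines.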

There are tools for analyzing and even improving Fortran code, e.g., http://www.polyhedron.com/pf-plusfort0html or http://www.crescentbaysoftware.com/vast%5F77to90.html.

M. S. B.
Interesting tools, but they are a bit expensive. It looks to be coded as 77. I've been doing just that, diagramming the subroutines today. Thanks!
ccook
There is a free, reduced capability version of PlusFORT for "educational, academic and commercial evaluation use only" for Linux at http://www.polyhedron.com/pflinux0html
M. S. B.
I vouch for PlusFORT and FORCHECK. They are handy tools. Be sure to check the documentation for your compiler too: sometimes the compilers have nifty reporting features.
jaredor
Excellent, thanks!
ccook
+1  A: 

Are you running on Linux or OpenSolaris? If so, the Sun Studio Fortran compiler is one of the best. And the Sun Studio IDE understands Fortran and comes with a debugger. http://developers.sun.com/sunstudio

rchrd
I'm using openSuSe atm, and have been using vim so far. Thanks for the tip, I will give this a shot!
ccook
+1  A: 

I'm sleepy so I'll be short :)

Start by grepping out (or whatever tool you use) program statements, subroutine statements, function statements and the like. Maybe modules and such if f90 is used. Draw a kind of diagram based on that, on which you'll see what calls what (what uses what subroutines, functions and the like).
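The grepping step can be scripted. Here is a rough sketch in Python that collects PROGRAM, SUBROUTINE and FUNCTION statements and records which unit issues which CALL. It assumes fixed-form source with comments marked in column 1, ignores continuation lines, and only sees explicit CALL statements (function references inside expressions are not detected). The sample source is invented for illustration.

```python
import re
from collections import defaultdict

# Start of a program unit, with optional type or prefix such as REAL*8.
UNIT_RE = re.compile(
    r"^\s*(?:(?:recursive|pure|elemental|integer|real|double\s+precision"
    r"|complex|logical|character)\b\S*\s+)*"
    r"(program|subroutine|function)\s+(\w+)",
    re.IGNORECASE,
)
CALL_RE = re.compile(r"\bcall\s+(\w+)", re.IGNORECASE)


def map_calls(source):
    """Return (units, calls): unit name -> kind, and caller -> set of callees."""
    units = {}
    calls = defaultdict(set)
    current = None
    for line in source.splitlines():
        # Assumption: fixed-form comments have C, c, * or ! in column 1.
        if line[:1] in ("C", "c", "*", "!") or not line.strip():
            continue
        code = line.split("!")[0]  # naive: does not protect '!' inside strings
        unit = UNIT_RE.match(code)
        if unit:
            current = unit.group(2).upper()
            units[current] = unit.group(1).upper()
            continue
        if current is not None:
            for callee in CALL_RE.findall(code):
                calls[current].add(callee.upper())
    return units, dict(calls)


# Hypothetical sample, invented for illustration only.
SAMPLE = """\
      PROGRAM MAIN
      CALL INIT(N)
      CALL SOLVE(N)
      END
C     CALL IGNORED is inside a comment
      SUBROUTINE INIT(N)
      N = 10
      END
      SUBROUTINE SOLVE(N)
      IF (N .GT. 0) CALL STEP(N)
      END
      REAL*8 FUNCTION AREA(R)
      AREA = 3.14159D0 * R * R
      END
"""

units, calls = map_calls(SAMPLE)
for name, kind in units.items():
    print(kind, name, "calls", sorted(calls.get(name, ())))
```

Callees that never show up as a defined unit (STEP in the sample) are worth noting: they usually live in another file or a library.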

After you've got a general view of the situation, get to data. Fortran requires everything to be declared at the start of program/subroutine ... so the first lines should give you declarations. After you've put those in the diagram you just made, you should have a very clear situation by that time.
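For the data side, one thing worth charting in older code is which routines share which COMMON blocks, since that is where global state hides. A minimal sketch in Python, assuming fixed-form source, one COMMON statement per line with no continuation lines, and an invented sample:

```python
import re
from collections import defaultdict

# Start of a program unit, with optional type or prefix such as REAL*8.
UNIT_RE = re.compile(
    r"^\s*(?:(?:recursive|pure|elemental|integer|real|double\s+precision"
    r"|complex|logical|character)\b\S*\s+)*"
    r"(program|subroutine|function)\s+(\w+)",
    re.IGNORECASE,
)
COMMON_RE = re.compile(r"^\s*common\b(.*)", re.IGNORECASE)
BLOCK_RE = re.compile(r"/\s*(\w*)\s*/")  # /NAME/ or // (blank common)


def common_usage(source):
    """Return {common block name: set of program units that declare it}."""
    usage = defaultdict(set)
    current = None
    for line in source.splitlines():
        # Assumption: fixed-form comments have C, c, * or ! in column 1.
        if line[:1] in ("C", "c", "*", "!"):
            continue
        unit = UNIT_RE.match(line)
        if unit:
            current = unit.group(2).upper()
            continue
        common = COMMON_RE.match(line)
        if common and current is not None:
            rest = common.group(1)
            names = BLOCK_RE.findall(rest)
            if not rest.lstrip().startswith("/"):
                names.append("")  # no leading slash means blank common
            for block in names:
                usage[block.upper() or "(blank)"].add(current)
    return dict(usage)


# Hypothetical sample, invented for illustration only.
SAMPLE = """\
      PROGRAM MAIN
      COMMON /GRID/ NX, NY
      COMMON /PHYS/ RHO, VISC
      CALL STEP
      END
      SUBROUTINE STEP
      COMMON /GRID/ NX, NY
      COMMON WORK(100)
      END
"""

for block, users in sorted(common_usage(SAMPLE).items()):
    print(block, sorted(users))
```

This shows where to look rather than what the data means: the same block can name or even lay out its variables differently in each routine, so the declarations still need to be compared by hand.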

Now, the next step depends really on what you want to do with it.

ldigas
Mapping the data does seem to be the tricky bit :)
ccook
A: 

@ldigas: "Fortran requires everything to be declared at the start of program/subroutine ... "

No, unless IMPLICIT NONE appears at the start of a routine, Fortran uses implicit typing. So variable names starting with A-H and O-Z are typed REAL, and I-N are INTEGER. (Which makes GOD a REAL variable, and INT, luckily, an INTEGER, implicitly.) Remember, Fortran (actually back then it was FORTRAN) was designed by scientists and mathematicians, for whom i, j, k, l, m, and n were ALWAYS integers.

With IMPLICIT NONE, you are forced to explicitly type all variables, as you would in C, C++, or Java. So you could have INTEGER GOD, and REAL INT.
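The default rule is simple enough to write down as code. A small sketch in Python (the function name is just for illustration), assuming no IMPLICIT statement in the routine has remapped the letter ranges, as something like IMPLICIT DOUBLE PRECISION (A-H,O-Z) would:

```python
import string


def implicit_type(name):
    """Default type of an undeclared Fortran variable under implicit typing."""
    first = name[:1].upper()
    if first not in string.ascii_uppercase:
        raise ValueError(f"not a valid Fortran name: {name!r}")
    # Names starting with I through N are INTEGER; everything else is REAL.
    return "INTEGER" if "I" <= first <= "N" else "REAL"


for variable in ("GOD", "INT", "k", "X", "NSTEP", "OMEGA"):
    print(variable, "->", implicit_type(variable))
```

Running a list of undeclared names through a check like this is a quick way to spot surprises, for example a counter called COUNT that is silently REAL.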

rchrd